Principle:Iterative Dvc Experiment Result Capture

From Leeroopedia


Knowledge Sources
Domains Experiment_Management, Version_Control
Last Updated 2026-02-10 00:00 GMT

Overview

Experiment result capture is the practice of saving the complete state of an experiment -- including pipeline outputs, metrics, and parameters -- as a Git-versioned snapshot under a dedicated ref namespace.

Description

After an experiment has been executed, its results exist only as transient state in the working directory or a temporary execution environment. Without explicit capture, these results are lost as soon as the workspace changes. Experiment result capture solves this problem by persisting the full experiment state as a Git commit stored under the refs/exps/ namespace, creating a durable, addressable record of the experiment that can be retrieved, compared, or promoted at any later time.

The capture process operates independently of branch history. Unlike a regular Git commit on a feature branch, an experiment commit is reachable only through a custom ref under refs/exps/{baseline_sha}/{exp_name}, outside refs/heads/. This design keeps branches clean (experiments do not advance any branch and do not appear in a default git log) while still leveraging Git's content-addressable storage for integrity and deduplication. Each experiment commit records the complete working tree at the time of capture: parameter files, metric outputs, pipeline definitions, and the DVC metadata files that point to model artifacts. The artifacts themselves are tracked by DVC and live in the DVC cache rather than in Git.

The capture mechanism supports both post-execution capture (saving results after a pipeline run) and in-place capture (saving the current workspace state as-is, without re-running the pipeline). In-place capture is particularly useful when a user has manually tuned parameters and wants to snapshot the current state without the overhead of pipeline reproduction. Both modes produce the same output: a commit SHA that uniquely identifies the experiment state.
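The relationship between the two modes can be illustrated with a small in-memory sketch. Everything here is hypothetical: the workspace is modeled as a plain dictionary, snapshot and capture are illustrative names, and the content hash is only a stand-in for a real commit SHA (a real commit SHA also depends on metadata such as author and timestamp).

```python
import hashlib
import json


def snapshot(workspace):
    """Return a deterministic identifier for the workspace contents.

    Stand-in for "create a commit and return its SHA".
    """
    payload = json.dumps(workspace, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def capture(workspace, pipeline=None):
    """Capture the workspace, optionally running a pipeline first.

    pipeline=None  -> in-place capture (workspace saved as-is)
    pipeline=func  -> post-execution capture (run, then save the outputs)
    """
    if pipeline is not None:
        workspace = pipeline(workspace)
    return snapshot(workspace)
```

In this model both modes end in the same snapshot step, so the caller always receives a single identifier regardless of whether the pipeline ran.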

Usage

Use experiment result capture when:

  • You have completed an experiment run and want to persist the results for later comparison
  • You want to save the current workspace as a named experiment without re-running the pipeline
  • You need to create experiment snapshots that can be shared, pushed to remote, or compared
  • You want to maintain a history of experiment results without cluttering the main Git branch
  • You need to capture results from ad-hoc or interactive experimentation sessions

Apply this technique whenever experiment results need to outlive the execution session and be addressable by a stable identifier.

Theoretical Basis

Experiment result capture follows a snapshot-and-ref pattern built on Git's object model:

function capture_experiment(workspace, name, baseline_rev, include_untracked=[]):
    # Resolve the baseline to a full commit SHA so the ref path is stable
    baseline_sha = resolve_to_sha(baseline_rev)

    # Step 1: Stage all relevant files
    tracked_files = get_dvc_tracked_files(workspace)
    metric_files = get_metric_files(workspace)
    param_files = get_param_files(workspace)

    stage_files(tracked_files + metric_files + param_files)

    # Step 2: Optionally include untracked files named by the caller
    if include_untracked is not empty:
        stage_files(include_untracked)

    # Step 3: Create a commit whose parent is the baseline
    # (fall back to a generated name so the ref path is always valid)
    name = name or generate_experiment_name()
    commit_sha = git_commit(
        message=name,
        parent=baseline_sha
    )

    # Step 4: Store under experiment ref namespace
    ref_path = "refs/exps/" + baseline_sha + "/" + name
    git_update_ref(ref_path, commit_sha)

    return commit_sha
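
The four steps can be approximated with plain Git plumbing commands, as in the minimal sketch below. It is an illustration under stated assumptions, not DVC's implementation: the helper names git and capture_experiment are hypothetical, only modified tracked files and explicitly listed untracked paths are staged, and the ref path follows the simplified refs/exps/{baseline_sha}/{name} layout used on this page (the layout DVC writes on disk may differ in detail). It assumes a git executable on PATH and a configured committer identity.

```python
import os
import subprocess
import tempfile


def git(repo, *args, env=None):
    """Run a git command inside `repo` and return its stripped stdout."""
    result = subprocess.run(
        ["git", *args], cwd=repo, env=env, check=True,
        capture_output=True, text=True,
    )
    return result.stdout.strip()


def capture_experiment(repo, name, include_untracked=()):
    """Snapshot the workspace as a commit under refs/exps/ (illustrative)."""
    baseline_sha = git(repo, "rev-parse", "HEAD")
    with tempfile.TemporaryDirectory() as tmp:
        # A throwaway index leaves the user's real staging area untouched.
        env = dict(os.environ, GIT_INDEX_FILE=os.path.join(tmp, "index"))
        git(repo, "read-tree", baseline_sha, env=env)
        # Steps 1-2: stage modified tracked files, then any listed extras.
        git(repo, "add", "--update", env=env)
        if include_untracked:
            git(repo, "add", "--", *include_untracked, env=env)
        tree_sha = git(repo, "write-tree", env=env)
    # Step 3: create a commit object whose parent is the baseline.
    commit_sha = git(
        repo, "commit-tree", tree_sha, "-p", baseline_sha, "-m", name
    )
    # Step 4: point an experiment ref at the commit; no branch moves.
    git(repo, "update-ref", f"refs/exps/{baseline_sha}/{name}", commit_sha)
    return commit_sha
```

Because the commit is created with commit-tree and referenced only through update-ref, HEAD and the current branch stay where they were, which is the isolation behavior described in this section.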

The ref namespace hierarchy encodes the experiment's lineage:

refs/exps/
  {baseline_sha_1}/
    {experiment_name_a}  -> commit_sha_x
    {experiment_name_b}  -> commit_sha_y
  {baseline_sha_2}/
    {experiment_name_c}  -> commit_sha_z

This structure supports efficient queries such as "show all experiments derived from baseline X" by listing refs under refs/exps/{baseline_sha}/.
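
The lineage query can be sketched in a few lines of Python. The function names (exp_ref, parse_exp_ref, experiments_by_baseline) are hypothetical, and the input is assumed to be a mapping from ref name to commit SHA, such as one built from the output of git for-each-ref; the path layout is the simplified one shown above.

```python
from collections import defaultdict

EXPS_NAMESPACE = "refs/exps"


def exp_ref(baseline_sha, name):
    """Build the ref path for an experiment derived from a baseline."""
    return f"{EXPS_NAMESPACE}/{baseline_sha}/{name}"


def parse_exp_ref(ref):
    """Split an experiment ref into (baseline_sha, experiment_name)."""
    prefix = EXPS_NAMESPACE + "/"
    if not ref.startswith(prefix):
        raise ValueError(f"not an experiment ref: {ref!r}")
    baseline_sha, _, name = ref[len(prefix):].partition("/")
    if not baseline_sha or not name:
        raise ValueError(f"malformed experiment ref: {ref!r}")
    return baseline_sha, name


def experiments_by_baseline(refs):
    """Group {ref: commit_sha} into {baseline_sha: {name: commit_sha}}."""
    grouped = defaultdict(dict)
    for ref, commit_sha in refs.items():
        baseline_sha, name = parse_exp_ref(ref)
        grouped[baseline_sha][name] = commit_sha
    return dict(grouped)
```

Looking up one key of the grouped result then answers "show all experiments derived from baseline X" without inspecting any commit contents.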

Key theoretical properties:

  1. Immutability: Once captured, an experiment commit is a fixed Git object; its SHA guarantees content integrity
  2. Addressability: Every experiment has a unique ref that can be used for checkout, diff, or push operations
  3. Isolation: Experiment refs do not appear in git log on any branch, maintaining a clean development history
  4. Composability: Captured experiments can be promoted to branches (dvc exp branch), applied to the workspace (dvc exp apply), or pushed to remote repositories (dvc exp push)
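
The immutability property rests on Git naming every object by a hash of its content. As a small illustration, the sketch below reproduces the ID Git assigns to a blob (a file's contents), assuming the default SHA-1 object format; git_blob_sha1 is an illustrative name, not part of Git or DVC.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Compute the object ID Git gives a blob with these contents.

    Git hashes a header of the form "blob <size>\\0" followed by the
    raw bytes, so any change to the content changes the ID.
    """
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()
```

Commit and tree objects are named the same way over their own serialized content, and a commit embeds the ID of its tree, so altering any captured file would yield a different commit SHA rather than modifying the existing one.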
