Principle:Iterative Dvc Experiment Tracking
| Knowledge Sources | |
|---|---|
| Domains | API, Experiment_Management |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Experiment tracking is the programmatic management of the experiment lifecycle -- creation, parameter setting, metric logging, and result persistence -- through a public API, enabling scripts and applications to record experiment state without relying on command-line invocation or manual file management.
Description
Machine learning and data science workflows involve running many experiments with different configurations and measuring their outcomes. While interactive command-line tools provide one interface for this process, many workflows require programmatic control: training scripts that log metrics during execution, automated hyperparameter sweeps that create experiments in loops, and integration tests that verify experiment behavior. Experiment tracking through a public API provides this programmatic interface, allowing any code to participate in the experiment lifecycle.
The experiment lifecycle managed by the API follows a structured sequence. First, an experiment is initialized within the context of a repository, establishing a new experiment namespace. Next, parameters are set -- either by modifying parameter files or by passing parameter overrides -- to define the experiment's configuration. The experiment's code then executes (typically a training or processing pipeline), during which metrics are logged as they become available. Finally, the experiment is saved, which persists its complete state -- parameters, metrics, code changes, and data references -- as a Git commit within the experiment ref namespace.
This API-driven approach decouples experiment management from any particular user interface or execution environment. The same API calls work whether the experiment is run locally in a notebook, remotely on a training cluster, or as part of a continuous integration pipeline. By providing a stable programmatic interface, the system enables the construction of higher-level tooling -- experiment schedulers, automated reporting systems, and model selection pipelines -- on top of the core experiment tracking primitives.
Usage
Experiment tracking via the API is used whenever:
- A training script needs to log metrics progressively during model training.
- An automated hyperparameter sweep creates and saves many experiments in a loop.
- A notebook-based workflow initializes an experiment, runs cells, and saves results programmatically.
- A CI/CD pipeline runs experiments as part of an automated testing or validation workflow.
- A custom orchestration tool manages experiment lifecycle across multiple repositories or environments.
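The sweep case above can be illustrated with a minimal sketch. Everything here is hypothetical and in-memory: `run_experiment` is a stand-in for a real training run and returns a made-up score, and `sweep` only collects records rather than calling any experiment-tracking API. In a real setup, each record would presumably be persisted as its own experiment.

```python
from itertools import product


def run_experiment(params):
    """Stand-in for a training run: returns a made-up score, not a real metric."""
    # Toy objective that peaks at lr=0.01 and batch_size=32 (an assumption for the demo).
    return 1.0 - abs(params["lr"] - 0.01) * 10 - abs(params["batch_size"] - 32) / 1000


def sweep(grid):
    """Run one experiment per parameter combination in `grid` and collect the records."""
    names = sorted(grid)
    records = []
    for index, values in enumerate(product(*(grid[n] for n in names))):
        params = dict(zip(names, values))
        records.append(
            {
                "name": f"exp{index}",  # hypothetical naming scheme
                "params": params,
                "metrics": {"score": run_experiment(params)},
            }
        )
    return records


results = sweep({"lr": [0.1, 0.01, 0.001], "batch_size": [32, 64]})
best = max(results, key=lambda record: record["metrics"]["score"])
```

Keeping parameters and metrics together in one record per run is what makes the later selection step (`max` over the records) a one-liner.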
Theoretical Basis
Lifecycle state machine. The experiment tracking API models each experiment as a finite state machine with well-defined transitions:
States: [Uninitialized] -> [Active] -> [Persisted]
Transitions:
    init()            : Uninitialized -> Active
        Creates experiment context, sets baseline revision
    set_params()      : Active -> Active
        Modifies parameter values within the active experiment
    log_metric()      : Active -> Active
        Records a metric name-value pair
    make_checkpoint() : Active -> Active (optional)
        Saves intermediate state as a checkpoint commit
    save()            : Active -> Persisted
        Commits all state (params, metrics, code) to experiment ref
    discard()         : Active -> Uninitialized
        Abandons the experiment without persisting
The state machine ensures that operations are invoked in a valid order -- for example, metrics cannot be logged before initialization, and saving an already-persisted experiment is a no-op. Rejecting out-of-order calls prevents common experiment-management errors, such as metrics recorded against no experiment, and gives each API call clear semantics.
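As a rough illustration, the transitions listed above could be enforced with a small in-memory class. This is a sketch under stated assumptions, not DVC's implementation: `State`, `InvalidTransition`, and `ExperimentTracker` are hypothetical names, state lives in plain dictionaries instead of Git commits, and only the states that the listed transitions actually reach are modeled.

```python
from enum import Enum, auto


class State(Enum):
    UNINITIALIZED = auto()
    ACTIVE = auto()
    PERSISTED = auto()


class InvalidTransition(RuntimeError):
    """Raised when an operation is called in a state that does not allow it."""


class ExperimentTracker:
    """Toy in-memory model of the lifecycle (hypothetical, not a DVC class)."""

    def __init__(self):
        self.state = State.UNINITIALIZED
        self.baseline = None
        self.params = {}
        self.metrics = {}
        self.checkpoints = []
        self.saved = None

    def _require_active(self, operation):
        if self.state is not State.ACTIVE:
            raise InvalidTransition(
                f"{operation}() requires an active experiment, got {self.state.name}"
            )

    def init(self, baseline_rev):
        # Uninitialized -> Active
        if self.state is not State.UNINITIALIZED:
            raise InvalidTransition(f"init() called while {self.state.name}")
        self.baseline = baseline_rev
        self.state = State.ACTIVE

    def set_params(self, **overrides):
        # Active -> Active
        self._require_active("set_params")
        self.params.update(overrides)

    def log_metric(self, name, value):
        # Active -> Active
        self._require_active("log_metric")
        self.metrics[name] = value

    def make_checkpoint(self):
        # Active -> Active: snapshot the intermediate state
        self._require_active("make_checkpoint")
        self.checkpoints.append(
            {"params": dict(self.params), "metrics": dict(self.metrics)}
        )

    def save(self):
        # Active -> Persisted; saving again is a no-op
        if self.state is State.PERSISTED:
            return self.saved
        self._require_active("save")
        self.saved = {
            "baseline": self.baseline,
            "params": dict(self.params),
            "metrics": dict(self.metrics),
        }
        self.state = State.PERSISTED
        return self.saved

    def discard(self):
        # Active -> Uninitialized: drop everything recorded so far
        self._require_active("discard")
        self.__init__()
```

Routing every Active-only operation through a single guard (`_require_active`) keeps the ordering rules in one place, so adding a new operation cannot silently bypass them.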
Experiment isolation through Git refs. Each experiment is persisted as a Git commit on a dedicated ref (e.g., refs/exps/...) rather than on a user-visible branch. This ref-based isolation pattern ensures that experiments do not pollute the main branch history while still leveraging Git's existing infrastructure for storage, deduplication, and distributed synchronization:
Repository ref structure:
    refs/heads/main          -> production code
    refs/exps/baseline/exp1  -> experiment 1 (params: lr=0.01; metrics: acc=0.92)
    refs/exps/baseline/exp2  -> experiment 2 (params: lr=0.001; metrics: acc=0.94)
    refs/exps/baseline/exp3  -> experiment 3 (params: lr=0.1; metrics: acc=0.87)
Each experiment ref points to a commit containing:
- Modified parameter files
- Generated metric files
- Updated lock files
- Any code changes made during the experiment
This approach provides the atomicity and durability guarantees of Git commits while maintaining a clean separation between production history and experimental exploration.
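The ref-based isolation can be reproduced with plain Git plumbing commands, which may help make the idea concrete. The sketch below is an assumption-laden illustration, not how DVC itself is implemented: `git` and `save_experiment` are hypothetical helpers, the ref layout `refs/exps/<baseline sha>/<name>` is chosen only to mirror the example above, and the code expects a `git` executable with a configured identity plus changes that are already staged.

```python
import subprocess


def git(repo, *args):
    """Run a git command inside `repo` and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def save_experiment(repo, name, message):
    """Record the staged state as a commit reachable only from refs/exps/."""
    baseline = git(repo, "rev-parse", "HEAD")
    # Snapshot the index as a tree object without creating a branch commit.
    tree = git(repo, "write-tree")
    # Create a commit whose parent is the baseline; no branch is moved.
    commit = git(repo, "commit-tree", tree, "-p", baseline, "-m", message)
    # Point a dedicated ref at the new commit (illustrative layout).
    ref = f"refs/exps/{baseline}/{name}"
    git(repo, "update-ref", ref, commit)
    return ref, commit
```

Because `commit-tree` and `update-ref` never touch `HEAD`, the current branch keeps its history unchanged, while the experiment commit stays reachable (and therefore safe from garbage collection) through its own ref. Note that the index still holds the staged experiment changes afterwards; a fuller version would restore it.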