Workflow:Iterative Dvc Experiment Tracking

From Leeroopedia


Knowledge Sources
Domains: Experiment_Management, MLOps, ML_Pipelines
Last Updated: 2026-02-10 10:30 GMT

Overview

End-to-end process for running, tracking, comparing, and managing machine learning experiments using DVC's Git-based experiment versioning system, enabling rapid iteration on parameters and code without manual branch management.

Description

This workflow covers DVC's experiment tracking system, which allows data scientists to run multiple variations of ML pipelines with different parameters, compare results, and selectively apply successful experiments. Each experiment is stored as a lightweight Git reference (custom ref under `refs/exps/`), enabling hundreds of experiments without polluting the branch namespace. The system supports immediate execution, queued batch execution via Celery, parameter sweeps via Hydra integration, and collaboration through experiment push/pull to Git remotes.

Goal: A set of tracked experiment results with metrics, parameters, and artifacts that can be compared, applied, or shared.

Scope: From parameter modification through pipeline execution to experiment comparison and selection.

Strategy: Git-based experiment isolation with lightweight refs, optional Celery-based queuing for batch execution, and Hydra integration for parameter sweeps.

Usage

Execute this workflow when:

  • You want to try different hyperparameters without creating Git branches manually
  • You need to run and compare multiple training configurations systematically
  • You want to queue a batch of experiments for sequential or parallel execution
  • You need to share experiment results with team members via Git remotes
  • You want to perform a parameter sweep across multiple values

Execution Steps

Step 1: Define Parameter Overrides

Specify which parameters to vary for the experiment. Parameters can be overridden on the command line using the `-S` / `--set-param` flag, which modifies values in `params.yaml` or other parameter files before execution. Hydra sweep syntax is supported for generating multiple experiment configurations from a single command.

Key considerations:

  • Parameter overrides use `key=value` syntax with dot-notation for nested keys
  • Hydra sweep syntax (e.g., `learning_rate=0.001,0.01,0.1`) generates multiple experiments
  • Sweeps must be queued; immediate execution of sweeps is not supported
  • The `--no-hydra` flag disables Hydra integration for raw parameter passing
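
To make the sweep behavior concrete, the following minimal sketch expands comma-separated override values into one override list per experiment and applies a dot-notation override to a nested dictionary. It only mimics the basic `key=a,b,c` form; it is not Hydra's or DVC's actual parser, and the helper names `expand_sweep` and `set_nested` are hypothetical.

```python
from itertools import product


def expand_sweep(overrides):
    """Expand `key=a,b,c` overrides into one override list per experiment."""
    keys, choices = [], []
    for item in overrides:
        key, _, raw = item.partition("=")
        keys.append(key)
        choices.append(raw.split(","))
    # Cartesian product: every combination of values becomes one experiment.
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*choices)
    ]


def set_nested(params, override):
    """Apply one `a.b.c=value` override to a nested dict (value kept as a string)."""
    dotted, _, value = override.partition("=")
    *parents, leaf = dotted.split(".")
    node = params
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value
    return params


combos = expand_sweep(["train.learning_rate=0.001,0.01,0.1", "train.epochs=10,20"])
print(len(combos))   # 3 learning rates x 2 epoch counts
print(combos[0])
print(set_nested({}, "train.learning_rate=0.01"))
```

Three learning rates combined with two epoch counts yield six override lists, which is why a single sweep command can add several experiments to the queue at once.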

Step 2: Prepare Experiment Workspace

DVC prepares the workspace in which the experiment runs. By default, an immediately executed experiment runs in the current workspace; passing the `--temp` flag runs it in a temporary directory instead, and queued experiments are always executed in temporary directories. In the temporary-directory case, the current workspace state, including any untracked files specified via `--copy-paths`, is replicated into the isolated environment. A new Git commit is created on a detached HEAD to capture the experiment's starting state.

Key considerations:

  • Temporary directory isolation (with `--temp` or queued execution) prevents experiments from modifying the main workspace
  • The `--copy-paths` flag includes untracked files that the pipeline needs; it only takes effect together with `--temp` or `--queue`
  • Each experiment gets a unique identifier derived from its Git commit SHA
  • The workspace queue manages experiment lifecycle and cleanup
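
As an illustration of how these flags combine, here is a small sketch that assembles a `dvc exp run` command line without executing it. The function `build_exp_run` and the file name `local_config.yaml` are hypothetical; the check that `--copy-paths` needs `--temp` or `--queue` reflects DVC's CLI help as understood here and should be treated as an assumption.

```python
import shlex


def build_exp_run(params=(), name=None, temp=False, queue=False, copy_paths=()):
    """Assemble a `dvc exp run` argv list; nothing is executed here."""
    if copy_paths and not (temp or queue):
        # Assumption: --copy-paths is only used with --temp or --queue.
        raise ValueError("copy_paths requires temp=True or queue=True")
    argv = ["dvc", "exp", "run"]
    if temp:
        argv.append("--temp")
    if queue:
        argv.append("--queue")
    if name:
        argv += ["--name", name]
    for path in copy_paths:
        argv += ["--copy-paths", path]
    for override in params:
        argv += ["--set-param", override]
    return argv


argv = build_exp_run(
    params=["train.epochs=20"],
    name="epochs-20",
    temp=True,
    copy_paths=["local_config.yaml"],  # hypothetical untracked file
)
print(shlex.join(argv))
```

Building the argument list separately from running it makes it easy to review or log the exact command before handing it to `subprocess.run`.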

Step 3: Execute Pipeline

The experiment pipeline is reproduced within the isolated workspace using the same reproduction engine as `dvc repro`. All stages are executed in dependency order, with parameter overrides applied. The execution can happen immediately (blocking) or be queued for later execution via the Celery task queue.

Key considerations:

  • Queued execution uses Celery workers and supports parallel jobs via `--jobs`
  • Immediate execution blocks until the pipeline completes
  • The `--run-all` flag processes all queued experiments
  • Failed experiments are tracked with their error state preserved
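
The queue-then-run pattern can be sketched as a list of commands: one `dvc exp run --queue` per parameter set, followed by a single `dvc exp run --run-all --jobs N`. The helper names `plan_batch` and `run_commands` are hypothetical, and the sketch assumes `dvc` is on the `PATH` if the commands are actually executed.

```python
import subprocess


def plan_batch(override_sets, jobs=1):
    """Return argv lists: one queued run per override set, then one --run-all."""
    commands = []
    for overrides in override_sets:
        argv = ["dvc", "exp", "run", "--queue"]
        for override in overrides:
            argv += ["-S", override]
        commands.append(argv)
    # Process everything that was queued, with up to `jobs` experiments in parallel.
    commands.append(["dvc", "exp", "run", "--run-all", "--jobs", str(jobs)])
    return commands


def run_commands(commands):
    """Execute the planned commands in order; stops at the first non-zero exit code."""
    for argv in commands:
        subprocess.run(argv, check=True)


plan = plan_batch([["train.lr=0.01"], ["train.lr=0.1"]], jobs=2)
for argv in plan:
    print(" ".join(argv))
```

Only `plan_batch` is called above, so the sketch prints the plan without touching a repository; `run_commands(plan)` would perform the actual execution inside a DVC project.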

Step 4: Capture Results

After pipeline execution, DVC captures the experiment results by collecting all metrics, parameters, and output artifacts. The results are stored as a Git commit under the `refs/exps/` namespace. Metrics from JSON/YAML/TOML files are extracted for comparison. An optional experiment name and commit message can be assigned.

Key considerations:

  • Metrics files declared in `dvc.yaml` are automatically collected
  • Experiment names must be unique within the repository
  • The `dvc exp save` command can capture the current workspace state as an experiment without re-running
  • Studio integration sends live experiment updates via webhooks when configured
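
For comparison purposes, nested metrics are easier to handle once flattened into dotted names. The following sketch does this for a JSON metrics document; it is an illustration of the idea, not DVC's internal collection code, and the sample metric values are made up.

```python
import json


def flatten(metrics, prefix=""):
    """Flatten nested metric dicts into {'a.b': value} form."""
    flat = {}
    for key, value in metrics.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical contents of a metrics file declared in dvc.yaml.
raw = json.loads('{"train": {"loss": 0.21, "acc": 0.93}, "val": {"acc": 0.9}}')
flat = flatten(raw)
print(flat)
```

With flat keys such as `train.loss`, each metric maps onto a single column, which is the shape a side-by-side comparison table needs.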

Step 5: Compare Experiments

DVC provides tools to compare experiments side by side. The `dvc exp show` command displays a table of all experiments with their metrics, parameters, and metadata. Experiments can be sorted by any metric, filtered by state, and displayed in various formats (table, JSON, CSV).

Key considerations:

  • Experiments are organized hierarchically under their baseline commits
  • Sorting supports ascending and descending order on any metric or parameter
  • Queued, failed, and workspace experiments can be hidden via filter flags
  • The comparison includes timestamps, executor information, and experiment state
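
When experiment data is exported as CSV, ranking can also be done outside DVC. The sketch below sorts rows by a metric column and skips rows without a value. The sample data and its column names are illustrative assumptions, not the exact layout produced by `dvc exp show --csv`; within DVC itself, the `--sort-by` and `--sort-order` options of `dvc exp show` serve the same purpose.

```python
import csv
import io

# Hypothetical export; real column names depend on the project's metrics and params.
SAMPLE = """Experiment,State,acc,train.lr
exp-a,Success,0.91,0.01
exp-b,Success,0.94,0.001
exp-c,Failed,,0.1
"""


def rank(csv_text, column, descending=True):
    """Return rows sorted by a numeric column, ignoring rows where it is empty."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    scored = [row for row in rows if row[column] != ""]
    return sorted(scored, key=lambda row: float(row[column]), reverse=descending)


best = rank(SAMPLE, "acc")
print([row["Experiment"] for row in best])
```

The failed experiment has no accuracy value and is therefore left out of the ranking rather than being treated as zero.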

Step 6: Apply or Discard Experiments

After comparison, successful experiments can be applied to the current workspace using `dvc exp apply`, which updates all tracked files to match the experiment's state. Unwanted experiments can be removed with `dvc exp remove`. Experiments can also be pushed to and pulled from Git remotes for team collaboration.

Key considerations:

  • Applying an experiment modifies the workspace but does not create a Git commit
  • The user must explicitly commit after applying to make changes permanent
  • Experiment push/pull transfers the experiment's Git refs to or from the Git remote and the associated DVC-tracked data to or from DVC remote storage
  • The `dvc exp branch` command creates a full Git branch from an experiment
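
Because applying an experiment does not commit anything, promoting a result takes several commands. The sketch below lists them as argv lists; the helper names, the experiment name `epochs-20`, and the default file list are hypothetical, and which files need to be staged depends on the project.

```python
def promote_commands(experiment, message, paths=("dvc.lock", "params.yaml")):
    """Commands to apply an experiment and make the result permanent in Git."""
    return [
        ["dvc", "exp", "apply", experiment],   # updates the workspace only
        ["git", "add", *paths],                # stage the files the experiment changed
        ["git", "commit", "-m", message],      # explicit commit by the user
    ]


def share_commands(experiment, git_remote="origin"):
    """Command to upload an experiment to a Git remote for collaborators."""
    return [["dvc", "exp", "push", git_remote, experiment]]


steps = promote_commands("epochs-20", "Adopt 20-epoch configuration")
for argv in steps + share_commands("epochs-20"):
    print(" ".join(argv))
```

As an alternative to applying in place, `dvc exp branch` can turn the experiment into a regular Git branch, leaving the current workspace untouched.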
