Implementation:Iterative Dvc Experiments Run

From Leeroopedia


Knowledge Sources
Domains Experiment_Management, Pipeline_Execution
Last Updated 2026-02-10 00:00 GMT

Overview

Concrete tool for executing DVC pipelines as tracked experiments with parameter overrides and multiple execution modes, provided by the DVC library.

Description

The run function in dvc.repo.experiments.run is the primary entry point for running experiments in DVC. It orchestrates the full experiment lifecycle: parsing parameter overrides, detecting sweep configurations, selecting the appropriate execution mode (immediate, queued, or distributed), and dispatching to the corresponding queue and reproduction mechanism.

The function integrates with three execution backends. For immediate execution, it calls Experiments.reproduce_one, which enqueues the experiment to either the WorkspaceQueue (in place) or the TempDirQueue (isolated temporary directory) and reproduces it immediately. For queued execution, it pushes entries to the LocalCeleryQueue via Experiments.queue_one, expanding any sweep overrides into multiple queue entries. For distributed execution (run_all=True), it calls Experiments.reproduce_celery, which starts Celery workers and processes all queued experiments.
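The mode selection described above can be sketched as a small decision function. This is an illustration only, not DVC's code: Dispatch and select_dispatch are hypothetical names, and the precedence order (run_all first, then queue, then tmp_dir) is an assumption drawn from the description rather than from the source file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dispatch:
    """Hypothetical record of the entry point and queue a call would use."""

    method: str
    queue: str


def select_dispatch(
    run_all: bool = False, queue: bool = False, tmp_dir: bool = False
) -> Dispatch:
    # Assumed precedence: run_all wins, because it processes what is already queued.
    if run_all:
        return Dispatch("Experiments.reproduce_celery", "LocalCeleryQueue")
    # queue=True defers execution by pushing an entry onto the Celery queue.
    if queue:
        return Dispatch("Experiments.queue_one", "LocalCeleryQueue")
    # Otherwise run now: in an isolated temp directory, or in place.
    if tmp_dir:
        return Dispatch("Experiments.reproduce_one", "TempDirQueue")
    return Dispatch("Experiments.reproduce_one", "WorkspaceQueue")


if __name__ == "__main__":
    print(select_dispatch())
    print(select_dispatch(tmp_dir=True))
    print(select_dispatch(queue=True))
    print(select_dispatch(run_all=True))
```

The sketch only names the path a call would take; the real function also prepares overrides and forwards the remaining keyword arguments.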

The function also handles Hydra integration: when Hydra is enabled in the repo configuration, it ensures that the default params file is included in the overrides even if the user did not specify --set-param, enabling Hydra composition to run for every experiment.
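A minimal sketch of that Hydra rule, assuming the path-to-overrides mapping is a plain dict and that the default params file is params.yaml. The helper name ensure_hydra_target is hypothetical, and the handling of no_hydra follows the parameter description on this page rather than verified source code.

```python
from typing import Dict, List

DEFAULT_PARAMS_FILE = "params.yaml"  # assumed default params file name


def ensure_hydra_target(
    path_overrides: Dict[str, List[str]],
    hydra_enabled: bool,
    no_hydra: bool = False,
) -> Dict[str, List[str]]:
    """Return a copy of the overrides that always includes the default params file
    when Hydra composition should run."""
    result = {path: list(overrides) for path, overrides in path_overrides.items()}
    if hydra_enabled and not no_hydra:
        # An empty override list is enough to trigger composition for this file.
        result.setdefault(DEFAULT_PARAMS_FILE, [])
    return result


if __name__ == "__main__":
    print(ensure_hydra_target({}, hydra_enabled=True))
    print(ensure_hydra_target({"params.yaml": ["train.lr=0.1"]}, hydra_enabled=True))
    print(ensure_hydra_target({}, hydra_enabled=True, no_hydra=True))
```

Existing overrides for the default file are left untouched; the empty list is added only when the file is missing from the mapping.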

Usage

Import and use this function when:

  • You need to programmatically trigger experiment runs from Python code
  • You are building automation that submits experiments with specific parameter configurations
  • You need to process a queue of experiments with parallel workers

Code Reference

Source Location

  • Repository: DVC
  • File: dvc/repo/experiments/run.py
  • Lines: L14-113 (run)
  • Related: dvc/repo/experiments/__init__.py L114-132 (reproduce_one), L138-184 (reproduce_celery)

Signature

@locked
def run(
    repo,
    targets: Optional[Iterable[str]] = None,
    params: Optional[Iterable[str]] = None,
    run_all: bool = False,
    jobs: int = 1,
    tmp_dir: bool = False,
    queue: bool = False,
    copy_paths: Optional[Iterable[str]] = None,
    message: Optional[str] = None,
    no_hydra: bool = False,
    **kwargs,
) -> dict[str, str]:
    ...

Import

from dvc.repo.experiments.run import run

I/O Contract

Inputs

Name | Type | Required | Description
repo | Repo | Yes | The DVC repository instance. Provides access to SCM, experiment queues, and configuration.
targets | Optional[Iterable[str]] | No | Pipeline stage targets to reproduce. If None, all stages in the pipeline are reproduced.
params | Optional[Iterable[str]] | No | Parameter override strings in the format "[file:]key=value"; the file prefix is optional and defaults to params.yaml. Parsed into a path-to-overrides mapping via to_path_overrides.
run_all | bool | No | If True, reproduces all currently queued experiments via Celery workers. Defaults to False.
jobs | int | No | Number of parallel Celery workers to start when run_all=True. Defaults to 1.
tmp_dir | bool | No | If True, runs the experiment in a temporary directory for isolation. Defaults to False.
queue | bool | No | If True, queues the experiment for later execution instead of running immediately. Defaults to False.
copy_paths | Optional[Iterable[str]] | No | Additional file paths to copy into the experiment workspace during execution.
message | Optional[str] | No | Custom commit message for the experiment's Git commit.
no_hydra | bool | No | If True, disables Hydra configuration composition even if Hydra is enabled in the repo config. Defaults to False.
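
To make the params format concrete, here is a rough re-implementation of the grouping step. It is a sketch and not DVC's to_path_overrides: the function name group_overrides_by_path is hypothetical, and it assumes that entries without a file prefix fall back to params.yaml.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

DEFAULT_PARAMS_FILE = "params.yaml"  # assumed fallback for entries without "file:"


def group_overrides_by_path(params: Iterable[str]) -> Dict[str, List[str]]:
    """Group "file:key=value" strings into {file: ["key=value", ...]}."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item in params:
        # Look for the "file:" prefix only left of "=", so values may contain ":".
        target = item.partition("=")[0]
        if ":" in target:
            path, _, override = item.partition(":")
        else:
            path, override = DEFAULT_PARAMS_FILE, item
        grouped[path].append(override)
    return dict(grouped)


if __name__ == "__main__":
    print(
        group_overrides_by_path(
            ["params.yaml:train.lr=0.001", "other.yaml:seed=7", "train.epochs=5"]
        )
    )
```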

Outputs

Name | Type | Description
return | dict[str, str] | Dictionary mapping experiment revision SHAs to experiment names/hashes. Empty dict if experiments were queued (results are collected later).

Usage Examples

Basic Usage: Run Experiment Immediately

from dvc.repo import Repo
from dvc.repo.experiments.run import run

with Repo() as repo:
    # Override train.lr in params.yaml and run the experiment right away
    results = run(
        repo,
        params=["params.yaml:train.lr=0.001"],
    )
    for exp_rev, exp_name in results.items():
        print(f"Experiment {exp_name}: {exp_rev[:7]}")

Queue Experiments with Sweep

from dvc.repo import Repo
from dvc.repo.experiments.run import run

with Repo() as repo:
    # Queue multiple experiments via parameter sweep
    run(
        repo,
        params=[
            "params.yaml:train.lr=choice(0.001,0.01,0.1)",
            "params.yaml:train.batch_size=choice(32,64)",
        ],
        queue=True,
        name="grid-search",
    )
    # 6 experiments queued as "grid-search-1" through "grid-search-6"
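
The count of six comes from the Cartesian product of the two choice lists (3 x 2). The sketch below shows that arithmetic in plain Python. It is not how DVC expands sweeps (that is delegated to Hydra's override parsing); expand_choice_sweep is a hypothetical helper that only understands flat choice(a,b,...) values, and the ordering of combinations here is not guaranteed to match Hydra's.

```python
import re
from itertools import product
from typing import List

_CHOICE = re.compile(r"^choice\((.*)\)$")


def expand_choice_sweep(overrides: List[str]) -> List[List[str]]:
    """Expand "key=choice(a,b)" overrides into every concrete combination."""
    axes = []
    for override in overrides:
        key, _, value = override.partition("=")
        match = _CHOICE.match(value)
        # A non-sweep override contributes a single option to the product.
        options = match.group(1).split(",") if match else [value]
        axes.append([f"{key}={option.strip()}" for option in options])
    return [list(combo) for combo in product(*axes)]


combos = expand_choice_sweep(
    ["train.lr=choice(0.001,0.01,0.1)", "train.batch_size=choice(32,64)"]
)
# Names follow the "<prefix>-<index>" pattern from the example above.
names = [f"grid-search-{index + 1}" for index in range(len(combos))]

if __name__ == "__main__":
    for name, combo in zip(names, combos):
        print(name, combo)
```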

Run All Queued Experiments with Parallel Workers

from dvc.repo import Repo
from dvc.repo.experiments.run import run

with Repo() as repo:
    # Execute all queued experiments with 4 parallel workers
    results = run(repo, run_all=True, jobs=4)
    for exp_rev, exp_name in results.items():
        print(f"Completed: {exp_name} ({exp_rev[:7]})")

Run in Temporary Directory

from dvc.repo import Repo
from dvc.repo.experiments.run import run

with Repo() as repo:
    # Run experiment in isolated temp dir
    results = run(
        repo,
        params=["params.yaml:train.lr=0.005"],
        tmp_dir=True,
    )

Related Pages

Implements Principle

Requires Environment
