# Implementation: Iterative DVC Experiments Run
| Knowledge Sources | |
|---|---|
| Domains | Experiment_Management, Pipeline_Execution |
| Last Updated | 2026-02-10 00:00 GMT |
## Overview
Concrete tool for executing DVC pipelines as tracked experiments with parameter overrides and multiple execution modes, provided by the DVC library.
## Description
The run function in dvc.repo.experiments.run is the primary entry point for running experiments in DVC. It orchestrates the full experiment lifecycle: parsing parameter overrides, detecting sweep configurations, selecting the appropriate execution mode (immediate, queued, or distributed), and dispatching to the corresponding queue and reproduction mechanism.
The function integrates with three execution backends. For immediate execution, it calls Experiments.reproduce_one, which enqueues to either the WorkspaceQueue (in-place) or TempDirQueue (isolated temp directory) and immediately reproduces. For queued execution, it pushes entries to the LocalCeleryQueue via Experiments.queue_one, optionally expanding sweep overrides into multiple queue entries. For distributed execution (run_all=True), it calls Experiments.reproduce_celery which starts Celery workers and processes all queued experiments.
The function also handles Hydra integration: when Hydra is enabled in the repo configuration, it ensures that the default params file is included in the overrides even if the user did not specify --set-param, enabling Hydra composition to run for every experiment.
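The dispatch between these three modes can be sketched as follows. This is an illustrative reduction of the flag logic described above, not DVC's actual code; `select_backend` and its return strings are hypothetical names for the real queue/reproduce calls.

```python
def select_backend(run_all: bool = False, queue: bool = False, tmp_dir: bool = False) -> str:
    """Illustrative sketch of how run()'s flags choose an execution backend."""
    if run_all:
        # Distributed: start Celery workers and drain the existing queue
        return "reproduce_celery"
    if queue:
        # Deferred: push one entry to the LocalCeleryQueue for later
        return "queue_one"
    # Immediate: isolated temp-dir run, or in-place workspace run
    return "reproduce_one (tempdir)" if tmp_dir else "reproduce_one (workspace)"
```

Note that `run_all` takes precedence: when it is set, `params` and `targets` for a new experiment are not part of the call, since only already-queued entries are processed.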
## Usage
Import and use this function when:
- You need to programmatically trigger experiment runs from Python code
- You are building automation that submits experiments with specific parameter configurations
- You need to process a queue of experiments with parallel workers
## Code Reference
### Source Location

- Repository: DVC
- File: dvc/repo/experiments/run.py, lines 14-113 (run)
- Related: dvc/repo/experiments/__init__.py, lines 114-132 (reproduce_one) and lines 138-184 (reproduce_celery)
### Signature

```python
@locked
def run(
    repo,
    targets: Optional[Iterable[str]] = None,
    params: Optional[Iterable[str]] = None,
    run_all: bool = False,
    jobs: int = 1,
    tmp_dir: bool = False,
    queue: bool = False,
    copy_paths: Optional[Iterable[str]] = None,
    message: Optional[str] = None,
    no_hydra: bool = False,
    **kwargs,
) -> dict[str, str]:
    ...
```
### Import

```python
from dvc.repo.experiments.run import run
```
## I/O Contract
### Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| repo | Repo | Yes | The DVC repository instance. Provides access to SCM, experiment queues, and configuration. |
| targets | Optional[Iterable[str]] | No | Pipeline stage targets to reproduce. If None, all stages in the pipeline are reproduced. |
| params | Optional[Iterable[str]] | No | Parameter override strings in the format "file:key=value". Parsed into a path-to-overrides mapping via to_path_overrides. |
| run_all | bool | No | If True, reproduces all currently queued experiments via Celery workers. Defaults to False. |
| jobs | int | No | Number of parallel Celery workers to start when run_all=True. Defaults to 1. |
| tmp_dir | bool | No | If True, runs the experiment in a temporary directory for isolation. Defaults to False. |
| queue | bool | No | If True, queues the experiment for later execution instead of running immediately. Defaults to False. |
| copy_paths | Optional[Iterable[str]] | No | Additional file paths to copy into the experiment workspace during execution. |
| message | Optional[str] | No | Custom commit message for the experiment's Git commit. |
| no_hydra | bool | No | If True, disables Hydra configuration composition even if Hydra is enabled in the repo config. Defaults to False. |
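The "file:key=value" convention for `params` can be illustrated with a minimal parser. This is a simplified sketch of the grouping behavior described above, not DVC's actual `to_path_overrides` (the real helper is more involved, e.g. around Hydra integration); `parse_param_overrides` is a hypothetical name.

```python
from collections import defaultdict

def parse_param_overrides(params, default_file="params.yaml"):
    """Group "file:key=value" strings into a {path: [overrides]} mapping."""
    grouped = defaultdict(list)
    for entry in params:
        path, sep, override = entry.partition(":")
        if sep and "=" not in path:
            # Explicit file prefix before the first ":"
            grouped[path].append(override)
        else:
            # No file prefix: fall back to the default params file
            grouped[default_file].append(entry)
    return dict(grouped)
```

For example, `["params.yaml:train.lr=0.001", "train.epochs=10"]` would group both overrides under `params.yaml`.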
### Outputs

| Name | Type | Description |
|---|---|---|
| return | dict[str, str] | Dictionary mapping experiment revision SHAs to experiment names/hashes. Empty dict if experiments were queued (results are collected later). |
## Usage Examples
### Basic Usage: Run Experiment Immediately

```python
from dvc.repo import Repo
from dvc.repo.experiments.run import run

with Repo() as repo:
    results = run(
        repo,
        params=["params.yaml:train.lr=0.001"],
    )
    for exp_rev, exp_name in results.items():
        print(f"Experiment {exp_name}: {exp_rev[:7]}")
```
### Queue Experiments with Sweep

```python
from dvc.repo import Repo
from dvc.repo.experiments.run import run

with Repo() as repo:
    # Queue multiple experiments via parameter sweep
    run(
        repo,
        params=[
            "params.yaml:train.lr=choice(0.001,0.01,0.1)",
            "params.yaml:train.batch_size=choice(32,64)",
        ],
        queue=True,
        name="grid-search",
    )
    # 6 experiments queued as "grid-search-1" through "grid-search-6"
```
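The six queued experiments come from the Cartesian product over the two choice(...) axes (3 learning rates x 2 batch sizes). A minimal sketch of that expansion is below; DVC delegates the real parsing to Hydra's override grammar, and `expand_choice_sweep` is purely illustrative.

```python
import itertools
import re

def expand_choice_sweep(overrides):
    """Expand choice(...) values into one override list per experiment."""
    axes = []
    for ov in overrides:
        key, _, value = ov.partition("=")
        match = re.fullmatch(r"choice\((.*)\)", value)
        # A choice(...) value contributes one axis; plain values are fixed
        values = match.group(1).split(",") if match else [value]
        axes.append([f"{key}={v}" for v in values])
    # Cartesian product: each combination becomes one queued experiment
    return [list(combo) for combo in itertools.product(*axes)]
```

Applied to the two overrides in the example above, this yields the six combinations that back "grid-search-1" through "grid-search-6".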
### Run All Queued Experiments with Parallel Workers

```python
from dvc.repo import Repo
from dvc.repo.experiments.run import run

with Repo() as repo:
    # Execute all queued experiments with 4 parallel workers
    results = run(repo, run_all=True, jobs=4)
    for exp_rev, exp_name in results.items():
        print(f"Completed: {exp_name} ({exp_rev[:7]})")
```
### Run in Temporary Directory

```python
from dvc.repo import Repo
from dvc.repo.experiments.run import run

with Repo() as repo:
    # Run the experiment in an isolated temp dir
    results = run(
        repo,
        params=["params.yaml:train.lr=0.005"],
        tmp_dir=True,
    )
```