Implementation:Iterative Dvc Hydra Sweeps
| Knowledge Sources | |
|---|---|
| Domains | Experiment_Management, Hyperparameter_Tuning |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Concrete tool for generating combinatorial parameter sweeps and applying parameter overrides to configuration files, provided by the DVC library as a wrapper around the Hydra configuration framework.
Description
The dvc.utils.hydra module provides two primary functions that bridge DVC's experiment management system with the Hydra configuration framework. get_hydra_sweeps takes a dictionary mapping parameter file paths to lists of override strings and expands any sweep overrides (e.g., choice(), range()) into a list of concrete parameter dictionaries via Cartesian product expansion. apply_overrides takes a file path and a list of override strings and modifies the file in place, using Hydra's override parser and OmegaConf to apply structured configuration changes.
These functions wrap Hydra's OverridesParser, BasicSweeper, and ConfigLoaderImpl, handling the translation between DVC's parameter file format (YAML, JSON, TOML) and Hydra's internal configuration representation. The module also includes helpers such as to_hydra_overrides for parsing override strings and dict_product for computing the Cartesian product of per-path sweep dimensions.
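The Cartesian product over per-path sweep dimensions can be illustrated with the standard library alone. The following is a minimal sketch of what a helper like dict_product might do, not a copy of DVC's code; the name dict_product_sketch and the sample data are assumptions made for illustration.

```python
import itertools


def dict_product_sketch(dicts):
    """Return one dict per combination, picking one value from each key's list."""
    keys = list(dicts)
    return [dict(zip(keys, combo)) for combo in itertools.product(*dicts.values())]


# Hypothetical per-path sweep dimensions: each path maps to a list of
# already-concrete override lists (one inner list per sweep point).
path_sweeps = {
    "params.yaml": [["model.hidden_dim=128"], ["model.hidden_dim=256"]],
    "train_config.yaml": [["optimizer=adam"], ["optimizer=sgd"]],
}

combos = dict_product_sketch(path_sweeps)
for combo in combos:
    print(combo)
```

Each resulting dictionary keeps every path as a key, which matches the return shape described for get_hydra_sweeps.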
Usage
Import and use these functions when:
- You need to expand parameter sweep overrides into individual experiment configurations before queuing
- You need to apply parameter modifications to a configuration file on disk as part of experiment preparation
- You are building tooling that integrates with DVC's --set-param CLI interface
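Tooling built around --set-param usually has to group the raw CLI values by target file before passing them to get_hydra_sweeps. The sketch below assumes values of the form [<path>:]<override> and a default file of params.yaml when no path prefix is given; group_set_params is a hypothetical helper written for this example, not a DVC API.

```python
from collections import defaultdict

# Assumption: overrides without a "<path>:" prefix target this file.
DEFAULT_PARAMS_FILE = "params.yaml"


def group_set_params(values):
    """Group "[<path>:]<override>" strings into {path: [override, ...]}."""
    grouped = defaultdict(list)
    for value in values:
        # Only look for ":" before the "=", so values containing ":" are left alone.
        key_part = value.partition("=")[0]
        if ":" in key_part:
            path, _, override = value.partition(":")
        else:
            path, override = DEFAULT_PARAMS_FILE, value
        grouped[path].append(override)
    return dict(grouped)


path_overrides = group_set_params(
    [
        "train.lr=choice(0.001,0.01)",
        "train_config.yaml:optimizer=adam",
    ]
)
print(path_overrides)
```

The resulting dictionary has the dict[str, list[str]] shape that get_hydra_sweeps expects as input.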
Code Reference
Source Location
- Repository: DVC
- File: dvc/utils/hydra.py
- Lines: L136-149 (get_hydra_sweeps), L83-120 (apply_overrides)
Signature
def get_hydra_sweeps(
    path_overrides: dict[str, list[str]],
) -> list[dict]:
    ...

def apply_overrides(
    path: "StrPath",
    overrides: list[str],
) -> None:
    ...
Import
from dvc.utils.hydra import get_hydra_sweeps, apply_overrides
I/O Contract
Inputs
get_hydra_sweeps:
| Name | Type | Required | Description |
|---|---|---|---|
| path_overrides | dict[str, list[str]] | Yes | Dictionary mapping parameter file paths to lists of Hydra override strings. Each override string follows the Hydra override grammar (e.g., "lr=0.01", "lr=choice(0.001,0.01)"). Sweep overrides are expanded via Cartesian product. |
apply_overrides:
| Name | Type | Required | Description |
|---|---|---|---|
| path | StrPath | Yes | Path to the parameter file to modify (supports YAML, JSON, TOML via suffix-based dispatch). |
| overrides | list[str] | Yes | List of Hydra override strings to apply to the file contents. Each override targets a key path and sets a new value. |
Outputs
get_hydra_sweeps:
| Name | Type | Description |
|---|---|---|
| return | list[dict] | List of dictionaries, where each dictionary maps parameter file paths to lists of concrete (non-sweep) override strings. Each dictionary represents one point in the Cartesian product of all sweep dimensions. |
apply_overrides:
| Name | Type | Description |
|---|---|---|
| return | None | The function modifies the parameter file at path in place. No return value. |
Usage Examples
Basic Usage: Generating Sweep Combinations
from dvc.utils.hydra import get_hydra_sweeps
# Define overrides with sweep syntax
path_overrides = {
    "params.yaml": [
        "train.lr=choice(0.001,0.01,0.1)",
        "train.batch_size=choice(32,64)",
    ]
}
# Expand into individual experiment configurations
sweeps = get_hydra_sweeps(path_overrides)
# Returns 6 dictionaries (3 lr values x 2 batch_size values),
# each mapping "params.yaml" to a list of concrete overrides like:
# [
#     {"params.yaml": ["train.lr=0.001", "train.batch_size=32"]},
#     {"params.yaml": ["train.lr=0.001", "train.batch_size=64"]},
#     {"params.yaml": ["train.lr=0.01", "train.batch_size=32"]},
#     ...
# ]
Basic Usage: Applying Overrides to a File
from dvc.utils.hydra import apply_overrides
# Apply single-point overrides to a YAML params file
apply_overrides("params.yaml", ["train.lr=0.001", "train.epochs=50"])
# The file params.yaml is now updated with the new values
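To make the effect of single-point overrides concrete, the sketch below sets dotted keys on a nested dictionary. It is a deliberately simplified stand-in: it does not use Hydra's override parser or OmegaConf, handles only plain key=value strings, and never touches a file. The names parse_value and apply_simple_overrides are hypothetical.

```python
import ast


def parse_value(text):
    """Interpret numbers and other Python literals; fall back to the raw string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_simple_overrides(data, overrides):
    """Set each "a.b.c=value" override on the nested dict `data` in place."""
    for override in overrides:
        dotted_key, _, raw = override.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = data
        for name in parents:
            # Simplification: missing intermediate sections are created on demand.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return data


# In-memory stand-in for the contents of a params file.
params = {"train": {"lr": 0.01, "epochs": 10, "optimizer": "sgd"}}
apply_simple_overrides(params, ["train.lr=0.001", "train.epochs=50"])
print(params)
```

Keys that are not named in an override, such as train.optimizer here, are left unchanged.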
Advanced Usage: Multi-File Sweeps
from dvc.utils.hydra import get_hydra_sweeps
# Sweep across parameters in multiple config files
path_overrides = {
    "params.yaml": ["model.hidden_dim=choice(128,256)"],
    "train_config.yaml": ["optimizer=choice(adam,sgd)"],
}
sweeps = get_hydra_sweeps(path_overrides)
# Returns 4 combinations (2 hidden_dim x 2 optimizer),
# each dict has keys for both param files
for sweep in sweeps:
    print(sweep)
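One way to consume a single sweep point is to hand each path and its concrete overrides to an apply function. The sketch below injects that function as a parameter so it runs without DVC installed; apply_combination and the hand-written sample list are assumptions for illustration, with the sample following the return shape documented above.

```python
def apply_combination(combination, apply_fn):
    """Apply one sweep point by calling apply_fn(path, overrides) for each file."""
    for path, overrides in combination.items():
        apply_fn(path, overrides)


# Hand-written sample in the shape get_hydra_sweeps is documented to return.
sample_sweeps = [
    {"params.yaml": ["model.hidden_dim=128"], "train_config.yaml": ["optimizer=adam"]},
    {"params.yaml": ["model.hidden_dim=128"], "train_config.yaml": ["optimizer=sgd"]},
    {"params.yaml": ["model.hidden_dim=256"], "train_config.yaml": ["optimizer=adam"]},
    {"params.yaml": ["model.hidden_dim=256"], "train_config.yaml": ["optimizer=sgd"]},
]

# Record the calls instead of modifying files, so the sketch has no side effects.
calls = []
apply_combination(sample_sweeps[0], lambda path, overrides: calls.append((path, overrides)))
print(calls)
```

In real use, apply_fn could be dvc.utils.hydra.apply_overrides. Because that function modifies files in place, each combination would presumably need its own workspace or experiment rather than being applied back to back on the same files.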