Implementation:Iterative Dvc Commands Stage

From Leeroopedia
Revision as of 15:18, 16 February 2026 by Admin (auto-imported from implementations/Iterative_Dvc_Commands_Stage.md)


Knowledge Sources
Domains: CLI, Pipeline_Management
Last Updated: 2026-02-10 10:00 GMT

Overview

The dvc.commands.stage module is the concrete implementation of the dvc stage commands, which list and add DVC pipeline stages from the command line.

Description

The dvc.commands.stage module implements the CLI commands for managing DVC pipeline stages. It provides two primary command classes: CmdStageList for listing existing stages defined in the project and CmdStageAdd for adding new stages to the pipeline definition file (dvc.yaml). These commands handle argument parsing, validation of stage parameters (dependencies, outputs, metrics, plots), and delegation to the underlying repository layer for pipeline manipulation. The module bridges user-facing CLI interactions with the internal stage management logic.
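The split between argument parsing, validation, and delegation described above can be illustrated with a small stand-in. This is a minimal sketch, not DVC's own parser: build_parser and validate_add are hypothetical names, only a subset of the documented flags is wired up, and the real module hands the parsed arguments to the repository layer rather than stopping here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser for `stage list` and `stage add` (hypothetical, not DVC's)."""
    parser = argparse.ArgumentParser(prog="stage")
    sub = parser.add_subparsers(dest="action", required=True)

    # `stage list`: two flags and an optional target.
    list_p = sub.add_parser("list")
    list_p.add_argument("--all", action="store_true")
    list_p.add_argument("--name-only", action="store_true")
    list_p.add_argument("target", nargs="?", default=None)

    # `stage add`: repeatable options are collected into lists.
    add_p = sub.add_parser("add")
    add_p.add_argument("-n", "--name", required=True)
    add_p.add_argument("-d", "--deps", action="append", default=[])
    add_p.add_argument("-o", "--outs", action="append", default=[])
    add_p.add_argument("-m", "--metrics", action="append", default=[])
    add_p.add_argument("-p", "--params", action="append", default=[])
    add_p.add_argument("-w", "--wdir")
    add_p.add_argument("-f", "--force", action="store_true")
    # Everything after the options is treated as the stage's shell command.
    add_p.add_argument("command", nargs=argparse.REMAINDER)
    return parser


def validate_add(args: argparse.Namespace) -> None:
    """Reject an `add` invocation that has no command to run."""
    if not args.command:
        raise ValueError("a stage command is required")


args = build_parser().parse_args(
    ["add", "-n", "train", "-d", "src/train.py", "-o", "model.pkl",
     "python", "src/train.py"]
)
validate_add(args)
print(args.name, args.deps, args.outs, args.command)
```

In a real command class, the validated namespace would be passed on to the stage management logic instead of being printed.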

Usage

Use dvc stage list to inspect the stages currently defined in a DVC project, and dvc stage add to define a new pipeline stage together with its command, dependencies, outputs, metrics, plots, and parameter dependencies. Together they form the command-line route for building and maintaining reproducible ML pipelines; dvc stage add writes the same dvc.yaml entries that could otherwise be edited by hand.

Code Reference

Source Location

Signature

class CmdStageList(CmdBase):
    """Command handler for 'dvc stage list'.

    Lists the stages defined in the DVC pipeline, optionally restricted
    to a target, showing either names only or a short description of
    each stage's dependencies and outputs.
    """

    def run(self) -> int:
        """Executes the stage listing logic.

        Returns:
            int: 0 on success, non-zero on failure.
        """

class CmdStageAdd(CmdBase):
    """Command handler for 'dvc stage add'.

    Adds a new stage to dvc.yaml with the specified name, command,
    dependencies, outputs, metrics, plots, and parameters.
    """

    def run(self) -> int:
        """Executes the stage addition logic.

        Returns:
            int: 0 on success, non-zero on failure.
        """

Import

from dvc.commands.stage import CmdStageList, CmdStageAdd

I/O Contract

Inputs (CmdStageList)

Name | Type | Required | Description
--all | flag | No | List stages from all pipelines in the repository, not just dvc.yaml in the current directory
--name-only | flag | No | Display only stage names without additional details
target | str | No | Specific dvc.yaml file or directory to list stages from

Inputs (CmdStageAdd)

Name | Type | Required | Description
-n, --name | str | Yes | Name of the stage to add
command | str | Yes | Shell command to execute for this stage
-d, --deps | str (repeatable) | No | Dependency file paths for this stage
-o, --outs | str (repeatable) | No | Output file paths produced by this stage
-m, --metrics | str (repeatable) | No | Metrics file paths to track
--metrics-no-cache | str (repeatable) | No | Metrics files that should not be cached
--plots | str (repeatable) | No | Plot file paths to track
--plots-no-cache | str (repeatable) | No | Plot files that should not be cached
-p, --params | str (repeatable) | No | Parameter dependencies (file:key format)
-w, --wdir | str | No | Working directory for the stage command
-f, --force | flag | No | Overwrite an existing stage with the same name
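The file:key format accepted by -p can be made concrete with a short parser. This is a sketch under stated assumptions: parse_params_spec is a hypothetical helper, not a DVC function, and the fallback to params.yaml when no file is given reflects DVC's documented default rather than anything stated in the table above.

```python
from typing import List, Tuple

# Assumption: params.yaml is the file used when the spec names no file.
DEFAULT_PARAMS_FILE = "params.yaml"


def parse_params_spec(spec: str) -> Tuple[str, List[str]]:
    """Split a '[file:]key1,key2' spec into (file, [keys])."""
    # rpartition splits on the last ':' so the keys are always the final part.
    path, _, keys = spec.rpartition(":")
    names = [key.strip() for key in keys.split(",") if key.strip()]
    return (path or DEFAULT_PARAMS_FILE, names)


print(parse_params_spec("params.yaml:train.lr,train.epochs"))
print(parse_params_spec("prepare.split_ratio"))
```

Because -p is repeatable, a caller would apply this helper to each occurrence and group the resulting keys by file.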

Outputs

Name | Type | Description
CmdStageList output | stdout text | Formatted list of stages with optional dependency and output details
CmdStageAdd output | dvc.yaml modification | New stage entry written to the pipeline definition file
Return code | int | 0 on success, non-zero on failure
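A script that drives these commands can rely on the return-code contract above. The following is a minimal sketch: stage_list_argv and run_cli are hypothetical helper names, and run_cli assumes a dvc executable is available on PATH and that it is called from inside a DVC project.

```python
import subprocess
from typing import List, Optional, Sequence, Tuple


def stage_list_argv(all_pipelines: bool = False,
                    name_only: bool = False,
                    target: Optional[str] = None) -> List[str]:
    """Build the argument vector for `dvc stage list` from the documented inputs."""
    argv = ["dvc", "stage", "list"]
    if all_pipelines:
        argv.append("--all")
    if name_only:
        argv.append("--name-only")
    if target:
        argv.append(target)
    return argv


def run_cli(argv: Sequence[str]) -> Tuple[int, str]:
    """Run a command and return (return code, stdout) without raising on failure."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    return proc.returncode, proc.stdout


argv = stage_list_argv(all_pipelines=True, name_only=True)
print(argv)
```

Calling run_cli(argv) returns a code of 0 on success and non-zero on failure, so the caller can branch on the first element of the tuple before reading the stage list from the second.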

Usage Examples

List All Stages

# List all stages defined in the current project
dvc stage list

# Example output (exact formatting may differ between DVC versions):
# prepare    Outputs: data/prepared
# featurize  Outputs: data/features
# train      Outputs: model.pkl
# evaluate   Outputs: eval/

# List stage names only across all pipelines
dvc stage list --all --name-only

Add a Training Stage

# Add a new training stage with dependencies, outputs, params, and metrics
dvc stage add -n train \
    -d src/train.py \
    -d data/features \
    -p params.yaml:train.lr,train.epochs \
    -o model.pkl \
    -m metrics.json \
    python src/train.py
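To show what the command above is expected to record, the sketch below assembles the corresponding stage entry as a plain mapping using the dvc.yaml section names (cmd, deps, params, outs, metrics). stage_entry is a hypothetical helper, the output is printed as JSON only to avoid a third-party YAML dependency, and the ordering of keys and list items in a real dvc.yaml may differ.

```python
import json
from typing import Dict, List


def stage_entry(cmd: str, deps: List[str], params: List[str],
                outs: List[str], metrics: List[str]) -> Dict[str, object]:
    """Assemble a dvc.yaml-style stage mapping, skipping empty sections."""
    entry: Dict[str, object] = {"cmd": cmd}
    for key, values in (("deps", deps), ("params", params),
                        ("outs", outs), ("metrics", metrics)):
        if values:
            entry[key] = list(values)
    return entry


pipeline = {
    "stages": {
        "train": stage_entry(
            cmd="python src/train.py",
            deps=["src/train.py", "data/features"],
            # Keys read from the default params.yaml are listed as bare names.
            params=["train.lr", "train.epochs"],
            outs=["model.pkl"],
            metrics=["metrics.json"],
        )
    }
}
print(json.dumps(pipeline, indent=2))
```

Comparing this mapping with the train entry in dvc.yaml after running the command is a quick way to confirm that every -d, -p, -o, and -m option was picked up.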

Add a Data Preparation Stage

# Add a data preparation stage
dvc stage add -n prepare \
    -d src/prepare.py \
    -d data/raw \
    -o data/prepared \
    -p params.yaml:prepare.split_ratio \
    python src/prepare.py

# Overwrite an existing stage
dvc stage add -n prepare --force \
    -d src/prepare_v2.py \
    -d data/raw \
    -o data/prepared \
    python src/prepare_v2.py
