Implementation:Iterative Dvc Repo Commit

From Leeroopedia


Knowledge Sources
Domains Data_Management, Version_Control
Last Updated 2026-02-10 10:00 GMT

Overview

dvc/repo/commit.py (154 lines) provides the commit functionality for DVC repositories. It includes the main commit() function (decorated with @locked), the commit_2_to_3() migration function for upgrading legacy hashes to DVC 3.0 format, and the prompt_to_commit() helper for interactive confirmation.

from dvc.repo.commit import commit

Source File

Property    Value
File        dvc/repo/commit.py
Lines       154
Functions   commit, commit_2_to_3, prompt_to_commit, _prepare_message, _migrateable_dvcfiles

Function: commit

@locked
def commit(
    self,
    target=None,
    with_deps=False,
    recursive=False,
    force=False,
    allow_missing=False,
    data_only=False,
    relink=True,
)

The primary commit function. It is decorated with @locked, so it runs while holding the repository lock, which keeps other DVC processes from modifying the repository at the same time.
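The locking pattern can be illustrated with a minimal sketch. This is not DVC's implementation: DVC's lock works across processes, while the sketch swaps in an in-process threading.RLock, and FakeRepo and lock_depth are hypothetical names used only for illustration.

```python
import threading
from functools import wraps


class FakeRepo:
    """Hypothetical stand-in for a DVC Repo; it only carries a lock."""

    def __init__(self):
        # RLock so that one locked function may call another locked function.
        self.lock = threading.RLock()
        self.lock_depth = 0  # how many times the lock is currently held


def locked(func):
    """Run func while holding the repo's lock (illustrative only)."""

    @wraps(func)
    def wrapper(repo, *args, **kwargs):
        with repo.lock:
            repo.lock_depth += 1
            try:
                return func(repo, *args, **kwargs)
            finally:
                repo.lock_depth -= 1

    return wrapper


@locked
def commit(repo, target=None):
    # Inside the body the lock is held, so lock_depth is at least 1.
    return repo.lock_depth, target
```

The decorator keeps the wrapped function's signature and name intact (via functools.wraps), so callers use commit() as usual and never handle the lock themselves.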

Parameters

Parameter       Default   Description
target          None      Specific stage or DVC file to commit (all if None)
with_deps       False     Include upstream dependencies
recursive       False     Recursively collect stages from subdirectories
force           False     Skip the interactive confirmation prompt
allow_missing   False     Allow committing stages with missing outputs
data_only       False     Commit only data-source stages (those created by dvc add or dvc import)
relink          True      Re-create file links after commit

Algorithm

  1. Collects stages using self.stage.collect_granular(), filtering by data_only flag
  2. Groups collected stage info objects by their DVC file (stage.dvcfile) using itertools.groupby
  3. For each DVC file group:
    • If force=True: saves the stage directly with stage.save(allow_missing=...)
    • If force=False: checks for changed entries, prompts the user for confirmation via prompt_to_commit(), then saves
    • Calls stage.commit() to finalize each stage's outputs
    • Collects stages to dump
  4. Calls dvcfile.dump_stages() to persist the committed stages (without updating the pipeline definition)
  5. Returns the list of committed stages
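The steps above can be sketched with plain Python objects. This is an illustration under stated assumptions, not the code in dvc/repo/commit.py: FakeStage, commit_sketch, and the confirm callback are hypothetical, the error type is a placeholder, and sorting before grouping is added here because itertools.groupby only merges adjacent items.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class FakeStage:
    """Hypothetical stand-in for a DVC stage."""

    name: str
    dvcfile: str            # path of the DVC file that defines the stage
    changed: bool = False   # stand-in for "has changed entries"
    saved: bool = False
    committed: bool = False

    def save(self):
        self.saved = True

    def commit(self):
        self.committed = True


def commit_sketch(stages, force=False, confirm=lambda stage: True):
    dumped = {}  # DVC file path -> names of stages written back to it
    # groupby only merges adjacent items, so sort by the same key first.
    ordered = sorted(stages, key=lambda s: s.dvcfile)
    for dvcfile, group in groupby(ordered, key=lambda s: s.dvcfile):
        to_dump = []
        for stage in group:
            if force:
                stage.save()
            elif stage.changed:
                # Ask before saving a changed stage; abort if declined.
                if not confirm(stage):
                    raise RuntimeError(f"unable to commit changed {stage.name}")
                stage.save()
            stage.commit()
            to_dump.append(stage.name)
        # One write per DVC file, after all of its stages are committed.
        dumped[dvcfile] = to_dump
    return ordered, dumped
```

Grouping by DVC file means each file is written once, however many of its stages were committed.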

Function: prompt_to_commit

def prompt_to_commit(stage, changes, force=False)

Presents an interactive confirmation prompt when changes are detected in a stage. If the user declines (and force is not True), raises StageCommitError.

The prompt message is built by _prepare_message(), which constructs a human-readable description based on which components changed:

  • Dependencies and outputs changed
  • Only dependencies changed
  • Only outputs changed
  • Stage definition itself changed
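The confirm-or-raise behaviour can be sketched as follows. The confirm callable is injected here so the example runs without a terminal; that parameter, the error text, and the local StageCommitError class are assumptions made for illustration, not DVC's actual signature.

```python
class StageCommitError(Exception):
    """Local stand-in for the DVC exception of the same name."""


def prompt_to_commit_sketch(stage, message, confirm, force=False):
    # `or` short-circuits, so with force=True the question is never asked.
    if not (force or confirm(message)):
        raise StageCommitError(f"unable to commit changed {stage}")
```

A caller would pass something like `lambda msg: input(msg + " [y/n] ") == "y"` as confirm. Returning normally means the commit may proceed.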

Function: commit_2_to_3

@locked
@scm_context
def commit_2_to_3(repo: "Repo", dry: bool = False)

Migrates legacy DVC 2.x outputs (using md5-dos2unix hashes) to DVC 3.0 format. This function is decorated with both @locked (for process safety) and @scm_context (for SCM integration).

Algorithm

  1. Creates a targets view filtered to outputs using the md5-dos2unix hash name
  2. Identifies migratable DVC files via _migrateable_dvcfiles()
  3. If no files need migration, informs the user and returns
  4. If dry=True, lists the files that would be migrated without making changes
  5. Otherwise, iterates over stages and:
    • Updates legacy hash names on matching outputs via out.update_legacy_hash_name(force=True)
    • Updates legacy hash names on in-repo dependencies (for non-import stages)
    • Saves and commits each modified stage
    • Dumps the updated stage file
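The control flow of the migration (nothing to do, dry run, actual update) can be sketched with stand-in objects. FakeOut, FakeStage, and commit_2_to_3_sketch are hypothetical, the new hash name "md5" is an assumption, and the stand-in update_legacy_hash_name() only relabels the hash; saving, committing, and dumping are omitted.

```python
from dataclasses import dataclass, field

LEGACY_HASH = "md5-dos2unix"  # legacy hash name named in the text
NEW_HASH = "md5"              # assumption: hash name used after migration


@dataclass
class FakeOut:
    """Hypothetical stand-in for a stage output."""

    path: str
    hash_name: str

    def update_legacy_hash_name(self):
        # Stand-in for the real method: only relabels the hash.
        if self.hash_name == LEGACY_HASH:
            self.hash_name = NEW_HASH


@dataclass
class FakeStage:
    """Hypothetical stand-in for a stage and the DVC file defining it."""

    dvcfile: str
    outs: list = field(default_factory=list)


def commit_2_to_3_sketch(stages, dry=False):
    # Keep only stages that still have at least one legacy output.
    legacy = [s for s in stages if any(o.hash_name == LEGACY_HASH for o in s.outs)]
    files = sorted({s.dvcfile for s in legacy})
    if not files:
        return "nothing to migrate", files
    if dry:
        # Report what would change without touching any output.
        return "would migrate", files
    for stage in legacy:
        for out in stage.outs:
            out.update_legacy_hash_name()
    return "migrated", files
```

Running the sketch twice on the same stages reports nothing to migrate the second time, since no output still carries the legacy hash name.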

Function: _migrateable_dvcfiles (private)

def _migrateable_dvcfiles(view: "IndexView") -> set[str]

Scans the given IndexView and returns the set of DVC file paths that contain outputs or dependencies using the legacy md5-dos2unix hash name. For ProjectFile instances, the result includes both the DVC file and its lockfile path.
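A minimal sketch of the scan, using tuples in place of an IndexView. The input shape, the function name, and deriving the lockfile by swapping the suffix to .lock (dvc.yaml to dvc.lock) are assumptions made for illustration.

```python
from pathlib import PurePosixPath

LEGACY_HASH = "md5-dos2unix"  # legacy hash name named in the text


def migrateable_sketch(entries):
    """entries: iterable of (dvcfile_path, is_project_file, hash_names)."""
    result = set()
    for path, is_project_file, hash_names in entries:
        if LEGACY_HASH not in hash_names:
            continue  # nothing legacy in this file
        result.add(path)
        if is_project_file:
            # Assumption: the lockfile sits next to the project file.
            result.add(str(PurePosixPath(path).with_suffix(".lock")))
    return result
```

Returning a set means a file is listed once even if several of its outputs or dependencies use the legacy hash name.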

Function: _prepare_message (private)

def _prepare_message(stage, changes) -> str

Builds a human-readable confirmation message based on the changes tuple (changed_deps, changed_outs, changed_stage). Appends "Are you sure you want to commit it?" to the description.
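The branching on the changes tuple can be sketched as below. Only the trailing question is taken from the description above; the wording of each branch and the function name are assumptions.

```python
def prepare_message_sketch(stage, changes):
    changed_deps, changed_outs, changed_stage = changes
    if changed_deps and changed_outs:
        msg = f"dependencies {changed_deps} and outputs {changed_outs} of {stage} changed."
    elif changed_deps:
        msg = f"dependencies {changed_deps} of {stage} changed."
    elif changed_outs:
        msg = f"outputs {changed_outs} of {stage} changed."
    else:
        # Neither deps nor outs changed: report the stage-level change.
        msg = str(changed_stage)
    return msg + " Are you sure you want to commit it?"
```

The order of the checks matters: the combined case is tested first so that a stage with both kinds of change gets a single message rather than the dependencies-only one.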

Key Dependencies

Module                                  Usage
dvc.repo.locked (@locked)               Ensures exclusive process access during commit
dvc.repo.scm_context (@scm_context)     SCM integration for the migration function
dvc.prompt                              Interactive user confirmation
dvc.stage.exception.StageCommitError    Raised when the user declines the commit
dvc.dvcfile.ProjectFile                 Project-level DVC file handling
itertools.groupby                       Grouping stages by DVC file
dvc.ui                                  User-facing output messages
