Environment:Iterative Dvc Git SCM Environment

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Version_Control
Last Updated: 2026-02-10 10:00 GMT

Overview

A Git repository with `scmrepo` integration, required for DVC pipeline, experiment, and data-tracking operations.

Description

DVC operates on top of Git repositories and requires a functioning Git installation for most operations. The SCM (Source Code Management) layer is provided by the `scmrepo` library, which abstracts Git operations. DVC also depends on `dulwich` (a pure-Python Git implementation), which handles operations that do not require a system Git binary. Experiments rely heavily on Git stashing, branching, and ref management.

Usage

Use this environment for all DVC operations that involve Git integration: data tracking (`dvc add` stages the generated `.dvc` files when the `core.autostage` option is enabled), pipeline reproduction (lockfile commits), experiment management (Git-based experiment isolation), and artifact versioning. A Git repository must be initialized before `dvc init`, unless the project is deliberately created without SCM via `dvc init --no-scm`.
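
The ordering requirement above (Git repository first, then `dvc init`) can be checked before running any DVC command. The following is a minimal sketch using only the standard library; the helper names `find_root` and `ready_for_dvc_init` are hypothetical and are not part of DVC or scmrepo.

```python
from pathlib import Path
from typing import Optional


def find_root(start: Path, marker: str) -> Optional[Path]:
    """Return the nearest ancestor of `start` (inclusive) that contains `marker`."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        # `.git` may be a directory or, in worktrees and submodules, a plain file,
        # so test for existence rather than for a directory.
        if (candidate / marker).exists():
            return candidate
    return None


def ready_for_dvc_init(path: Path) -> bool:
    """True when `path` is inside a Git work tree that has no `.dvc/` yet."""
    return find_root(path, ".git") is not None and find_root(path, ".dvc") is None


if __name__ == "__main__":
    here = Path.cwd()
    print("Git root:", find_root(here, ".git"))
    print("DVC root:", find_root(here, ".dvc"))
```

This only inspects the filesystem layout. It does not verify that the repository is valid, which would require calling Git itself.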

System Requirements

| Category | Requirement | Notes |
| --- | --- | --- |
| Git | >= 2.0 | System `git` binary; dulwich used as fallback for some operations |
| OS | Linux, macOS, or Windows | Windows path handling has special considerations |
| Disk | Varies | Git history + DVC metadata stored in `.dvc/` and `.git/` |

Dependencies

System Packages

  • `git` >= 2.0 (system binary)
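
One way to verify the system Git requirement is to parse the output of `git --version`. The sketch below assumes the usual `git version X.Y.Z` output format; the function names are illustrative, and the `(2, 0)` minimum is taken from the requirement listed above.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

MIN_GIT = (2, 0)  # minimum version from the requirement above


def parse_git_version(output: str) -> Tuple[int, ...]:
    """Extract the numeric version from `git --version` output."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError(f"unrecognized git version output: {output!r}")
    return tuple(int(part) for part in match.groups() if part is not None)


def system_git_version() -> Optional[Tuple[int, ...]]:
    """Return the system Git version, or None when no `git` binary is on PATH."""
    if shutil.which("git") is None:
        return None
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=True
    )
    return parse_git_version(result.stdout)


def meets_minimum(version: Tuple[int, ...]) -> bool:
    """Compare a parsed version against MIN_GIT using tuple ordering."""
    return version >= MIN_GIT


if __name__ == "__main__":
    found = system_git_version()
    if found is None:
        print("git not found on PATH")
    else:
        print("git", ".".join(map(str, found)), "ok" if meets_minimum(found) else "too old")
```

Vendor suffixes such as `(Apple Git-146)` or `.windows.1` are ignored because only the first dotted number group is read.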

Python Packages

  • `scmrepo` >= 3.5.2, < 4 — Git abstraction layer
  • `dulwich` — Pure-Python Git implementation
  • `gto` >= 1.6.0, < 2 — Git Tag Operations for artifact versioning
  • `gitpython` (transitive via scmrepo)
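
The version ranges above can be checked against the installed packages with `importlib.metadata`. This is a rough sketch: it compares only the leading numeric release segment and ignores pre-release ordering, so a real tool would more likely rely on the `packaging` library. The `REQUIRED` table and helper names are assumptions for illustration.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Tuple

Bounds = Tuple[Tuple[int, ...], Tuple[int, ...]]

# (inclusive lower bound, exclusive upper bound), copied from the list above.
REQUIRED: Dict[str, Bounds] = {
    "scmrepo": ((3, 5, 2), (4,)),
    "gto": ((1, 6, 0), (2,)),
}


def release_tuple(text: str) -> Tuple[int, ...]:
    """Parse the leading dotted integers of a version string, e.g. '3.5.2rc1' -> (3, 5, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    return tuple(int(piece) for piece in match.group(0).split("."))


def in_range(installed: str, low: Tuple[int, ...], high: Tuple[int, ...]) -> bool:
    """True when low <= installed < high under tuple ordering."""
    return low <= release_tuple(installed) < high


def check_installed() -> Dict[str, Optional[bool]]:
    """Map each package to True/False, or None when it is not installed."""
    report: Dict[str, Optional[bool]] = {}
    for name, (low, high) in REQUIRED.items():
        try:
            report[name] = in_range(version(name), low, high)
        except PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for package, status in check_installed().items():
        print(package, {True: "ok", False: "out of range", None: "not installed"}[status])
```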

Credentials

  • Standard Git credentials (SSH keys, HTTPS tokens) for remote repository access
  • `GIT_AUTHOR_NAME` / `GIT_AUTHOR_EMAIL`: Used by experiment commits
  • `PRE_COMMIT_CHECKOUT_TYPE`: Not a credential but an environment variable set by the pre-commit tool for post-checkout hooks (read by `dvc/commands/git_hook.py`)
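
To illustrate how a hook might use `PRE_COMMIT_CHECKOUT_TYPE`, the sketch below resolves the post-checkout flag from either the environment variable or the hook's positional arguments. Git passes three arguments to a post-checkout hook (previous HEAD, new HEAD, and a flag that is `1` for a branch checkout and `0` for a file checkout), and pre-commit exposes that flag through the environment variable. The function names are hypothetical, and the exact logic in `dvc/commands/git_hook.py` is not reproduced here.

```python
import os
from typing import Mapping, Optional, Sequence


def checkout_flag(
    hook_args: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> Optional[str]:
    """Return the post-checkout flag: "1" for a branch checkout, "0" for a file checkout."""
    env = os.environ if env is None else env
    # When run through pre-commit, the flag arrives via the environment.
    flag = env.get("PRE_COMMIT_CHECKOUT_TYPE")
    if flag is None and len(hook_args) >= 3:
        # Plain Git hook: arguments are <previous HEAD> <new HEAD> <flag>.
        flag = hook_args[2]
    return flag


def is_branch_checkout(
    hook_args: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> bool:
    """True only when the checkout switched a reference rather than individual files."""
    return checkout_flag(hook_args, env) == "1"
```

A hook built on this would typically do nothing for file checkouts and only act when `is_branch_checkout` returns true.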

Quick Install

# Git is typically pre-installed. If not:
# Ubuntu/Debian
sudo apt-get install git

# macOS
brew install git

# Initialize a DVC project within a Git repo
git init my-project && cd my-project
dvc init

Code Evidence

SCM imports and revision resolution from `dvc/scm.py` (abridged excerpt), built on the `Git` and `NoSCM` backends from scmrepo:

from scmrepo.git import Git
from scmrepo.noscm import NoSCM

def resolve_rev(scm, rev):
    return scm.resolve_rev(rev)

Git auto-staging from `dvc/repo/scm_context.py:96-132`:

def track_changed_files(self):
    if not self.autostage:
        return
    self.scm.track_changed_files()
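
The autostage gate in the excerpt above can be illustrated with a small stand-alone model. This is a hypothetical sketch, not DVC's `scm_context` implementation: `RecordingSCM` and `StagingContext` are invented names, and the choice to return unstaged paths as a hint is an assumption made for the example.

```python
from typing import List, Set


class RecordingSCM:
    """Hypothetical stand-in for a Git backend; it only records staged paths."""

    def __init__(self) -> None:
        self.staged: List[str] = []

    def add(self, paths: List[str]) -> None:
        self.staged.extend(paths)


class StagingContext:
    """Hypothetical sketch: collect changed files and stage them only if autostage is on."""

    def __init__(self, scm: RecordingSCM, autostage: bool = False) -> None:
        self.scm = scm
        self.autostage = autostage
        self.files_to_track: Set[str] = set()

    def track_file(self, path: str) -> None:
        self.files_to_track.add(path)

    def track_changed_files(self) -> List[str]:
        """Stage pending files when autostage is on; otherwise return them as a hint."""
        pending = sorted(self.files_to_track)
        self.files_to_track.clear()
        if not self.autostage:
            return pending  # the caller could print: git add <pending>
        if pending:
            self.scm.add(pending)
        return []


if __name__ == "__main__":
    scm = RecordingSCM()
    ctx = StagingContext(scm, autostage=True)
    ctx.track_file("data.csv.dvc")
    ctx.track_file(".gitignore")
    ctx.track_changed_files()
    print(scm.staged)
```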

Experiment workspace uses Git stashing from `dvc/repo/experiments/__init__.py:89-90`:

# NOTE: tempdir and workspace stash is shared since both
# implementations immediately push -> pop (queue length is only 0 or 1)

Dependency pins used in benchmarks from `dvc/testing/benchmarks/fixtures.py:113-124`; each entry pairs a version bound with the packages to pin, including `dulwich<1.0.0`:

version_constraints = [
    ("<3.50.3", ["pygit2==1.14.1"]),
    ("<3.44.0", ["dulwich<1.0.0"]),
    ("<3.67.0", ["pathspec<1"]),
]
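
A table like the one above could be applied by collecting every pin whose bound matches the version under test. The sketch below assumes that each key is an upper bound on a DVC release version and that only plain `X.Y.Z` versions are compared; how `fixtures.py` actually consumes the table is not shown in the excerpt, so `pins_for` is a hypothetical helper.

```python
import re
from typing import List, Tuple

# Copied from the excerpt above; keys are treated here as upper bounds on a DVC version.
VERSION_CONSTRAINTS: List[Tuple[str, List[str]]] = [
    ("<3.50.3", ["pygit2==1.14.1"]),
    ("<3.44.0", ["dulwich<1.0.0"]),
    ("<3.67.0", ["pathspec<1"]),
]


def as_tuple(text: str) -> Tuple[int, ...]:
    """Turn '3.50.3' into (3, 50, 3); only plain numeric releases are supported."""
    return tuple(int(number) for number in re.findall(r"\d+", text))


def pins_for(dvc_version: str) -> List[str]:
    """Return the extra requirements for every '<bound' the given version falls under."""
    current = as_tuple(dvc_version)
    pins: List[str] = []
    for constraint, requirements in VERSION_CONSTRAINTS:
        if not constraint.startswith("<"):
            raise ValueError(f"unsupported constraint: {constraint!r}")
        if current < as_tuple(constraint[1:]):
            pins.extend(requirements)
    return pins


if __name__ == "__main__":
    for candidate in ("3.43.0", "3.60.0", "3.67.0"):
        print(candidate, pins_for(candidate))
```

Under these assumptions, an older release accumulates several pins while a release at or above every bound gets none.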

Common Errors

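| Error Message | Cause | Solution |
| --- | --- | --- |
| `SCMError: not a git repository` | DVC commands run outside a Git repo | Run `git init` first, then `dvc init` |
| `NoSCMError` | A Git-only operation was run in a project without SCM (for example one created with `dvc init --no-scm`, or with `core.no_scm` enabled) | Initialize Git for the project and run `dvc config core.no_scm false`; if Git itself is missing, install it and ensure it is on the system PATH |
| `dulwich.__version__ < (0, 24, 2)` (version check, not a literal message) | Installed dulwich is older than 0.24.2 | `pip install --upgrade dulwich` |
| `RevError: unknown revision` | Git ref not found | Verify the branch/tag/commit exists |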

Compatibility Notes

  • NoSCM mode: DVC can operate without Git in limited scenarios using the `NoSCM` backend from scmrepo. Experiments and auto-staging are unavailable in this mode.
  • Windows path handling: DVC normalizes paths between POSIX and Windows formats. The `dvc/output.py` file notes a known issue (#2059) with mixed path separators.
  • Pre-commit hooks: DVC provides pre-commit hook definitions in `.pre-commit-hooks.yaml` for automatic DVC operations on Git events.
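
The path normalization mentioned in the Windows note can be approximated with `pathlib`'s pure path classes, which work on any operating system. This is a sketch of the general technique rather than the code in `dvc/output.py`; `to_posix` and `to_windows` are illustrative names.

```python
from pathlib import PureWindowsPath


def to_posix(path: str) -> str:
    """Normalize a Windows-style or mixed-separator path to forward slashes."""
    # PureWindowsPath accepts both "\\" and "/" as separators.
    return PureWindowsPath(path).as_posix()


def to_windows(path: str) -> str:
    """Render a forward-slash path with Windows separators."""
    return str(PureWindowsPath(path))


if __name__ == "__main__":
    print(to_posix("data\\raw\\file.csv"))
    print(to_posix("data/raw\\file.csv"))
    print(to_windows("data/raw/file.csv"))
```

One caveat: a POSIX file name that legitimately contains a backslash would be split by `to_posix`, so this approach suits paths known to originate from Windows or from mixed-separator input.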
