Implementation:Iterative Dvc Dependency Repo

From Leeroopedia


Knowledge Sources
Domains Dependency_Management, Cross_Repository
Last Updated 2026-02-10 10:00 GMT

Overview

RepoDependency is a class defined in dvc/dependency/repo.py (168 lines) that extends the base Dependency class. It represents a dependency on a file or directory in an external DVC repository. It handles filesystem resolution through DVCFileSystem, revision locking, and cross-repo file downloads with hash caching.

from dvc.dependency.repo import RepoDependency

Source File

Property | Value
File | dvc/dependency/repo.py
Lines | 168
Class | RepoDependency
Extends | Dependency (from dvc.dependency.base)

Class: RepoDependency

RepoDependency models a stage dependency that points to a path in another Git/DVC repository. The remote repo is specified via a URL, optional revision, and optional config/remote overrides.

Class Attributes

Attribute | Value | Description
PARAM_REPO | "repo" | Top-level key in serialized representation
PARAM_URL | "url" | Remote repository URL
PARAM_REV | "rev" | Requested Git revision (branch, tag, or commit)
PARAM_REV_LOCK | "rev_lock" | Locked (pinned) Git revision
PARAM_CONFIG | "config" | Optional DVC config override (path or dict)
PARAM_REMOTE | "remote" | Optional remote storage override

The REPO_SCHEMA class variable defines a voluptuous validation schema requiring the url field and accepting optional rev, rev_lock, config, and remote fields.
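
As a rough illustration of what such a schema accepts, the sketch below uses a plain-Python stand-in rather than voluptuous itself. The function name validate_def_repo, the ALLOWED_TYPES table, and the error messages are hypothetical, and the accepted value types are assumptions inferred from the descriptions on this page. The real check is performed by REPO_SCHEMA.

```python
from typing import Any

# Field names come from the class attributes above; the accepted types are
# assumptions of this sketch, inferred from the descriptions on this page.
ALLOWED_TYPES: dict[str, tuple[type, ...]] = {
    "url": (str,),
    "rev": (str,),
    "rev_lock": (str,),
    "config": (str, dict),
    "remote": (str, dict),
}


def validate_def_repo(def_repo: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for the schema: require url, allow the other fields."""
    if "url" not in def_repo:
        raise ValueError("required key not provided: url")
    for key, value in def_repo.items():
        if key not in ALLOWED_TYPES:
            raise ValueError(f"extra key not allowed: {key}")
        if not isinstance(value, ALLOWED_TYPES[key]):
            raise ValueError(f"unexpected type for {key}: {type(value).__name__}")
    return def_repo
```

A definition with only a url passes, while one that omits url is rejected.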

Constructor

def __init__(self, def_repo: dict[str, Any], stage: "Stage", *args, **kwargs)

Stores the repository definition dict, calls the parent constructor, and immediately constructs a DVCFileSystem via _make_fs(). The filesystem path is normalized to a POSIX-style path.

Properties

is_in_repo

Always returns False since the dependency refers to an external repository.

Methods

workspace_status

def workspace_status(self) -> dict

Compares the locked revision (from _make_fs(locked=True)) with the latest revision (from _make_fs(locked=False)). Returns {str(self): "update available"} if they differ, or an empty dict if the dependency is up to date.
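
The comparison can be sketched without DVC as follows. Here resolve_rev is a hypothetical callable standing in for building a filesystem with the given locked flag and reading its revision. The standalone function signature is an assumption made so the example runs on its own.

```python
from typing import Callable, Optional


def workspace_status(
    name: str,
    resolve_rev: Callable[[bool], Optional[str]],
) -> dict[str, str]:
    """Compare the revision resolved with the lock against the one resolved without it."""
    current = resolve_rev(True)   # stands in for the locked revision
    updated = resolve_rev(False)  # stands in for the latest revision
    if current != updated:
        return {name: "update available"}
    return {}
```

With a lock at one commit and the upstream branch at another, the result reports an update. When both resolve to the same commit, the result is empty.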

save

def save(self) -> None

Records the current Git revision into self.def_repo[PARAM_REV_LOCK] if it has not been set yet. This pins the dependency to a specific commit.
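
A minimal sketch of this pin-once behavior, written as a free function over a plain dict. The name save_rev_lock and the current_rev argument are hypothetical; in the class, the revision comes from the dependency's filesystem.

```python
from typing import Any


def save_rev_lock(def_repo: dict[str, Any], current_rev: str) -> None:
    """Pin rev_lock to current_rev only when no lock is recorded yet."""
    if def_repo.get("rev_lock") is None:
        def_repo["rev_lock"] = current_rev
```

Calling it a second time with a different revision leaves the existing lock untouched.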

dumpd

def dumpd(self, **kwargs) -> dict[str, Union[str, dict[str, str]]]

Serializes the dependency to a dictionary containing path and repo keys. The repo sub-dict is built by _dump_def_repo(), which includes only non-empty fields (url, rev, rev_lock, config, remote).
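
The filtering of empty fields can be illustrated with plain dicts. This is a sketch under stated assumptions: the free functions dump_def_repo and dumpd below are hypothetical stand-ins for the methods, and "non-empty" is interpreted as "truthy".

```python
from typing import Any

REPO_FIELDS = ("url", "rev", "rev_lock", "config", "remote")


def dump_def_repo(def_repo: dict[str, Any]) -> dict[str, Any]:
    """Keep only the known fields that hold a non-empty value."""
    return {key: def_repo[key] for key in REPO_FIELDS if def_repo.get(key)}


def dumpd(path: str, def_repo: dict[str, Any]) -> dict[str, Any]:
    """Build the serialized form with its path and repo keys."""
    return {"path": path, "repo": dump_def_repo(def_repo)}
```

Fields that are missing, None, or empty are simply omitted from the repo sub-dict.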

download

def download(self, to: "Output", jobs: Optional[int] = None)

Downloads the dependency content to the target output by:

  1. Calling super().download() to perform the file transfer
  2. If the target filesystem is local, iterating over downloaded files to extract DVC hash info
  3. Saving the hash information to the output cache state, avoiding re-hashing of already known files
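
Steps 2 and 3 can be sketched as a small filter over the transferred files. Everything in this example is illustrative: the (source, destination, info) tuple shape, the "hash" key, and the function name collect_known_hashes are assumptions, and skipping files without a known hash is this sketch's interpretation rather than confirmed behavior.

```python
from typing import Any, Optional

# Hypothetical record for one transferred file: (source path, destination path, info).
Transfer = tuple[str, str, Optional[dict[str, Any]]]


def collect_known_hashes(
    transfers: list[Transfer], target_is_local: bool
) -> list[tuple[str, str]]:
    """Return (destination path, hash) pairs for files whose hash is already known."""
    if not target_is_local:
        return []  # step 2 only applies to a local target
    known = []
    for _src, dest, info in transfers:
        file_hash = (info or {}).get("hash")
        if file_hash:  # files without a known hash are skipped in this sketch
            known.append((dest, file_hash))
    return known
```

The resulting pairs are what a caller would hand to the cache state in step 3, so that those files need not be hashed again.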

update

def update(self, rev: Optional[str] = None) -> None

Updates the dependency to track a new revision. If rev is given, it replaces the requested revision (rev) in the repository definition. The filesystem is then re-created with locked=False, and the revision it resolves to is recorded as the new rev_lock.
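
The effect on the repository definition can be sketched with a plain dict. Here resolve_head is a hypothetical callable that maps a revision name (or None for the default branch) to a commit hash, standing in for re-creating the filesystem. The free-function form is an assumption for illustration.

```python
from typing import Any, Callable, Optional


def update(
    def_repo: dict[str, Any],
    resolve_head: Callable[[Optional[str]], str],
    rev: Optional[str] = None,
) -> None:
    """Optionally switch the requested revision, then re-pin rev_lock."""
    if rev:
        def_repo["rev"] = rev
    # Resolve without the existing lock so the newest matching commit is found.
    target = rev or def_repo.get("rev")
    def_repo["rev_lock"] = resolve_head(target)
```

Calling it without rev refreshes the lock for the revision already requested. Passing rev changes both the requested revision and the lock.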

changed_checksum

def changed_checksum(self) -> bool

Always returns False. From the current repo's perspective, a RepoDependency is described by its URL and rev_lock, making it effectively immutable once locked.

_make_fs (private)

def _make_fs(self, rev: Optional[str] = None, locked: bool = True) -> "DVCFileSystem"

Constructs a DVCFileSystem instance for the remote repository. Key logic includes:

  • Handling remote as either a string name or a dict of config values
  • Loading config from a file path or using a dict directly
  • Injecting the local repo's cache configuration into the remote filesystem config to avoid re-streaming data
  • Passing subrepos=True to support monorepo structures
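
The first three points can be sketched with plain data structures. The helper names split_remote and build_config, the load_file callable, and the shape of the cache settings are all hypothetical. The sketch does not construct a DVCFileSystem.

```python
from typing import Any, Callable, Optional, Union


def split_remote(
    remote: Union[str, dict, None],
) -> tuple[Optional[str], Optional[dict]]:
    """Return (remote name, remote config); a dict is treated as inline config."""
    if isinstance(remote, dict):
        return None, remote
    return remote, None


def build_config(
    conf: Union[str, dict, None],
    local_cache: dict[str, Any],
    load_file: Callable[[str], dict[str, Any]],
) -> dict[str, Any]:
    """Use a dict as-is or load it from a path, then inject the local cache settings."""
    if conf is None:
        conf = {}
    config = dict(conf) if isinstance(conf, dict) else load_file(conf)
    config["cache"] = dict(local_cache)  # local cache overrides the remote repo's cache
    return config
```

In this sketch the cache section is always overwritten, which reflects the stated goal of reusing the local cache instead of streaming the data again.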

_get_rev (private)

def _get_rev(self, locked: bool = True) -> Optional[str]

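When locked=True, returns the locked revision (rev_lock), falling back to the requested revision (rev) if no lock has been recorded. When locked=False, the lock is ignored and only the requested revision is returned. The result is None if neither applies.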
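
The selection rule fits in a single expression. The sketch below is a free function over a plain dict rather than the method itself; the name get_rev is a hypothetical stand-in.

```python
from typing import Any, Optional


def get_rev(def_repo: dict[str, Any], locked: bool = True) -> Optional[str]:
    """Prefer rev_lock when locked, otherwise fall back to the requested rev."""
    return (def_repo.get("rev_lock") if locked else None) or def_repo.get("rev")
```

For a definition with rev "main" and rev_lock "abc123", the locked lookup yields "abc123" and the unlocked lookup yields "main".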

Key Dependencies

Module | Usage
dvc.dependency.base.Dependency | Base dependency class
dvc.fs.DVCFileSystem | Filesystem abstraction for remote DVC repos
dvc.config.Config | Loading config files for remote repos
dvc.fs.LocalFileSystem | Checking if download target is local
voluptuous | Schema validation for repo definition
dvc.utils.as_posix | Path normalization
