Implementation:Iterative Dvc SCMContext Track Changed Files
| Knowledge Sources | |
|---|---|
| Domains | Version_Control, SCM_Integration |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Concrete tool for automatically staging DVC-generated metadata files in Git after data tracking operations, provided by the DVC library.
Description
The SCMContext class in DVC's dvc/repo/scm_context.py module manages the coordination between DVC operations and the Git staging area. Its __call__ method serves as a context manager that wraps any DVC operation, collecting all files that are created or modified during the operation and then either auto-staging them in Git or displaying the appropriate git add command for the user.
During a DVC operation, various subsystems register files with the context by calling track_file(paths). For example, when SingleStageFile.dump writes a .dvc file, it calls repo.scm_context.track_file(self.relpath). When Output.ignore adds an entry to .gitignore, the resulting .gitignore file path is also registered. These paths accumulate in the files_to_track set.
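The accumulation step can be illustrated without DVC installed. The sketch below is a minimal stand-in, not DVC's code: `FileTracker` is a hypothetical class that mirrors only the `track_file` signature and the `files_to_track` set described above, and the handling of a single string versus an iterable is an assumption based on that signature.

```python
from collections.abc import Iterable
from typing import Optional, Union


class FileTracker:
    """Hypothetical stand-in for the path-collecting part of SCMContext."""

    def __init__(self) -> None:
        # A set, so registering the same path twice has no extra effect.
        self.files_to_track: set[str] = set()

    def track_file(
        self, paths: Union[str, Iterable[str], None] = None
    ) -> None:
        # Accept nothing, a single path, or several paths.
        if paths is None:
            return
        if isinstance(paths, str):
            paths = [paths]
        self.files_to_track.update(paths)


tracker = FileTracker()
tracker.track_file("data.csv.dvc")                  # e.g. after writing a .dvc file
tracker.track_file([".gitignore", "data.csv.dvc"])  # the duplicate collapses
tracker.track_file(None)                            # no-op
print(sorted(tracker.files_to_track))  # ['.gitignore', 'data.csv.dvc']
```

Using a set means subsystems can register paths independently without coordinating on duplicates.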
When the context manager exits normally, the __call__ method checks whether auto-staging is enabled, either through the core.autostage configuration value or through the explicit autostage parameter. If auto-staging is enabled, it calls track_changed_files(), which delegates to self.add(self.files_to_track) to run the equivalent of git add on all collected files. If auto-staging is disabled, it logs a message showing the exact git add command needed to stage the files manually.
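The exit-time decision can be sketched as a small, self-contained model. `StagingSketch` is hypothetical: its `staged` list stands in for the Git index and its `messages` list stands in for log output, so nothing here calls Git or DVC. Resetting `files_to_track` at the end and quoting paths with `shlex.quote` are assumptions of this sketch.

```python
import shlex
from contextlib import contextmanager
from typing import Iterator, Optional


class StagingSketch:
    """Hypothetical model of the stage-or-print decision on normal exit."""

    def __init__(self, autostage: bool = False) -> None:
        self.autostage = autostage
        self.files_to_track: set[str] = set()
        self.staged: list[str] = []    # stands in for the Git index
        self.messages: list[str] = []  # stands in for log output

    def track_file(self, path: str) -> None:
        self.files_to_track.add(path)

    def track_changed_files(self) -> None:
        # The real method delegates to git add; here we only record the paths.
        self.staged.extend(sorted(self.files_to_track))

    @contextmanager
    def __call__(
        self, autostage: Optional[bool] = None
    ) -> Iterator["StagingSketch"]:
        yield self
        # Code after the yield runs only when the with-body did not raise.
        if not self.files_to_track:
            return
        if autostage is None:
            autostage = self.autostage  # fall back to the configured value
        if autostage:
            self.track_changed_files()
        else:
            paths = " ".join(shlex.quote(p) for p in sorted(self.files_to_track))
            self.messages.append("git add " + paths)
        self.files_to_track = set()


ctx = StagingSketch()  # configured default: autostage off
with ctx(autostage=True) as c:  # explicit parameter overrides the default
    c.track_file("data.csv.dvc")
    c.track_file(".gitignore")
print(ctx.staged)  # ['.gitignore', 'data.csv.dvc']

with ctx() as c:  # no override, so the command is shown instead
    c.track_file("data.csv.dvc")
print(ctx.messages)  # ['git add data.csv.dvc']
```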
If the operation fails (raises an exception), the context manager rolls back any .gitignore entries that were added during the operation by calling ignore_remove for each path in ignored_paths. This ensures that a failed operation does not leave orphaned .gitignore entries that would prevent Git from tracking files that DVC is no longer managing.
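The rollback path can be modelled the same way. In this sketch, `IgnoreRollbackSketch` is a hypothetical class whose `gitignore` list stands in for the contents of a .gitignore file; only the `ignored_paths` and `ignore_remove` names come from the description above, and the try/except/finally layout is one plausible way to get the described behavior.

```python
from contextlib import contextmanager
from typing import Iterator


class IgnoreRollbackSketch:
    """Hypothetical model of undoing .gitignore entries after a failure."""

    def __init__(self) -> None:
        self.gitignore: list[str] = []      # stands in for the .gitignore file
        self.ignored_paths: list[str] = []  # entries added by the current operation

    def ignore(self, path: str) -> None:
        self.gitignore.append(path)
        self.ignored_paths.append(path)

    def ignore_remove(self, path: str) -> None:
        self.gitignore.remove(path)

    @contextmanager
    def __call__(self) -> Iterator["IgnoreRollbackSketch"]:
        try:
            yield self
        except Exception:
            # Undo only what this operation added, then re-raise.
            for path in self.ignored_paths:
                self.ignore_remove(path)
            raise
        finally:
            # Start the next operation with a clean slate either way.
            self.ignored_paths = []


ctx = IgnoreRollbackSketch()

with ctx() as c:  # succeeds: the entry is kept
    c.ignore("/kept.csv")

try:
    with ctx() as c:  # fails: the entry is rolled back
        c.ignore("/data.csv")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

print(ctx.gitignore)  # ['/kept.csv']
```

Entries from earlier successful operations survive because `ignored_paths` is reset after every operation, so a failure only removes what the failing operation itself added.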
The companion scm_context decorator function provides a convenient way to wrap repository methods. It wraps any function so that it executes within a repo.scm_context() context manager, automatically handling the file tracking and staging lifecycle.
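A decorator of this shape can be sketched in a few lines. Everything below is illustrative: `scm_context_sketch`, `FakeContext`, and `FakeRepo` are hypothetical names, and the fake context only records events so the call order is visible without DVC or Git.

```python
from contextlib import contextmanager
from functools import wraps


class FakeContext:
    """Hypothetical context that records when it is entered and exited."""

    def __init__(self):
        self.events = []

    @contextmanager
    def __call__(self, autostage=None, quiet=None):
        self.events.append(("enter", autostage, quiet))
        yield self
        self.events.append(("exit",))


class FakeRepo:
    """Hypothetical repo object exposing a scm_context attribute."""

    def __init__(self):
        self.scm_context = FakeContext()


def scm_context_sketch(method, autostage=None, quiet=None):
    """Run `method` inside repo.scm_context(); the repo is the first argument."""

    @wraps(method)
    def run(repo, *args, **kwargs):
        with repo.scm_context(autostage=autostage, quiet=quiet):
            return method(repo, *args, **kwargs)

    return run


@scm_context_sketch
def add_target(repo, target):
    repo.scm_context.events.append(("run", target))
    return target + ".dvc"


repo = FakeRepo()
result = add_target(repo, "data.csv")
print(result)                   # data.csv.dvc
print(repo.scm_context.events)  # [('enter', None, None), ('run', 'data.csv'), ('exit',)]
```

Because the wrapper takes the repository as its first positional argument, the same decorator works for plain functions and for methods whose `self` is the repository.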
Usage
Use SCMContext when building custom DVC operations that create or modify files that should be tracked by Git. Use the scm_context decorator to wrap repository-level functions (like add, remove, move) that produce Git-relevant side effects. The context is already applied to all standard DVC commands; direct usage is only needed for custom tooling or extensions.
Code Reference
Source Location
- Repository: DVC
- File: dvc/repo/scm_context.py
- Lines: L96-132 (__call__), L58-63 (track_changed_files)
Signature
class SCMContext:
    def __init__(
        self,
        scm: "Base",
        config: Optional[dict[str, Any]] = None,
    ) -> None:
        ...

    def track_file(
        self,
        paths: Union[str, Iterable[str], None] = None,
    ) -> None:
        """Register files to be tracked/staged after the operation."""
        ...

    def track_changed_files(self) -> None:
        """Stage all registered files via git add."""
        ...

    @contextmanager
    def __call__(
        self,
        autostage: Optional[bool] = None,
        quiet: Optional[bool] = None,
    ) -> Iterator["SCMContext"]:
        """Context manager that collects file changes and stages them on exit.

        On success: auto-stages files or displays git add command.
        On failure: rolls back .gitignore modifications.
        """
        ...


def scm_context(
    method,
    autostage: Optional[bool] = None,
    quiet: Optional[bool] = None,
):
    """Decorator that wraps a repo method in an SCMContext context manager."""
    ...
Import
from dvc.repo.scm_context import SCMContext, scm_context
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| scm | Base | Yes | The SCM (source code management) backend instance, typically a Git object from scmrepo. Provides ignore, ignore_remove, and is_ignored methods. |
| config | Optional[dict[str, Any]] | No | Repository configuration dictionary. The core.autostage key controls whether files are automatically staged. Defaults to None (autostage disabled). |
| autostage | Optional[bool] | No | Override for the autostage setting when entering the context manager. If None, falls back to the value from config. Defaults to None. |
| quiet | Optional[bool] | No | When True, suppresses the "To track the changes with git, run:" message when autostage is disabled. Defaults to None (uses instance default of False). |
| files_to_track | set[str] | N/A | Internal accumulator. Files are registered by calling track_file() during the operation. Not passed as a parameter. |
Outputs
| Name | Type | Description |
|---|---|---|
| (__call__ yields) | SCMContext | The context manager yields the SCMContext instance itself, allowing the wrapped code to register files via track_file(). |
| (track_changed_files return) | None | No return value. Side effect: all paths in files_to_track are staged in Git via git add. The files_to_track set is cleared after staging. |
| (user message) | str (logged) | When autostage is disabled and files need tracking, a log message is emitted showing the exact git add command. For example: To track the changes with git, run: git add data.csv.dvc .gitignore. |
Usage Examples
Basic Usage
from dvc.repo import Repo
from dvc.repo.scm_context import scm_context

repo = Repo()

# Using the context manager directly
with repo.scm_context(autostage=True) as ctx:
    # Perform operations that create/modify git-tracked files
    # ... write a .dvc file ...
    ctx.track_file("data.csv.dvc")
    ctx.track_file(".gitignore")
# On exit: data.csv.dvc and .gitignore are automatically git-added


# Using the scm_context decorator
@scm_context
def my_custom_operation(repo, target):
    """Custom operation that produces git-trackable files."""
    # ... create or modify .dvc files ...
    repo.scm_context.track_file("output.dvc")
# On function exit: files are auto-staged, or the git add command is logged

# With autostage disabled (default), the user sees:
#   To track the changes with git, run:
#       git add data.csv.dvc .gitignore
#   To enable auto staging, run:
#       dvc config core.autostage true