Implementation:Iterative Dvc Utils Strictyaml
| Knowledge Sources | |
|---|---|
| Domains | YAML_Processing, Validation |
| Last Updated | 2026-02-10 10:00 GMT |
Overview
Concrete tool for loading and validating YAML files with rich error formatting and syntax highlighting. This module combines YAML parsing and schema validation into a single entrypoint, providing user-friendly error messages with code snippets, line/column context, and colored output using Rich. It is used for parsing dvc.yaml, dvc.lock, and .dvc files.
Note: Not to be confused with the strictyaml Python library, which has similar motivations but is a separate project.
Source: dvc/utils/strictyaml.py (308 lines)
Signature
merge_conflict_marker = re.compile("^([<=>]{7}) .*$", re.MULTILINE)
def make_relpath(fs_path: str, fs: Optional["FileSystem"] = None) -> str: ...
class YAMLSyntaxError(PrettyDvcException, YAMLFileCorruptedError):
    def __init__(self, path: str, yaml_text: str, exc: Exception, rev: Optional[str] = None) -> None: ...
    def __pretty_exc__(self, **kwargs: Any) -> None: ...
def determine_linecol(data, paths, max_steps=5) -> tuple[Optional[int], Optional[int], int]: ...
class YAMLValidationError(PrettyDvcException):
    def __init__(self, exc: "MultipleInvalid", path: Optional[str] = None, text: Optional[str] = None, rev: Optional[str] = None) -> None: ...
    def __pretty_exc__(self, **kwargs: Any) -> None: ...
def validate(data: _T, schema: Callable[[_T], _T], text: Optional[str] = None, path: Optional[str] = None, rev: Optional[str] = None) -> _T: ...
def load(path: str, schema: Optional[Callable[[_T], _T]] = None, fs: Optional["FileSystem"] = None, encoding: str = "utf-8", round_trip: bool = False) -> Any: ...
Import
from dvc.utils.strictyaml import load, validate
Description
Core Functions
| Function | Description |
|---|---|
| load(path, schema, fs, encoding, round_trip) | Loads a YAML file and optionally validates it against a schema. Opens the file using the provided filesystem (or the built-in open), parses it with ruamel.yaml (safe or round-trip mode), and validates it with voluptuous if a schema is provided. Returns a tuple (parsed_data, raw_text), i.e. tuple[Any, str], although the signature annotates the return type as Any. |
| validate(data, schema, text, path, rev) | Validates data against a voluptuous schema. On failure, raises YAMLValidationError with context about the file path and optional Git revision. |
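The validation contract described above (call the schema, pass its result through, wrap failures with file context) can be illustrated with a minimal sketch. It uses only the standard library: `SketchValidationError`, `validate_sketch`, and `stages_schema` are hypothetical names, `ValueError` stands in for voluptuous's `MultipleInvalid`, and the 7-character revision truncation is borrowed from the YAMLSyntaxError description as an assumption. It is not the module's actual code.

```python
from typing import Any, Callable, Optional


class SketchValidationError(Exception):
    """Hypothetical stand-in for YAMLValidationError."""

    def __init__(self, cause: Exception, path: Optional[str], rev: Optional[str]) -> None:
        self.path = path
        self.rev = rev
        location = path or "<unknown>"
        if rev:
            # Assumption: show a short revision, as described for syntax errors.
            location = f"{location} @ {rev[:7]}"
        super().__init__(f"validation failed for {location}: {cause}")


def validate_sketch(
    data: Any,
    schema: Callable[[Any], Any],
    path: Optional[str] = None,
    rev: Optional[str] = None,
) -> Any:
    """Return schema(data); wrap schema failures with path/revision context."""
    try:
        return schema(data)
    except ValueError as exc:  # stand-in for voluptuous.MultipleInvalid
        raise SketchValidationError(exc, path, rev) from exc


def stages_schema(data: dict) -> dict:
    """Hypothetical schema: requires a mapping under the 'stages' key."""
    if not isinstance(data.get("stages"), dict):
        raise ValueError("expected a mapping under 'stages'")
    return data


# Valid data is passed through unchanged.
ok = validate_sketch({"stages": {}}, stages_schema, path="dvc.yaml")

# Invalid data raises an error that carries the path and short revision.
try:
    validate_sketch({"stages": []}, stages_schema, path="dvc.yaml", rev="0123456789abcdef")
except SketchValidationError as exc:
    error_message = str(exc)
```

Treating the schema as a plain callable keeps the entry point independent of any particular validation library, which is why the sketch can swap in a hand-written function.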
Error Classes
YAMLSyntaxError
Inherits from both PrettyDvcException and YAMLFileCorruptedError. Raised when YAML parsing fails (e.g., invalid syntax, merge conflicts). The __pretty_exc__() method produces rich-formatted error output:
- Detects merge conflict markers (<<<<<<<, =======, >>>>>>>) and appends a hint
- Extracts context and problem marks from ruamel.yaml.MarkedYAMLError
- Renders code snippets with syntax highlighting and line numbers using rich.syntax.Syntax
- Displays line/column location information
- Includes Git revision context (truncated to 7 characters) when available
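As a rough illustration of the merge conflict check, the sketch below applies the `merge_conflict_marker` pattern from the Signature section to a sample file. The sample text and the `has_merge_conflict` helper are hypothetical. Note that, as written, the pattern requires a space after the seven marker characters, so a bare `=======` separator line is not matched by it; the opening and closing markers, which Git follows with a label, are.

```python
import re

# Pattern copied from the Signature section.
merge_conflict_marker = re.compile("^([<=>]{7}) .*$", re.MULTILINE)

# Hypothetical dvc.yaml content left behind by an unresolved merge.
conflicted_text = """\
stages:
<<<<<<< HEAD
  train:
    cmd: python train.py
=======
  train:
    cmd: python train.py --epochs 10
>>>>>>> feature-branch
"""


def has_merge_conflict(text: str) -> bool:
    """Hypothetical helper: True if any line looks like a labelled conflict marker."""
    return merge_conflict_marker.search(text) is not None


# Collect the marker strings the pattern finds, in order of appearance.
markers = [match.group(1) for match in merge_conflict_marker.finditer(conflicted_text)]
```

Finding a single marker is enough to justify the hint, so a `search` is sufficient in practice; `finditer` is used here only to show which lines match.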
YAMLValidationError
Inherits from PrettyDvcException. Raised when schema validation fails. Supports multiple errors from voluptuous.MultipleInvalid. The __pretty_exc__() method:
- Iterates over all validation errors
- Uses determine_linecol() to locate the error position in the YAML source
- Renders code snippets with a context window that grows with the number of upward steps taken to find line/column info
- Falls back to printing the raw voluptuous error message when no source context is available
Helper Functions
| Function | Description |
|---|---|
| make_relpath(fs_path, fs) | Converts an absolute filesystem path to a relative path prefixed with ./ (or ../ if it lies above the current directory). Handles both local and remote filesystems. On Windows, a path on a different drive than the current directory cannot be made relative, so the absolute path is returned as-is. |
| determine_linecol(data, paths, max_steps=5) | Walks upward through a CommentedMap (from ruamel.yaml) hierarchy to find line/column information for a given path. Returns (line, col, steps_taken). The step count is used to expand the code context window when the exact error location cannot be pinpointed. |
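The upward walk performed by determine_linecol() can be sketched without third-party packages. In the simplified version below, `PositionedDict` is a hypothetical stand-in for ruamel.yaml's CommentedMap (a dict carrying an `lc` mark with `line` and `col`), `lookup` replaces the dpath lookup, and `find_linecol` is an illustrative name. The sketch assumes marks are zero-based and converts them to one-based positions; the real function may differ in detail.

```python
from contextlib import suppress
from types import SimpleNamespace
from typing import Any, List, Optional, Tuple


class PositionedDict(dict):
    """Hypothetical stand-in for CommentedMap: a dict with an `lc` position mark."""

    def __init__(self, items: dict, line: int, col: int) -> None:
        super().__init__(items)
        self.lc = SimpleNamespace(line=line, col=col)


def lookup(data: Any, path: List[Any]) -> Any:
    """Follow the keys in `path`; return None if any of them is missing."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return None
    return current


def find_linecol(
    data: Any, path: List[Any], max_steps: int = 5
) -> Tuple[Optional[int], Optional[int], int]:
    """Walk from the full path toward the root until a node with a position is found."""
    path = list(path)
    step = 1
    line = col = None
    while path and step < max_steps:
        value = lookup(data, path)
        if value is not None:
            # Scalars have no `lc`; the AttributeError is swallowed and we move up.
            with suppress(AttributeError):
                line = value.lc.line + 1  # assumed zero-based mark
                col = value.lc.col + 1
                break
        step += 1
        path.pop()  # drop the last key and retry one level higher
    return line, col, step


# Hypothetical parsed document: only mappings carry position marks.
train = PositionedDict({"cmd": "python train.py"}, line=2, col=4)
stages = PositionedDict({"train": train}, line=1, col=2)
document = PositionedDict({"stages": stages}, line=0, col=0)

# The scalar under "cmd" has no mark, so the walk takes one step up to "train".
result = find_linecol(document, ["stages", "train", "cmd"])
```

Returning the step count alongside the position lets the caller widen the displayed snippet when the position belongs to an ancestor rather than the offending key itself.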
Private Formatting Functions
def _prepare_message(message: str) -> "RichText":
    """Wrap message in red-styled Rich text."""
def _prepare_cause(cause: str) -> "RichText":
    """Wrap cause string in bold Rich text."""
def _prepare_code_snippets(code: str, start_line: int = 1, **kwargs) -> "Syntax":
    """Create a Rich Syntax object with YAML highlighting, line numbers, and indent guides."""
I/O
- load(): Reads a YAML file from disk or a virtual filesystem; returns tuple[Any, str] (parsed data, raw text)
- validate(): Accepts parsed data and a schema callable; returns validated data (passthrough from schema)
- Error output: Both error classes write to stderr via ui.error_write() with styled Rich text
Error Rendering Flow
The error rendering pipeline for both error classes follows this pattern:
- Build a list of Rich renderable objects (text, code snippets)
- Insert the main error message at position 0
- Iterate through the list, writing each item to stderr via ui.error_write(line, styled=True)
The code snippet rendering uses rich.syntax.Syntax with the "ansi_dark" theme, word wrapping, line numbers, and indent guides enabled.
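The three-step flow above can be sketched with plain strings in place of Rich renderables. Here `render_error` and `write_to_stderr` are hypothetical names, the `write` callable stands in for `ui.error_write(line, styled=True)`, and the sample messages are invented for illustration.

```python
import sys
from typing import Callable, Iterable, List


def render_error(message: str, details: Iterable[str], write: Callable[[str], None]) -> None:
    """Emit the main message first, then each detail item, through `write`."""
    lines: List[str] = list(details)  # step 1: the renderables built so far
    lines.insert(0, message)          # step 2: main message goes to position 0
    for line in lines:                # step 3: write each item in order
        write(line)


def write_to_stderr(line: str) -> None:
    """Stand-in for ui.error_write(): print one item to stderr."""
    print(line, file=sys.stderr)


# Capture the output in a list instead of stderr to inspect the ordering.
captured: List[str] = []
render_error(
    "'dvc.yaml' validation failed.",
    ["stages: expected a dict", "  3 | stages: []"],
    captured.append,
)
```

Passing the writer as a parameter is a choice made for this sketch so the ordering can be inspected; calling `render_error(..., write_to_stderr)` would send the same lines to stderr.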
Dependencies
| Dependency | Usage |
|---|---|
| ruamel.yaml | YAML parsing (safe and round-trip modes); MarkedYAMLError and StreamMark for error location |
| voluptuous | Schema validation; MultipleInvalid exception type |
| rich.syntax.Syntax | Syntax-highlighted code snippet rendering |
| dpath | Navigating nested data structures by path in determine_linecol() |
| dvc.exceptions.PrettyDvcException | Base class for pretty-printable DVC exceptions |
| dvc.ui.ui | Global console instance for error output |
| dvc.utils.serialize | EncodingError, YAMLFileCorruptedError, parse_yaml, parse_yaml_for_update |
| dvc_objects.fs.local.LocalFileSystem | Type check for local vs. remote filesystem in make_relpath() |