Implementation:Iterative Dvc VegaConverter Convert
| Knowledge Sources | |
|---|---|
| Domains | Visualization, Data_Transformation |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
A concrete tool, provided by the DVC library, that converts parsed data records into Vega-Lite-compatible datapoint lists with standardized metadata fields.
Description
The VegaConverter class transforms raw parsed data (lists of dictionaries from CSV, JSON, or YAML files) into the flat datapoint format required by Vega-Lite chart templates. Its convert method infers the axis fields and builds a file-to-datapoints mapping. flat_datapoints builds on that result: it annotates every datapoint with rev (revision), filename (source file), and field (original field name) metadata, and unifies y fields that come from multiple sources.
The class inherits from the Converter base class (at dvc/render/converter/__init__.py) and is specifically designed for Vega-type renderers. It works together with ImageConverter (at dvc/render/converter/image.py), which handles image plot types by converting raw bytes to base64-encoded data URIs or writing them to output files.
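The data-URI path mentioned above can be sketched with the standard library alone. This is a minimal illustration, not DVC's code: encode_image is a hypothetical helper name, and the "data:image;base64," prefix is taken from the Image Conversion example on this page.

```python
import base64


def encode_image(image_data: bytes) -> str:
    """Hypothetical helper: wrap raw image bytes in a base64 data URI."""
    encoded = base64.b64encode(image_data).decode("ascii")
    # Prefix mirrors the "src" value shown in the Image Conversion example.
    return f"data:image;base64,{encoded}"


# The first six bytes of the PNG signature stand in for a real image here.
src = encode_image(b"\x89PNG\r\n")
print(src)  # data:image;base64,iVBORw0K
```

Embedding the bytes this way keeps a rendered report self-contained, at the cost of roughly a one-third size increase from base64 encoding.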
The converter uses constants from dvc.render: FIELD ("field"), FILENAME ("filename"), INDEX ("step"), and REVISION ("rev") to maintain a consistent data schema across all chart types.
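To make the shared schema concrete, here is a minimal sketch of how a datapoint might be stamped with these fields. The constant values are the ones listed above; the annotate function, its parameters, and the zero-based index are assumptions made for illustration and are not part of the DVC API.

```python
# Constant values as listed above; everything else is illustrative.
FIELD = "field"
FILENAME = "filename"
INDEX = "step"
REVISION = "rev"


def annotate(datapoints, revision, filename, y_field, use_index=False):
    """Hypothetical helper: copy each datapoint and add schema metadata."""
    annotated = []
    for index, point in enumerate(datapoints):
        row = dict(point)  # copy so the caller's records stay untouched
        if use_index:
            # Assumption: the auto-index counts rows from zero.
            row[INDEX] = index
        row.update({REVISION: revision, FILENAME: filename, FIELD: y_field})
        annotated.append(row)
    return annotated


rows = annotate(
    [{"loss": "0.9"}, {"loss": "0.7"}],
    revision="workspace",
    filename="train.csv",
    y_field="loss",
    use_index=True,
)
print(rows[0])
```

Because every datapoint carries the same metadata keys, a chart template can group or color by revision, file, or field without knowing where the data came from.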
Usage
Use VegaConverter when you have parsed plot data and need to transform it into Vega-Lite datapoints for chart rendering. It is typically instantiated by _get_converter in dvc/render/convert.py, which selects the converter based on the renderer type, and it is called from match_defs_renderers in the rendering pipeline. Use ImageConverter for image-type plots, which produce base64 data URIs instead of tabular datapoints.
Code Reference
Source Location
- Repository: DVC
- File: dvc/render/converter/vega.py
- Lines: L292-306 (VegaConverter.convert), L192-290 (VegaConverter.flat_datapoints)
- File: dvc/render/converter/image.py
- Lines: L35-39 (ImageConverter.convert), L41-60 (ImageConverter.flat_datapoints)
Signature
class VegaConverter(Converter):
    def __init__(
        self,
        plot_id: str,
        data: Optional[dict] = None,
        properties: Optional[dict] = None,
    ):
        ...
    def convert(self) -> tuple[dict, dict]:
        """
        Convert the data. Fill necessary fields ('x', 'y') and return both
        generated datapoints and updated properties. x, y values and labels
        are inferred and always provided.
        """
        ...
    def flat_datapoints(
        self, revision: str,
    ) -> tuple[list[dict], dict]:
        ...
class ImageConverter(Converter):
    def convert(self) -> tuple[list[tuple[str, str, Any]], dict]:
        ...
    def flat_datapoints(
        self, revision: str,
    ) -> tuple[list[dict], dict]:
        ...
Import
from dvc.render.converter.vega import VegaConverter
from dvc.render.converter.image import ImageConverter
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| plot_id | str | Yes | The identifier for this plot, typically a file path (e.g., "plots/loss.csv") or a logical name from the pipeline definition. |
| data | Optional[dict] | No | Dictionary mapping filenames to their parsed content (list of dicts for tabular, raw bytes for images). Defaults to empty dict. |
| properties | Optional[dict] | No | Display properties including "x", "y", "template", "x_label", "y_label", "title". Defaults to empty dict. |
| revision | str | Yes | (For flat_datapoints) The revision identifier to stamp onto each datapoint's REVISION field. |
Outputs
| Name | Type | Description |
|---|---|---|
| file2datapoints | dict | (From VegaConverter.convert) Dictionary mapping filenames to lists of datapoint dictionaries, with inferred x/y fields populated. |
| properties | dict | (From VegaConverter.convert) Updated properties dictionary with inferred "x", "y", "x_label", and "y_label" values filled in. |
| datapoints | list[dict] | (From VegaConverter.flat_datapoints) Flat list of datapoint dictionaries. Each dict contains the original data fields plus "rev", "filename", and "field" metadata, and a "step" index when no x field is specified. When the y fields differ across sources, a "dvc_inferred_y_value" field unifies them. |
| properties | dict | (From VegaConverter.flat_datapoints) Final properties with "x", "y", and "anchors_y_definitions" added for template consumption. |
| datapoints | list[dict] | (From ImageConverter.flat_datapoints) List of dicts with "rev", "filename", and "src" (base64 data URI or file path) keys. |
| properties | dict | (From ImageConverter.flat_datapoints) Pass-through properties dictionary. |
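
The "dvc_inferred_y_value" behavior described in the table can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: unify_y and its arguments are hypothetical names, and only the unified field name comes from this page.

```python
UNIFIED_Y = "dvc_inferred_y_value"  # field name taken from the table above


def unify_y(file2datapoints, y_by_file):
    """Hypothetical helper: flatten per-file datapoints under one y name.

    y_by_file maps each filename to the y field it contributes.
    """
    y_fields = set(y_by_file.values())
    # A shared name is only needed when the sources disagree on the y field.
    y_name = UNIFIED_Y if len(y_fields) > 1 else next(iter(y_fields), None)
    flat = []
    for filename, y_field in y_by_file.items():
        for point in file2datapoints.get(filename, []):
            row = dict(point)
            if y_name == UNIFIED_Y:
                row[UNIFIED_Y] = row[y_field]  # copy the original value
            flat.append(row)
    return flat, y_name


flat, y_name = unify_y(
    {
        "train.csv": [{"step": "1", "loss": "0.9"}],
        "eval.csv": [{"step": "1", "score": "0.6"}],
    },
    {"train.csv": "loss", "eval.csv": "score"},
)
print(y_name)  # dvc_inferred_y_value
```

Copying the value instead of renaming the field keeps the original column available, so each datapoint still carries its source value under its own name.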
Usage Examples
Basic Usage
from dvc.render.converter.vega import VegaConverter
# Simulate parsed CSV data for a single file
data = {
    "metrics/loss.csv": [
        {"epoch": "1", "train_loss": "0.95", "val_loss": "1.10"},
        {"epoch": "2", "train_loss": "0.80", "val_loss": "0.92"},
        {"epoch": "3", "train_loss": "0.65", "val_loss": "0.78"},
    ]
}
# Create converter with explicit x and y
converter = VegaConverter(
    plot_id="metrics/loss.csv",
    data=data,
    properties={"x": "epoch", "y": "train_loss"},
)
# Phase 1: convert() infers properties and maps files to datapoints
file2datapoints, props = converter.convert()
# file2datapoints = {"metrics/loss.csv": [{"epoch": "1", ...}, ...]}
# props = {"x": {"metrics/loss.csv": "epoch"}, "y": {"metrics/loss.csv": "train_loss"},
# "x_label": "epoch", "y_label": "train_loss"}
# Phase 2: flat_datapoints() adds revision metadata
datapoints, final_props = converter.flat_datapoints("workspace")
# datapoints = [
#     {"epoch": "1", "train_loss": "0.95", "val_loss": "1.10",
#      "rev": "workspace", "filename": "metrics/loss.csv", "field": "train_loss"},
#     ...
# ]
Auto-Inference of Y Field
from dvc.render.converter.vega import VegaConverter
data = {
    "metrics.json": [
        {"step": 1, "accuracy": 0.75},
        {"step": 2, "accuracy": 0.82},
    ]
}
# No y specified - converter infers last field ("accuracy")
converter = VegaConverter(
    plot_id="metrics.json",
    data=data,
    properties={},
)
datapoints, props = converter.flat_datapoints("v1.0")
# props["y"] == "accuracy" (inferred)
# props["x"] == "step" (auto-index since no x specified)
# Each datapoint has rev="v1.0"
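The "last field" rule used in the example above can be sketched in isolation. This is a minimal illustration that assumes the rule looks only at the keys of the first record; infer_y_field is a hypothetical name, not a DVC function.

```python
from typing import Optional


def infer_y_field(records: list) -> Optional[str]:
    """Hypothetical helper: pick the last key of the first record as y."""
    if not records or not all(isinstance(r, dict) for r in records):
        return None  # nothing tabular to infer from
    keys = list(records[0])  # dicts preserve insertion order
    return keys[-1] if keys else None


records = [{"step": 1, "accuracy": 0.75}, {"step": 2, "accuracy": 0.82}]
inferred_y = infer_y_field(records)
print(inferred_y)  # accuracy
```

A rule like this depends on column order in the source file, so passing "y" explicitly in properties is the safer choice when the layout may change.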
Multi-File Y Sources
from dvc.render.converter.vega import VegaConverter
data = {
    "train.csv": [{"step": "1", "loss": "0.9"}, {"step": "2", "loss": "0.7"}],
    "eval.csv": [{"step": "1", "score": "0.6"}, {"step": "2", "score": "0.8"}],
}
# Multiple y sources with different field names
converter = VegaConverter(
    plot_id="comparison",
    data=data,
    properties={
        "x": "step",
        "y": {"train.csv": "loss", "eval.csv": "score"},
    },
)
datapoints, props = converter.flat_datapoints("workspace")
# props["y"] == "dvc_inferred_y_value" (unified because "loss" != "score")
# Each datapoint has its original value copied to "dvc_inferred_y_value"
Image Conversion
from dvc.render.converter.image import ImageConverter
data = {
    "plots/confusion_matrix.png": b'\x89PNG\r\n...',
}
converter = ImageConverter(
    plot_id="plots/confusion_matrix.png",
    data=data,
    properties={},
)
datapoints, props = converter.flat_datapoints("workspace")
# datapoints = [
#     {"rev": "workspace", "filename": "plots/confusion_matrix.png",
#      "src": "data:image;base64,iVBORw0KGgo..."}
# ]