Implementation:Iterative Dvc VegaConverter Convert

From Leeroopedia


Knowledge Sources
Domains: Visualization, Data_Transformation
Last Updated: 2026-02-10 00:00 GMT

Overview

VegaConverter is a concrete tool, provided by the DVC library, for converting parsed data records into Vega-Lite-compatible datapoint lists with standardized metadata fields.

Description

The VegaConverter class transforms raw parsed data (lists of dictionaries from CSV, JSON, or YAML files) into the flat datapoint format required by Vega-Lite chart templates. Its convert method infers the axis fields and builds a file-to-datapoints mapping. Its flat_datapoints method builds on convert: it annotates every datapoint with rev (revision), filename (source file), and field (original field name) metadata, and unifies y fields that come from multiple sources.

The class inherits from the Converter base class (at dvc/render/converter/__init__.py) and is specifically designed for Vega-type renderers. It works together with ImageConverter (at dvc/render/converter/image.py), which handles image plot types by converting raw bytes to base64-encoded data URIs or writing them to output files.
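
The base64 data URI form mentioned for image plots can be sketched with the standard library alone. This is a minimal illustration, not DVC's own code: encode_image_as_data_uri is a hypothetical helper name, and the "data:image;base64," prefix is taken from the Image Conversion example on this page.

```python
import base64


def encode_image_as_data_uri(image_bytes: bytes) -> str:
    """Hypothetical helper: wrap raw image bytes in a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # Prefix copied from the example output shown on this page.
    return f"data:image;base64,{encoded}"


# The 8-byte PNG file signature stands in for real image content.
png_signature = b"\x89PNG\r\n\x1a\n"
uri = encode_image_as_data_uri(png_signature)
print(uri)
```

Decoding the part after the comma with base64.b64decode returns the original bytes, which is what lets an HTML page embed the image without a separate file.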

The converter uses the constants FIELD ("field"), FILENAME ("filename"), INDEX ("step"), and REVISION ("rev") from dvc.render as datapoint keys, which keeps the data schema consistent across all chart types.
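
To make the schema concrete, the sketch below stamps those keys onto copies of parsed records. It is an illustration under stated assumptions rather than DVC's implementation: the constants are redefined locally so the snippet runs without DVC installed, and stamp_metadata is a hypothetical helper name.

```python
# Values mirror the constants listed on this page; redefined locally so the
# sketch does not need DVC to be installed.
FIELD = "field"
FILENAME = "filename"
INDEX = "step"
REVISION = "rev"


def stamp_metadata(records, revision, filename, y_field):
    """Hypothetical helper: copy each record and attach the schema keys."""
    stamped = []
    for record in records:
        point = dict(record)  # copy so the parsed input stays untouched
        point[REVISION] = revision
        point[FILENAME] = filename
        point[FIELD] = y_field
        stamped.append(point)
    return stamped


records = [
    {"epoch": "1", "train_loss": "0.95"},
    {"epoch": "2", "train_loss": "0.80"},
]
points = stamp_metadata(records, "workspace", "metrics/loss.csv", "train_loss")
print(points[0])
```

Because every datapoint carries the same three keys, a chart template can group or color by revision, file, or field without knowing which source produced the row.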

Usage

Use VegaConverter when you have parsed plot data and need to transform it into Vega-Lite datapoints for chart rendering. It is typically instantiated by _get_converter in dvc/render/convert.py, which selects a converter based on the renderer type, and is called from match_defs_renderers in the rendering pipeline. Use ImageConverter for image-type plots, which produce base64 data URIs instead of tabular datapoints.
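
Selecting a converter by renderer type can be pictured as a simple lookup. The sketch below is a stand-in, not the body of _get_converter: the Sketch* classes, the pick_converter function, and the "vega" and "image" type strings are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SketchConverter:
    """Stand-in base class holding the three constructor arguments."""
    plot_id: str
    data: dict = field(default_factory=dict)
    properties: dict = field(default_factory=dict)


class SketchVegaConverter(SketchConverter):
    """Stand-in for a converter that yields tabular datapoints."""


class SketchImageConverter(SketchConverter):
    """Stand-in for a converter that yields image sources."""


# Assumed type strings; the real renderer identifiers may differ.
CONVERTERS = {"vega": SketchVegaConverter, "image": SketchImageConverter}


def pick_converter(renderer_type, plot_id, data=None, properties=None):
    """Hypothetical dispatcher: map a renderer type to a converter instance."""
    try:
        converter_cls = CONVERTERS[renderer_type]
    except KeyError:
        raise ValueError(f"no converter for renderer type {renderer_type!r}") from None
    return converter_cls(plot_id, data or {}, properties or {})


converter = pick_converter("vega", "plots/loss.csv", properties={"x": "epoch"})
print(type(converter).__name__)
```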

Code Reference

Source Location

  • Repository: DVC
  • File: dvc/render/converter/vega.py
  • Lines: L292-306 (VegaConverter.convert), L192-290 (VegaConverter.flat_datapoints)
  • File: dvc/render/converter/image.py
  • Lines: L35-39 (ImageConverter.convert), L41-60 (ImageConverter.flat_datapoints)

Signature

class VegaConverter(Converter):
    def __init__(
        self,
        plot_id: str,
        data: Optional[dict] = None,
        properties: Optional[dict] = None,
    ):
        ...

    def convert(self) -> tuple[dict, dict]:
        """
        Convert the data. Fill necessary fields ('x', 'y') and return both
        generated datapoints and updated properties. x, y values and labels
        are inferred and always provided.
        """
        ...

    def flat_datapoints(
        self, revision: str,
    ) -> tuple[list[dict], dict]:
        ...


class ImageConverter(Converter):
    def convert(self) -> tuple[list[tuple[str, str, Any]], dict]:
        ...

    def flat_datapoints(
        self, revision: str,
    ) -> tuple[list[dict], dict]:
        ...

Import

from dvc.render.converter.vega import VegaConverter
from dvc.render.converter.image import ImageConverter

I/O Contract

Inputs

Name | Type | Required | Description
plot_id | str | Yes | The identifier for this plot, typically a file path (e.g., "plots/loss.csv") or a logical name from the pipeline definition.
data | Optional[dict] | No | Dictionary mapping filenames to their parsed content (list of dicts for tabular, raw bytes for images). Defaults to empty dict.
properties | Optional[dict] | No | Display properties including "x", "y", "template", "x_label", "y_label", "title". Defaults to empty dict.
revision | str | Yes | (For flat_datapoints) The revision identifier to stamp onto each datapoint's REVISION field.

Outputs

Name | Type | Description
file2datapoints | dict | (From VegaConverter.convert) Dictionary mapping filenames to lists of datapoint dictionaries, with inferred x/y fields populated.
properties | dict | (From VegaConverter.convert) Updated properties dictionary with inferred "x", "y", "x_label", and "y_label" values filled in.
datapoints | list[dict] | (From VegaConverter.flat_datapoints) Flat list of datapoint dictionaries. Each dict contains the original data fields plus "rev", "filename", and "field" metadata, and a "step" (index) field when no x field is specified. When multiple y fields exist, a "dvc_inferred_y_value" field unifies them.
properties | dict | (From VegaConverter.flat_datapoints) Final properties with "x", "y", and "anchors_y_definitions" added for template consumption.
datapoints | list[dict] | (From ImageConverter.flat_datapoints) List of dicts with "rev", "filename", and "src" (base64 data URI or file path) keys.
properties | dict | (From ImageConverter.flat_datapoints) Pass-through properties dictionary.

Usage Examples

Basic Usage

from dvc.render.converter.vega import VegaConverter

# Simulate parsed CSV data for a single file
data = {
    "metrics/loss.csv": [
        {"epoch": "1", "train_loss": "0.95", "val_loss": "1.10"},
        {"epoch": "2", "train_loss": "0.80", "val_loss": "0.92"},
        {"epoch": "3", "train_loss": "0.65", "val_loss": "0.78"},
    ]
}

# Create converter with explicit x and y
converter = VegaConverter(
    plot_id="metrics/loss.csv",
    data=data,
    properties={"x": "epoch", "y": "train_loss"},
)

# Phase 1: convert() infers properties and maps files to datapoints
file2datapoints, props = converter.convert()
# file2datapoints = {"metrics/loss.csv": [{"epoch": "1", ...}, ...]}
# props = {"x": {"metrics/loss.csv": "epoch"}, "y": {"metrics/loss.csv": "train_loss"},
#          "x_label": "epoch", "y_label": "train_loss"}

# Phase 2: flat_datapoints() adds revision metadata
datapoints, final_props = converter.flat_datapoints("workspace")
# datapoints = [
#     {"epoch": "1", "train_loss": "0.95", "val_loss": "1.10",
#      "rev": "workspace", "filename": "loss.csv", "field": "train_loss"},
#     ...
# ]

Auto-Inference of Y Field

from dvc.render.converter.vega import VegaConverter

data = {
    "metrics.json": [
        {"step": 1, "accuracy": 0.75},
        {"step": 2, "accuracy": 0.82},
    ]
}

# No y specified - converter infers last field ("accuracy")
converter = VegaConverter(
    plot_id="metrics.json",
    data=data,
    properties={},
)

datapoints, props = converter.flat_datapoints("v1.0")
# props["y"] == "accuracy" (inferred)
# props["x"] == "step" (auto-index since no x specified)
# Each datapoint has rev="v1.0"
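
The two inference rules noted in the comments above (last field as y, position index as x) can be sketched in plain Python. Both helper names are hypothetical, and the zero-based index is an assumption of this sketch rather than a confirmed detail of DVC.

```python
INDEX = "step"  # value listed for dvc.render.INDEX on this page


def infer_y_field(records):
    """Hypothetical helper: pick the last key of the first record as y."""
    if not records:
        return None
    return list(records[0])[-1]


def add_index(records, index_field=INDEX):
    """Hypothetical helper: copy records and store each one's position.

    Positions start at 0 here, which is an assumption of this sketch.
    """
    return [
        {**record, index_field: position}
        for position, record in enumerate(records)
    ]


records = [
    {"loss": 0.9, "accuracy": 0.75},
    {"loss": 0.7, "accuracy": 0.82},
]
y_field = infer_y_field(records)
indexed = add_index(records)
print(y_field, indexed)
```

Relying on key order works because Python dicts preserve insertion order, so the "last field" is the last column as parsed from the file.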

Multi-File Y Sources

from dvc.render.converter.vega import VegaConverter

data = {
    "train.csv": [{"step": "1", "loss": "0.9"}, {"step": "2", "loss": "0.7"}],
    "eval.csv": [{"step": "1", "score": "0.6"}, {"step": "2", "score": "0.8"}],
}

# Multiple y sources with different field names
converter = VegaConverter(
    plot_id="comparison",
    data=data,
    properties={
        "x": "step",
        "y": {"train.csv": "loss", "eval.csv": "score"},
    },
)

datapoints, props = converter.flat_datapoints("workspace")
# props["y"] == "dvc_inferred_y_value" (unified because "loss" != "score")
# Each datapoint has its original value copied to "dvc_inferred_y_value"
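
The unification step described in these comments can be sketched as follows. This is an illustrative reimplementation, not DVC's code: unify_y is a hypothetical helper, and it assumes at least one y source and a single y field per file.

```python
UNIFIED_Y = "dvc_inferred_y_value"  # field name quoted on this page


def unify_y(file2records, y_by_file):
    """Hypothetical helper: flatten records and unify differing y fields.

    Returns the flat datapoints and the y field name a template should read.
    Assumes y_by_file is non-empty and maps each file to one field name.
    """
    y_fields = set(y_by_file.values())
    # Only switch to the unified name when the sources disagree.
    target = UNIFIED_Y if len(y_fields) > 1 else next(iter(y_fields))
    flat = []
    for filename, y_field in y_by_file.items():
        for record in file2records.get(filename, []):
            point = dict(record)  # copy so the input is not mutated
            if target == UNIFIED_Y:
                point[UNIFIED_Y] = record[y_field]
            flat.append(point)
    return flat, target


data = {
    "train.csv": [{"step": "1", "loss": "0.9"}, {"step": "2", "loss": "0.7"}],
    "eval.csv": [{"step": "1", "score": "0.6"}, {"step": "2", "score": "0.8"}],
}
flat, y_name = unify_y(data, {"train.csv": "loss", "eval.csv": "score"})
print(y_name, [point[UNIFIED_Y] for point in flat])
```

Copying the value into one shared field lets a single chart encoding plot series whose source columns have different names, while the original columns remain available on each datapoint.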

Image Conversion

from dvc.render.converter.image import ImageConverter

data = {
    "plots/confusion_matrix.png": b'\x89PNG\r\n...',
}

converter = ImageConverter(
    plot_id="plots/confusion_matrix.png",
    data=data,
    properties={},
)

datapoints, props = converter.flat_datapoints("workspace")
# datapoints = [
#     {"rev": "workspace", "filename": "plots/confusion_matrix.png",
#      "src": "data:image;base64,iVBORw0KGgo..."}
# ]

Related Pages

Implements Principle
