Principle:Iterative Dvc Vega Lite Template Application
| Knowledge Sources | |
|---|---|
| Domains | Visualization, Template_Rendering |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Vega-Lite template application is the process of matching plot definitions to appropriate renderer types, instantiating data converters, and filling Vega-Lite templates with transformed data to produce complete chart specifications.
Description
Once data has been parsed and converted into flat datapoint lists, the next step in the visualization pipeline is to produce actual chart specifications that a rendering engine can display. Vega-Lite template application bridges the gap between transformed data and visual output by performing three coordinated tasks: grouping plot definitions across revisions, selecting the correct renderer type for each plot, and filling templates with data to produce complete chart specifications.
The grouping phase is essential because the same logical plot may have definitions across multiple revisions (when comparing experiments) and potentially across multiple configuration files. These scattered definitions must be consolidated so that all datapoints for a single logical plot are gathered into one renderer. The grouping algorithm collects all definitions that share the same plot ID, squashes their properties (with later revisions taking precedence for conflicting properties), and builds a unified datapoint list.
The renderer selection phase distinguishes between two fundamental plot types: Vega renderers for data-driven charts (line plots, scatter plots, bar charts) and Image renderers for static image files (confusion matrices, sample outputs). The selection is based on the file extension: if the plot's inner ID ends with a known image extension (PNG, JPEG, GIF, SVG, etc.), an ImageRenderer is used; otherwise, a VegaRenderer is selected.
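The extension check can be illustrated with a minimal sketch. The extension set and the function name `select_renderer_kind` are assumptions made for this example; the real list of supported extensions is defined by the rendering library and may differ.

```python
from pathlib import PurePosixPath

# Hypothetical extension set for illustration only.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg"}


def select_renderer_kind(inner_id: str) -> str:
    """Return "image" if inner_id ends with a known image extension, else "vega"."""
    suffix = PurePosixPath(inner_id).suffix.lower()
    return "image" if suffix in IMAGE_EXTENSIONS else "vega"


print(select_renderer_kind("plots/confusion.png"))
print(select_renderer_kind("metrics/loss.csv"))
```

Lower-casing the suffix keeps the match case-insensitive, so `IMG.JPEG` and `img.jpeg` are treated the same way in this sketch.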
For Vega-type plots, the template application step fills a Vega-Lite JSON template with the collected datapoints, axis configurations, and display properties. The template system supports both built-in templates (linear, confusion matrix, bar chart, etc.) and user-provided custom templates. The filled template is a complete Vega-Lite specification ready for rendering by any Vega-Lite compatible viewer.
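To make the fill step concrete, the following sketch substitutes placeholders in a small Vega-Lite template. The template, the placeholder strings (`<DATA>`, `<X>`, `<Y>`, `<TITLE>`), the `rev` field on each datapoint, and the function `fill_template` are all assumptions chosen for illustration; they are not the anchors or API of any particular library.

```python
import json

# Hypothetical minimal template; the placeholder strings are illustrative.
LINEAR_TEMPLATE = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "title": "<TITLE>",
    "data": {"values": "<DATA>"},
    "mark": "line",
    "encoding": {
        "x": {"field": "<X>", "type": "quantitative"},
        "y": {"field": "<Y>", "type": "quantitative"},
        "color": {"field": "rev", "type": "nominal"},
    },
}


def fill_template(template, datapoints, x, y, title):
    """Build a new spec in which every placeholder string is replaced."""
    values = {"<DATA>": datapoints, "<X>": x, "<Y>": y, "<TITLE>": title}

    def substitute(node):
        # Walk the nested structure and swap placeholder strings for values.
        if isinstance(node, dict):
            return {key: substitute(value) for key, value in node.items()}
        if isinstance(node, list):
            return [substitute(value) for value in node]
        if isinstance(node, str) and node in values:
            return values[node]
        return node

    return substitute(template)


# Flat datapoints from two revisions (assumed shape).
datapoints = [
    {"step": 0, "loss": 0.9, "rev": "main"},
    {"step": 1, "loss": 0.5, "rev": "main"},
    {"step": 0, "loss": 0.8, "rev": "exp-1"},
]
spec = fill_template(LINEAR_TEMPLATE, datapoints, x="step", y="loss", title="loss.csv")
print(json.dumps(spec, indent=2))
```

Because the datapoints are embedded under `data.values`, the resulting specification is self-contained and needs no external data file at render time.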
Usage
Use Vega-Lite template application when:
- Collected plot data from multiple revisions must be grouped by plot identity and converted into renderable chart objects.
- The system must automatically distinguish between data-driven charts and image plots based on file type.
- Vega-Lite templates must be filled with data and configuration to produce complete, self-contained chart specifications.
- Per-revision and per-source error tracking is needed to provide diagnostic feedback about failed data loading or conversion.
Theoretical Basis
The matching and rendering algorithm follows a group-convert-fill pattern:
FUNCTION match_defs_renderers(data, output_dir, templates_dir):
    plots_data = PlotsData(data)
    renderers = []
    // Phase 1: Group definitions by plot ID across all revisions
    groups = group_definitions(data)
    // groups: {plot_id: [(rev1, inner_id, definition), (rev2, inner_id, definition), ...]}
    FOR each plot_id, group in groups:
        all_datapoints = []
        src_errors = {} // Data loading failures, per revision and source
        def_errors = {} // Conversion failures, per revision
        properties = squash_properties(group) // Later revisions override
        // Phase 2: Select renderer type and convert data per revision
        FOR each (revision, inner_id, definition) in group:
            data_sources = infer_data_sources(inner_id, definition)
            source_data = get_data_for_sources(data, data_sources, revision)
            record_source_errors(src_errors, revision, data_sources)
            IF inner_id matches image extensions:
                renderer_class = ImageRenderer
                converter = ImageConverter(inner_id, source_data, properties)
            ELSE:
                renderer_class = VegaRenderer
                converter = VegaConverter(inner_id, source_data, properties)
            TRY:
                datapoints, rev_props = converter.flat_datapoints(revision)
                all_datapoints.extend(datapoints)
            CATCH error:
                def_errors[revision] = error
                CONTINUE
        // Phase 3: Instantiate renderer with filled template
        IF properties has no "title":
            properties["title"] = plot_id
        renderer = renderer_class(all_datapoints, plot_id, **properties)
        renderers.append(RendererWithErrors(renderer, src_errors, def_errors))
    RETURN renderers
The PlotsData.group_definitions method performs the critical grouping step:
FUNCTION group_definitions(data):
    groups = defaultdict(list)
    FOR each revision, revision_content in data:
        definitions = revision_content["definitions"]["data"]
        FOR each config_file, file_definitions in definitions:
            FOR each plot_id, definition in file_definitions:
                full_id = compose_id(config_file, plot_id)
                groups[full_id].append((revision, plot_id, definition))
    RETURN groups
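The grouping pseudocode translates fairly directly into Python. The sketch below assumes the nested input shape used in the pseudocode (revision, then `definitions`, then `data`, then config file, then plot ID). The `compose_id` scheme shown here, which joins the config file and plot ID with `::`, is a hypothetical choice for illustration.

```python
from collections import defaultdict


def compose_id(config_file: str, plot_id: str) -> str:
    """Hypothetical ID scheme: prefix the plot ID with its config file."""
    return f"{config_file}::{plot_id}" if config_file else plot_id


def group_definitions(data: dict) -> dict:
    """Collect (revision, plot_id, definition) tuples under a shared full ID."""
    groups = defaultdict(list)
    for revision, content in data.items():
        definitions = content.get("definitions", {}).get("data", {})
        for config_file, file_definitions in definitions.items():
            for plot_id, definition in file_definitions.items():
                full_id = compose_id(config_file, plot_id)
                groups[full_id].append((revision, plot_id, definition))
    return dict(groups)


# Two revisions defining the same logical plot (assumed sample data).
data = {
    "main": {"definitions": {"data": {"dvc.yaml": {"loss.csv": {"x": "step"}}}}},
    "exp-1": {"definitions": {"data": {"dvc.yaml": {"loss.csv": {"x": "epoch"}}}}},
}
groups = group_definitions(data)
print(groups)
```

Both revisions end up in a single group, which is what lets one renderer receive the datapoints for every revision of the same plot.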
The RendererWithErrors named tuple bundles each renderer with its associated errors, enabling the output layer to report which revisions or sources failed while still rendering the data that succeeded. This error-tolerant design is important for cross-revision comparisons where some revisions may lack certain files.
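A named tuple of this kind might look like the sketch below. The field names `source_errors` and `definition_errors`, the error dictionary shapes, and the helper `failure_report` are assumptions for illustration rather than a confirmed definition.

```python
from typing import Any, NamedTuple


class RendererWithErrors(NamedTuple):
    renderer: Any
    source_errors: dict      # assumed shape: {revision: {source_path: exception}}
    definition_errors: dict  # assumed shape: {revision: exception}


def failure_report(item: RendererWithErrors) -> list:
    """List human-readable failures while leaving the renderer usable."""
    lines = []
    for rev, sources in item.source_errors.items():
        for src, err in sources.items():
            lines.append(f"{rev}: could not load {src}: {err}")
    for rev, err in item.definition_errors.items():
        lines.append(f"{rev}: could not convert data: {err}")
    return lines


item = RendererWithErrors(
    renderer="<renderer placeholder>",  # stands in for a real renderer object
    source_errors={"exp-1": {"loss.csv": FileNotFoundError("missing file")}},
    definition_errors={"exp-2": ValueError("field 'step' not found")},
)
report = failure_report(item)
print("\n".join(report))
```

Keeping errors next to the renderer, instead of raising them, lets the output layer draw the revisions that succeeded and list the failures alongside the chart.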
The property squashing strategy merges the properties of every definition in the group into a single dictionary. When the same property appears more than once, the value from the last revision in the group takes precedence, so the most recent configuration wins when there are conflicts.
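The precedence rule can be sketched as a simple ordered merge. This example follows the ordering described in this section (later entries in the group override earlier ones). The function body is an illustrative assumption, and an actual implementation may iterate the group in a different order to reach its intended precedence.

```python
def squash_properties(group: list) -> dict:
    """Merge definitions in group order; later entries override earlier ones."""
    merged = {}
    for _revision, _inner_id, definition in group:
        merged.update(definition)
    return merged


group = [
    ("main", "loss.csv", {"x": "step", "title": "Loss"}),
    ("exp-1", "loss.csv", {"x": "epoch"}),
]
properties = squash_properties(group)
print(properties)
```

Here the conflicting `x` property takes the value from the later entry, while `title`, which only the first entry defines, is preserved.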