Implementation:Open compass VLMEvalKit ChartMimic Legend Evaluator

Field	Value
source	VLMEvalKit
domain	Vision, Evaluation, Chart Generation, Legend

Overview

Evaluates legend accuracy by comparing legend objects extracted from generated and golden matplotlib code in the ChartMimic benchmark.

Description

The `LegendEvaluator` class instruments matplotlib code to log legend entries and optionally their positions. It injects prefix/suffix code to capture legend properties from matplotlib axes objects, executes the modified scripts via `run_script_safe`, and evaluates the generated legends against golden references using precision, recall, and F1 metrics. The `use_position` flag controls whether legend positioning is also evaluated.
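
As a rough illustration of the scoring step, the sketch below computes precision, recall, and F1 over legend entries. It is not the VLMEvalKit code: the function name `legend_f1`, the `(x, y, label)` tuple layout, and the `tol` position tolerance are assumptions made for this example only.

```python
from typing import List, Tuple

# Hypothetical layout for one legend entry: (x, y, label).
Legend = Tuple[float, float, str]


def legend_f1(generated: List[Legend], golden: List[Legend],
              use_position: bool = True, tol: float = 10.0) -> dict:
    """Score generated legend entries against golden ones (illustrative only)."""
    if not generated or not golden:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}

    remaining = list(generated)  # each generated entry may match at most once
    n_correct = 0
    for gx, gy, glabel in golden:
        for cand in remaining:
            cx, cy, clabel = cand
            same_text = clabel == glabel
            close = abs(cx - gx) <= tol and abs(cy - gy) <= tol
            # With use_position=False only the label text has to agree.
            if same_text and (close or not use_position):
                n_correct += 1
                remaining.remove(cand)
                break

    precision = n_correct / len(generated)
    recall = n_correct / len(golden)
    f1 = 2 * precision * recall / (precision + recall) if n_correct else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

In this sketch, turning `use_position` off can only keep or raise the number of matches, since a label that agrees in text but sits elsewhere in the figure then counts as correct.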

Usage

Called internally by the ChartMimic dataset class during evaluation. It can also be instantiated directly to score a single pair of generated and golden code files.

Code Reference

Source: vlmeval/dataset/utils/chartmimic/evaluator/legend_evaluator.py, Lines: L1-194
Import: from vlmeval.dataset.utils.chartmimic.evaluator.legend_evaluator import LegendEvaluator

Key Functions:

class LegendEvaluator:
    def __call__(self, generation_code_file, golden_code_file): ...
    def _log_legends(self, code_file): ...
    def _calculate_metrics(self, generation_texts, golden_texts): ...
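
The instrumentation idea behind `_log_legends` can be sketched as wrapping the chart script in a prefix and a suffix before it is run. The snippet below is a minimal sketch under stated assumptions: the names `PREFIX`, `SUFFIX_TEMPLATE`, and `instrument` are hypothetical, the actual prefix and suffix in VLMEvalKit may differ, and the execution step (`run_script_safe`) is not reproduced here.

```python
# Prefix: select a headless backend before the chart code imports pyplot.
PREFIX = (
    "import matplotlib\n"
    "matplotlib.use('Agg')  # headless backend so no window is opened\n"
)

# Suffix: after the chart code has run, walk the current figure's axes
# and write each legend label to a log file, one label per line.
SUFFIX_TEMPLATE = '''
import matplotlib.pyplot as plt
with open({log_path!r}, "w") as _log:
    for _ax in plt.gcf().get_axes():
        _legend = _ax.get_legend()
        if _legend is None:
            continue
        for _text in _legend.get_texts():
            _log.write(_text.get_text() + "\\n")
'''


def instrument(code: str, log_path: str) -> str:
    """Wrap chart code so that running it also logs legend labels."""
    suffix = SUFFIX_TEMPLATE.format(log_path=log_path)
    return PREFIX + code.rstrip("\n") + "\n" + suffix
```

The returned string is an ordinary Python script that could be written to a temporary file and executed in a separate process. This sketch only inspects the current figure, so a script that closes its figure before finishing would log nothing.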

I/O Contract

Direction	Description
Inputs	Paths to generated and golden Python code files producing matplotlib charts
Outputs	Metrics dict with precision, recall, and F1 scores for legend matching, stored on the evaluator's `metrics` attribute

Usage Examples

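from vlmeval.dataset.utils.chartmimic.evaluator.legend_evaluator import LegendEvaluator

# use_position=True also evaluates legend positioning, not only the legend entries.
evaluator = LegendEvaluator(use_position=True)

# Calling the evaluator executes both scripts and stores the scores on the instance.
evaluator("generated_chart.py", "golden_chart.py")
print(evaluator.metrics)  # precision, recall, and F1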

Related Pages

Principle:Open_compass_VLMEvalKit_Benchmark_Dataset_Construction
