Implementation:Open compass VLMEvalKit ChartMimic Legend Evaluator
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, Evaluation, Chart Generation, Legend |
Overview
Evaluates legend accuracy by comparing legend objects extracted from generated and golden matplotlib code in the ChartMimic benchmark.
Description
The `LegendEvaluator` class instruments matplotlib code to log legend entries and optionally their positions. It injects prefix/suffix code to capture legend properties from matplotlib axes objects, executes the modified scripts via `run_script_safe`, and evaluates the generated legends against golden references using precision, recall, and F1 metrics. The `use_position` flag controls whether legend positioning is also evaluated.
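
The prefix/suffix injection described above can be sketched as follows. This is a minimal illustration, not the code in `legend_evaluator.py`: the `PREFIX` and `SUFFIX` strings, the `instrument` helper, and the `LEGEND` output tag are all hypothetical names, and the sketch only builds the instrumented source (it does not reproduce `run_script_safe`). The matplotlib calls inside the suffix (`get_fignums`, `Axes.get_legend`, `Legend.get_texts`) are standard matplotlib API.

```python
import textwrap

# Hypothetical prefix: select a non-interactive backend so the chart
# script can run headless before any pyplot import in the script itself.
PREFIX = textwrap.dedent("""\
    import matplotlib
    matplotlib.use("Agg")
""")

# Hypothetical suffix: once the chart script has finished, walk every open
# figure and print one tagged line per legend entry.
SUFFIX = textwrap.dedent("""\

    import matplotlib.pyplot as _plt
    for _num in _plt.get_fignums():
        for _ax in _plt.figure(_num).axes:
            _legend = _ax.get_legend()
            if _legend is None:
                continue
            for _text in _legend.get_texts():
                print("LEGEND\\t" + _text.get_text())
""")


def instrument(source: str) -> str:
    """Wrap chart code with the logging prefix and suffix (illustrative only)."""
    return PREFIX + source.rstrip("\n") + "\n" + SUFFIX


chart_code = (
    "import matplotlib.pyplot as plt\n"
    "plt.plot([1, 2], label='train')\n"
    "plt.legend()\n"
)
instrumented = instrument(chart_code)
```

Running the instrumented source in a subprocess and collecting the tagged lines from its output would then yield the legend labels for one script. Logging positions as well, as `use_position` implies, would need extra fields per entry; how the real evaluator encodes them is not shown on this page.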
Usage
Called internally by the ChartMimic dataset class in VLMEvalKit during evaluation.
Code Reference
- Source: `vlmeval/dataset/utils/chartmimic/evaluator/legend_evaluator.py`, Lines: L1-194
- Import: `from vlmeval.dataset.utils.chartmimic.evaluator.legend_evaluator import LegendEvaluator`
Key Functions:

    class LegendEvaluator:
        def __call__(self, generation_code_file, golden_code_file): ...
        def _log_legends(self, code_file): ...
        def _calculate_metrics(self, generation_texts, golden_texts): ...
I/O Contract
| Direction | Description |
|---|---|
| Inputs | Paths to generated and golden Python code files producing matplotlib charts |
| Outputs | Metrics dict with precision, recall, and F1 scores for legend matching, available as `evaluator.metrics` after the call |
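
To make the output concrete, here is a hedged sketch of how precision, recall, and F1 could be computed over two lists of legend labels. The `legend_prf` helper and its exact-text, multiset matching rule are assumptions for illustration; the matching logic in `_calculate_metrics`, including how positions are compared when `use_position=True`, is not documented on this page.

```python
from collections import Counter


def legend_prf(generated, golden):
    """Hypothetical helper: multiset precision/recall/F1 over legend labels."""
    if not generated or not golden:
        # Assumption: an empty side scores zero on every metric.
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}

    # Counter intersection keeps, for each label, the smaller of the two
    # counts, so a duplicated label is only matched as often as it appears
    # on both sides.
    matched = sum((Counter(generated) & Counter(golden)).values())

    precision = matched / len(generated)
    recall = matched / len(golden)
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# One spurious entry in the generated legend lowers precision but not recall.
scores = legend_prf(["Train", "Val", "Extra"], ["Train", "Val"])
```

In this example two of the three generated labels match the golden legend, so precision is 2/3, recall is 1.0, and F1 is 0.8.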
Usage Examples
    from vlmeval.dataset.utils.chartmimic.evaluator.legend_evaluator import LegendEvaluator

    evaluator = LegendEvaluator(use_position=True)
    evaluator("generated_chart.py", "golden_chart.py")
    print(evaluator.metrics)