Implementation:Open compass VLMEvalKit SArena Metrics

Field	Value
source	VLMEvalKit
domain	Vision, Evaluation, Image Generation, Metrics Orchestration

Overview

Orchestrates multiple image generation evaluation metrics for the SArena (InternSVG) benchmark through a configurable metrics registry.

Description

The `InternSVGMetrics` class takes a `MetricsConfig` dataclass and instantiates only the evaluation metrics that the config enables: FID, FID-C, CLIP Score (T2I and I2I), DINO Score, LPIPS, SSIM, PSNR, and Token Length. The `calculate_metrics` method iterates over the active metrics, computes each score, and aggregates the results. Metrics are registered as lambda builders, so a metric is constructed only when it is enabled, which keeps memory use down.
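
The registry pattern described above can be illustrated with a minimal sketch. It is not the code in `metrics.py`: `SketchConfig`, `RunningMean`, `SketchMetrics`, the flag names, and the `'pred'`/`'gt'` batch keys are all hypothetical stand-ins, and the toy metrics score plain numbers rather than images. It only shows how a flag-to-builder registry defers construction until a flag is known to be set.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SketchConfig:
    # Hypothetical flags that mimic the use_* naming; not the real MetricsConfig fields.
    use_abs_error: bool = False
    use_exact_match: bool = False


class RunningMean:
    """Toy metric: running mean of a per-pair score function."""

    def __init__(self, score_fn: Callable[[float, float], float]):
        self.score_fn = score_fn
        self.reset()

    def reset(self) -> None:
        self.total, self.count = 0.0, 0

    def update(self, preds: List[float], targets: List[float]) -> None:
        for p, t in zip(preds, targets):
            self.total += self.score_fn(p, t)
            self.count += 1

    def average(self) -> float:
        return self.total / self.count if self.count else 0.0


class SketchMetrics:
    def __init__(self, config: SketchConfig):
        # Registry maps a config flag to (metric name, builder).
        # The lambdas defer construction until the flag is known to be set.
        registry = {
            "use_abs_error": ("AbsError", lambda: RunningMean(lambda p, t: abs(p - t))),
            "use_exact_match": ("ExactMatch", lambda: RunningMean(lambda p, t: float(p == t))),
        }
        self.active: Dict[str, RunningMean] = {
            name: build()
            for flag, (name, build) in registry.items()
            if getattr(config, flag)
        }

    def calculate_metrics(self, batch: Dict[str, List[float]]) -> Dict[str, float]:
        # Update every active metric with the batch, then report running averages.
        results = {}
        for name, metric in self.active.items():
            metric.update(batch["pred"], batch["gt"])
            results[name] = metric.average()
        return results

    def reset(self) -> None:
        for metric in self.active.values():
            metric.reset()


metrics = SketchMetrics(SketchConfig(use_abs_error=True))
results = metrics.calculate_metrics({"pred": [1.0, 3.0], "gt": [2.0, 5.0]})
print(results)  # {'AbsError': 1.5}
```

Because `use_exact_match` is left at its default, its builder is never called and no `ExactMatch` object exists. With heavyweight metrics, the same deferral would avoid loading models that a run does not need.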

Usage

Called internally by the corresponding dataset class during evaluation.

Code Reference

Source: vlmeval/dataset/utils/SArena/metrics.py, Lines: L1-82
Import: from vlmeval.dataset.utils.SArena.metrics import InternSVGMetrics, MetricsConfig

Key Functions:

@dataclass
class MetricsConfig: ...

class InternSVGMetrics:
    def calculate_metrics(self, batch): ...
    def reset(self): ...

I/O Contract

Direction	Description
Inputs	A batch dict containing image data ('pred_im', 'gt_im', 'caption' as applicable), passed to calculate_metrics, and a MetricsConfig, passed at construction, specifying which metrics to use
Outputs	Dictionary mapping metric names to their computed average scores
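
To make the shape of this contract concrete, the sketch below averages one metric, PSNR, over a batch dict and returns a name-to-score dictionary. It uses the standard PSNR definition, 10 * log10(MAX^2 / MSE). The representation of each image as a flat list of pixel values, the `max_value` default, and the helper names `psnr` and `average_psnr` are assumptions made for illustration; the source does not say how `metrics.py` stores images or computes PSNR.

```python
import math
from typing import Dict, List


def psnr(pred: List[float], gt: List[float], max_value: float = 255.0) -> float:
    """Standard PSNR in dB between two equally sized pixel sequences."""
    mse = sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def average_psnr(batch: Dict[str, List[List[float]]]) -> Dict[str, float]:
    # Pair each predicted image with its ground truth, score it, then average.
    scores = [psnr(p, g) for p, g in zip(batch["pred_im"], batch["gt_im"])]
    return {"PSNR": sum(scores) / len(scores)}


# Hypothetical batch: images are flat pixel lists here (an assumption).
batch = {
    "pred_im": [[0.0, 0.0], [0.0]],
    "gt_im": [[255.0, 255.0], [25.5]],
}
psnr_result = average_psnr(batch)
```

The first pair differs by the full pixel range and the second by one tenth of it, so the two per-image scores are 0 dB and 20 dB and the reported average falls midway between them.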

Usage Examples

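from vlmeval.dataset.utils.SArena.metrics import InternSVGMetrics, MetricsConfig

# Select the metrics to compute via the use_* flags.
config = MetricsConfig(use_FID=True, use_CLIP_Score_T2I=True)
metrics = InternSVGMetrics(config, tokenizer_path="path/to/tokenizer")

# `batch` is the dict described in the I/O Contract; it is not defined in this snippet.
results = metrics.calculate_metrics(batch)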

Related Pages

Principle:Open_compass_VLMEvalKit_Benchmark_Dataset_Construction
