Implementation:Open compass VLMEvalKit SArena Mini Utils

From Leeroopedia
Revision as of 13:32, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Open_compass_VLMEvalKit_SArena_Mini_Utils.md)
Field | Value
source | VLMEvalKit
domain | Vision, Evaluation, SVG Generation, Image Quality

Overview

Provides comprehensive evaluation utilities for the SArena-Mini SVG generation benchmark, including task configuration, data management, and multi-metric scoring.

Description

This module implements the evaluation infrastructure for SArena-Mini. Its task definitions span the Icon, Illustration, and Chart categories and cover understanding, generation (T2SVG, I2SVG), and editing operations (color, crop, flip, opacity, outline, rotate, scale, translate). The module retrieves source data from HuggingFace and verifies it against an MD5 checksum (SARENA_ZIP_MD5). It defines TASK_CONFIGS, a list of tuples that map each category and task to its data file along with a boolean flag, and integrates InternSVGMetrics for quality assessment. Scoring relies on CLIP, Sentence-BERT, LPIPS, and BERTScore models to measure SVG quality along several dimensions.
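The MD5 verification step mentioned above can be illustrated with a minimal sketch. The helper names `file_md5` and `is_archive_valid` are hypothetical and are not taken from the module; only the idea of comparing a downloaded archive against an expected digest such as SARENA_ZIP_MD5 comes from the description.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_archive_valid(path, expected_md5):
    """Hypothetical check: True when the file's MD5 equals the expected digest."""
    # hexdigest() is lowercase, so normalize the expected value before comparing.
    return file_md5(path) == expected_md5.lower()
```

A caller would typically re-download the archive when such a check fails; whether the module does so is not stated here.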

Usage

Called internally by the SArena-Mini dataset class during SVG evaluation.

Code Reference

  • Source: vlmeval/dataset/utils/sarena_mini.py, Lines: L1-668
  • Import: from vlmeval.dataset.utils.sarena_mini import TASK_CONFIGS, SARENA_ROOT

Key Functions:

# Configuration constants
SARENA_ROOT = os.path.join(LMUDataRoot(), "SArena_MINI_SrcData")
TASK_CONFIGS = [("SArena-Icon", "Understanding", "Icon/understanding/sarena_un.jsonl", False), ...]

def load(file_path): ...
def dump(data, file_path): ...
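The bodies of `load` and `dump` are not shown on this page. Since the task files listed in TASK_CONFIGS use the `.jsonl` extension, the following sketch shows one plausible way to read and write such files. The names `load_jsonl` and `dump_jsonl` are hypothetical stand-ins, not the module's functions, and the module's real helpers may handle other formats or options.

```python
import json


def load_jsonl(file_path):
    """Read a JSON Lines file into a list of Python objects (one per line)."""
    records = []
    with open(file_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines rather than failing on them
                records.append(json.loads(line))
    return records


def dump_jsonl(records, file_path):
    """Write an iterable of JSON-serializable objects, one per line."""
    with open(file_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```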

I/O Contract

Direction | Description
Inputs | SVG generation predictions; reference SVG/image files; task configuration specifying evaluation type
Outputs | Multi-metric quality scores including CLIP similarity, LPIPS distance, BERTScore, and Sentence-BERT similarity
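Embedding-based scores such as CLIP similarity and Sentence-BERT similarity are commonly computed as the cosine similarity between two embedding vectors. The sketch below shows that computation on plain Python lists. It assumes the embeddings have already been produced by a model, and it is not taken from the module, whose exact scoring code is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors, in the range [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption for this sketch: treat a zero vector as having no similarity.
        return 0.0
    return dot / (norm_a * norm_b)
```

Higher values indicate closer agreement between prediction and reference, whereas LPIPS is a distance, so lower values are better for it.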

Usage Examples

# Internal usage example
from vlmeval.dataset.utils.sarena_mini import TASK_CONFIGS, SARENA_ROOT
for category, task, file_path, needs_image in TASK_CONFIGS:
    print(f"{category}/{task}: {file_path}")
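Because each TASK_CONFIGS entry is a four-element tuple, the list is easy to regroup. The sketch below collects task names per category. `EXAMPLE_CONFIGS` is a stand-in so the snippet runs without VLMEvalKit installed: its first entry is copied from this page, while the other two entries and the function name `group_tasks_by_category` are hypothetical placeholders.

```python
from collections import defaultdict

# Stand-in data: only the first tuple appears on this page; the rest are placeholders.
EXAMPLE_CONFIGS = [
    ("SArena-Icon", "Understanding", "Icon/understanding/sarena_un.jsonl", False),
    ("Example-Category", "Example-Task-A", "placeholder/a.jsonl", True),
    ("Example-Category", "Example-Task-B", "placeholder/b.jsonl", False),
]


def group_tasks_by_category(configs):
    """Map each category to the list of its task names, preserving input order."""
    grouped = defaultdict(list)
    for category, task, _file_path, _flag in configs:
        grouped[category].append(task)
    return dict(grouped)


if __name__ == "__main__":
    for category, tasks in group_tasks_by_category(EXAMPLE_CONFIGS).items():
        print(f"{category}: {', '.join(tasks)}")
```

With the real module available, passing the imported TASK_CONFIGS in place of `EXAMPLE_CONFIGS` should work as long as every entry keeps the four-field shape shown above.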
