

Principle:Open compass VLMEvalKit Dataset Base Class Hierarchy

From Leeroopedia
Source: VLMEvalKit (https://github.com/open-compass/VLMEvalKit)
Domain: Vision, Evaluation, Data_Processing
Last updated: 2026-02-14 00:00 GMT

Overview

A class hierarchy that provides base implementations for different benchmark modalities (image MCQ, VQA, video) with auto-downloading, prompt building, and evaluation capabilities.

Description

VLMEvalKit organizes benchmarks into a class hierarchy. ImageBaseDataset is the root for image benchmarks, with specialized subclasses ImageMCQDataset (TYPE='MCQ'), ImageVQADataset (TYPE='VQA'), and ImageYORNDataset (TYPE='Y/N'). VideoBaseDataset applies the same design to video benchmarks and adds frame extraction. TextBaseDataset handles text-only benchmarks.
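
The hierarchy described above can be pictured with a few stand-in classes. This is a minimal sketch, not VLMEvalKit's source: the class names and TYPE values come from the text, while the MODALITY attribute, the choice to model the video and text bases as independent roots, and the describe() helper are assumptions made for illustration.

```python
# Stand-in classes that mirror the names in the text; not the toolkit's code.
class ImageBaseDataset:
    MODALITY = "IMAGE"  # assumed attribute, used here only for illustration
    TYPE = None         # set by the task-specific subclasses


class ImageMCQDataset(ImageBaseDataset):
    TYPE = "MCQ"


class ImageVQADataset(ImageBaseDataset):
    TYPE = "VQA"


class ImageYORNDataset(ImageBaseDataset):
    TYPE = "Y/N"


class VideoBaseDataset:
    MODALITY = "VIDEO"  # modelled as a separate root (assumption)
    TYPE = None


class TextBaseDataset:
    MODALITY = "TEXT"   # modelled as a separate root (assumption)
    TYPE = None


def describe(cls):
    """Hypothetical helper: summarise a dataset class in one line."""
    return f"{cls.__name__}: modality={cls.MODALITY}, type={cls.TYPE}"


for cls in (ImageMCQDataset, ImageVQADataset, ImageYORNDataset):
    print(describe(cls))
```

Because TYPE is a class attribute, code that dispatches on the task type can read it without instantiating the dataset.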

Each base class provides:

  1. Auto-download via DATASET_URL/DATASET_MD5 class attributes and prepare_tsv()
  2. Default build_prompt() for prompt construction
  3. Abstract evaluate() for scoring
  4. dump_image() for base64 image decoding

New benchmarks subclass the appropriate base and override DATASET_URL, DATASET_MD5, and optionally build_prompt() and evaluate().
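
Two of the mechanisms listed above, checksum-gated downloading and base64 image decoding, can be sketched with the standard library alone. The function names file_md5, needs_download, and dump_image_sketch are hypothetical and do not reproduce the toolkit's prepare_tsv() or dump_image(); the demo works on a temporary file rather than fetching anything from DATASET_URL.

```python
import base64
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(path, expected_md5):
    """True when the cached file is missing or its checksum does not match."""
    return not os.path.exists(path) or file_md5(path) != expected_md5


def dump_image_sketch(b64_string, out_path):
    """Decode a base64 string and write the raw bytes to out_path."""
    with open(out_path, "wb") as fh:
        fh.write(base64.b64decode(b64_string))
    return out_path


# Demo in a temporary directory; no network access is involved.
workdir = tempfile.mkdtemp()
tsv_path = os.path.join(workdir, "toy.tsv")
with open(tsv_path, "w", encoding="utf-8", newline="") as fh:
    fh.write("index\tquestion\n0\tWhat is shown?\n")

expected = file_md5(tsv_path)
print(needs_download(tsv_path, expected))   # cached copy matches its checksum
print(needs_download(tsv_path, "0" * 32))   # checksum mismatch forces a refetch

# A placeholder payload stands in for real image bytes.
payload = base64.b64encode(b"not-a-real-image").decode("ascii")
img_path = dump_image_sketch(payload, os.path.join(workdir, "0.bin"))
```

Comparing the checksum before reusing a cached file is what lets a subclass change only the URL and MD5 values to point at a new benchmark file.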

Usage

When adding a new benchmark, choose the base class that matches the task type: ImageMCQDataset for multiple-choice, ImageVQADataset for open-ended VQA, or VideoBaseDataset for video benchmarks. Then subclass it and set the DATASET_URL and DATASET_MD5 class attributes.

Theoretical Basis

Template Method pattern: base classes define the skeleton (download → load → build prompt → evaluate) and subclasses fill in the specifics. The class hierarchy enforces consistent data handling across 100+ benchmarks.
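
A minimal sketch of the Template Method idea, under stated assumptions: BaseDatasetSketch, ToyMCQ, and the run() driver are hypothetical names, the rows are passed in directly instead of being downloaded and loaded, and the URL and checksum are placeholders. Only the hook names build_prompt() and evaluate() and the DATASET_URL/DATASET_MD5 attributes come from the text; the dictionary shape given to those attributes here is also an assumption.

```python
from abc import ABC, abstractmethod


class BaseDatasetSketch(ABC):
    """Hypothetical base class; the fixed skeleton lives in run()."""

    DATASET_URL = {}  # dataset name -> TSV URL (shape assumed)
    DATASET_MD5 = {}  # dataset name -> expected checksum (shape assumed)

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows  # stands in for the downloaded and loaded TSV

    def build_prompt(self, row):
        # Default hook: the prompt is just the question text.
        return row["question"]

    @abstractmethod
    def evaluate(self, predictions):
        """Score predictions; each benchmark supplies its own metric."""

    def run(self, model):
        # The order of steps is fixed; subclasses customise only the hooks.
        prompts = [self.build_prompt(row) for row in self.rows]
        predictions = [model(prompt) for prompt in prompts]
        return self.evaluate(predictions)


class ToyMCQ(BaseDatasetSketch):
    DATASET_URL = {"ToyMCQ": "https://example.com/ToyMCQ.tsv"}  # placeholder
    DATASET_MD5 = {"ToyMCQ": "0" * 32}                          # placeholder

    def build_prompt(self, row):
        options = "\n".join(f"{k}. {row[k]}" for k in "ABCD" if k in row)
        return f"{row['question']}\n{options}\nAnswer with the option letter."

    def evaluate(self, predictions):
        hits = sum(
            pred.strip().upper() == row["answer"]
            for pred, row in zip(predictions, self.rows)
        )
        return {"accuracy": hits / len(self.rows)}


rows = [
    {"question": "2 + 2 = ?", "A": "3", "B": "4", "answer": "B"},
    {"question": "Capital of France?", "A": "Paris", "B": "Rome", "answer": "A"},
]


def always_b(prompt):
    """Toy model that answers 'B' to every prompt."""
    return "B"


result = ToyMCQ("ToyMCQ", rows).run(always_b)
print(result)
```

Keeping the step order in the base class means a new benchmark cannot accidentally skip prompt building or scoring; it can only change how each step behaves.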
