Principle:Open compass VLMEvalKit Benchmark Dataset Construction

Field	Value
Source	https://github.com/open-compass/VLMEvalKit
Domain	Vision, Evaluation, Data_Processing
Last Updated	2026-02-14 00:00 GMT

Overview

A factory pattern that resolves benchmark dataset names to fully initialized dataset objects with auto-downloaded data and configured evaluation methods.

Description

VLMEvalKit maintains a registry of 100+ benchmark datasets across multiple modalities:

Image MCQ — Multiple-choice question benchmarks (MMBench, SEEDBench, ScienceQA, AI2D, etc.)
Image VQA — Visual question answering benchmarks (TextVQA, ChartQA, DocVQA, OCRBench, etc.)
Video — Video understanding benchmarks (MVBench, Video-MME, EgoSchema, etc.)
Text — Text-only reasoning benchmarks used as baselines

The build_dataset() factory function takes a dataset name string, looks it up across registered dataset classes, and returns a fully initialized dataset object. The construction process involves:

Name resolution — The factory searches the registered class lists (IMAGE_DATASET, VIDEO_DATASET, TEXT_DATASET, CUSTOM_DATASET) for the class that supports the given dataset name.
Automatic data downloading — Each dataset class declares DATASET_URL and DATASET_MD5 class attributes. On first use, the data is automatically downloaded and cached locally, with MD5 verification ensuring data integrity.
Initialization — The dataset class constructor loads the data (typically a TSV file) into a Pandas DataFrame, configures evaluation-specific settings, and prepares the dataset for inference.
Fallback logic — For unregistered or custom datasets, the factory supports loading local TSV files and wrapping them in generic CustomMCQDataset or CustomVQADataset classes based on column presence.
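
The fallback step can be illustrated with a short sketch. The source only says that the wrapper is chosen by column presence; the concrete rule used here (a question column is required, and option columns A and B indicate multiple choice) and the helper name pick_custom_wrapper are assumptions made for illustration, not the toolkit's confirmed logic.

```python
import csv
import io

def pick_custom_wrapper(tsv_text):
    """Guess which generic wrapper suits a local TSV, using its header row only."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    columns = set(next(reader, []))
    # Assumed rule: without a question column the file cannot be used at all.
    if "question" not in {c.lower() for c in columns}:
        return None
    # Assumed rule: option columns A and B indicate a multiple-choice benchmark.
    if {"A", "B"} <= columns:
        return "CustomMCQDataset"
    return "CustomVQADataset"

mcq_tsv = "index\tquestion\tA\tB\tanswer\n0\tWhich is larger?\tcat\tdog\tB\n"
vqa_tsv = "index\tquestion\tanswer\n0\tWhat is shown?\ta chart\n"
print(pick_custom_wrapper(mcq_tsv))
print(pick_custom_wrapper(vqa_tsv))
```

The sketch returns class names as strings so that it runs without the toolkit; a real factory would return an instance of the chosen class.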

Usage

Use when selecting a benchmark for evaluation:

The dataset name string (e.g., "MMBench_DEV_EN_V11", "AI2D_TEST") is passed to build_dataset().
The factory handles downloading, caching, integrity verification, and initialization.
The returned dataset object provides a uniform interface for iteration, inference, and evaluation.

This pattern ensures that users do not need to manually download data, configure paths, or know which class implements a specific benchmark — the factory handles all of this from a single name string.
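
A minimal sketch of what calling code might look like. To stay runnable without VLMEvalKit or a network connection, it uses a stand-in object that carries only the shared attributes described on this page; the summarize helper, the stand-in values, and the commented import path are assumptions.

```python
from types import SimpleNamespace

def summarize(dataset):
    """Describe a dataset using only the attributes every dataset class shares."""
    if dataset is None:
        return "dataset name could not be resolved"
    return f"{dataset.dataset_name} ({dataset.TYPE}): {len(dataset.data)} rows"

# Stand-in object so the sketch runs without VLMEvalKit or any download.
fake = SimpleNamespace(
    dataset_name="AI2D_TEST",
    TYPE="MCQ",
    data=[{"index": 0}, {"index": 1}],
)
print(summarize(fake))
print(summarize(None))

# With VLMEvalKit installed, the real call would look roughly like this
# (import path assumed from the repository layout):
#   from vlmeval.dataset import build_dataset
#   print(summarize(build_dataset("MMBench_DEV_EN_V11")))
```

Checking for None before touching the result matters because the factory returns None when no resolution path succeeds.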

Theoretical Basis

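The Abstract Factory pattern is a creational design pattern that provides an interface for creating families of related objects without specifying their concrete classes. VLMEvalKit applies a simplified, registry-based form of this idea: a single factory function, build_dataset(), selects the concrete class from the dataset name. This manifests as: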

Self-registration — Each dataset class declares what it supports through class attributes (DATASET_URL, DATASET_MD5, and the supported dataset names). The factory's lookup logic does not need to change when new datasets are added; a new class only has to appear in one of the registered class lists.
Abstract interface — All dataset classes share a common interface: .data (DataFrame), .dataset_name (str), .TYPE (str), and an .evaluate() method.
Lazy downloading — Data is only downloaded on first access, reducing startup time for users who only need a subset of benchmarks.
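
The shared interface can be sketched as an abstract base class. The class names BenchmarkDataset and ToyMCQ, the constructor arguments, and the evaluate() signature are hypothetical; only the four member names (.data, .dataset_name, .TYPE, .evaluate()) come from the description above, and a plain list stands in for the DataFrame.

```python
from abc import ABC, abstractmethod

class BenchmarkDataset(ABC):
    """Hypothetical base class listing the members every dataset exposes."""
    TYPE = None  # set by each subclass, e.g. "MCQ" or "VQA"

    def __init__(self, dataset_name, rows):
        self.dataset_name = dataset_name
        self.data = rows  # a pandas DataFrame in the toolkit; a list in this sketch

    @abstractmethod
    def evaluate(self, predictions):
        """Score predictions against the dataset's ground truth."""

class ToyMCQ(BenchmarkDataset):
    TYPE = "MCQ"

    def evaluate(self, predictions):
        # Accuracy: fraction of rows whose predicted letter matches the answer.
        hits = sum(p == row["answer"] for p, row in zip(predictions, self.data))
        return hits / len(self.data)

toy = ToyMCQ("toy_mcq", [{"answer": "A"}, {"answer": "C"}])
print(toy.dataset_name, toy.TYPE, toy.evaluate(["A", "B"]))
```

Because evaluate() is abstract, a subclass that forgets to implement it cannot be instantiated, which keeps the interface uniform for calling code.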

The pseudocode for this pattern is:

1. Define dataset classes with class-level attributes:
   class MMBenchDataset:
       DATASET_URL = {"MMBench_DEV_EN_V11": "https://..."}
       DATASET_MD5 = {"MMBench_DEV_EN_V11": "abc123..."}

2. Register all dataset classes:
   DATASET_CLASSES = IMAGE_DATASET + VIDEO_DATASET + TEXT_DATASET + CUSTOM_DATASET

3. build_dataset(dataset_name):
   a. Check supported_video_datasets first
   b. For each cls in DATASET_CLASSES:
      - If dataset_name in cls.DATASET_URL or cls supports dataset_name:
        - Download data if not cached (verify MD5)
        - Return cls(dataset_name)
   c. Fallback: Try loading local TSV, wrap in CustomMCQ/VQADataset
   d. Return None if all resolution fails
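
Steps 1, 2, 3b and 3d of the pseudocode can be turned into a small runnable sketch. The toy classes, dataset names, URLs and checksums below are invented placeholders, and the function is named build_dataset_sketch to make clear that it is not the toolkit's own build_dataset(); downloading and the local-TSV fallback are left out.

```python
class ToyImageMCQ:
    # Step 1: class-level attributes declare which names the class supports.
    DATASET_URL = {"Toy_MCQ_DEV": "https://example.invalid/toy_mcq_dev.tsv"}
    DATASET_MD5 = {"Toy_MCQ_DEV": "0" * 32}

    def __init__(self, dataset_name):
        self.dataset_name = dataset_name

class ToyImageVQA:
    DATASET_URL = {"Toy_VQA_TEST": "https://example.invalid/toy_vqa_test.tsv"}
    DATASET_MD5 = {"Toy_VQA_TEST": "1" * 32}

    def __init__(self, dataset_name):
        self.dataset_name = dataset_name

# Step 2: the registry is a plain concatenation of per-modality lists.
IMAGE_DATASET = [ToyImageMCQ, ToyImageVQA]
VIDEO_DATASET, TEXT_DATASET, CUSTOM_DATASET = [], [], []
DATASET_CLASSES = IMAGE_DATASET + VIDEO_DATASET + TEXT_DATASET + CUSTOM_DATASET

def build_dataset_sketch(dataset_name):
    """Steps 3b and 3d: return the first class that claims the name, else None."""
    for cls in DATASET_CLASSES:
        if dataset_name in cls.DATASET_URL:
            # A real class would download and verify the data at this point.
            return cls(dataset_name)
    return None

ds = build_dataset_sketch("Toy_VQA_TEST")
print(type(ds).__name__, ds.dataset_name)
print(build_dataset_sketch("Unknown_Benchmark"))
```

The caller never names ToyImageVQA directly; the name string alone selects the class.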

This design provides:

Extensibility — New benchmarks are added by creating a new dataset class with the appropriate class attributes.
Consistency — All datasets are accessed through the same factory interface.
Reliability — MD5 checksums ensure data integrity, and caching avoids redundant downloads.
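
The reliability point can be made concrete with a sketch of a checksum-guarded cache. The helper names md5_of and ensure_cached are hypothetical, and the download is replaced by an injected fetch function so that the example runs offline; the toolkit's actual download code may be organized differently.

```python
import hashlib
import tempfile
from pathlib import Path

def md5_of(path):
    """Hex MD5 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def ensure_cached(path, expected_md5, fetch):
    """Call fetch(path) only if the file is missing or its checksum is wrong."""
    path = Path(path)
    if path.exists() and md5_of(path) == expected_md5:
        return False  # cache hit, nothing fetched
    fetch(path)
    if md5_of(path) != expected_md5:
        raise ValueError(f"MD5 mismatch for {path}")
    return True  # file was fetched or re-fetched

payload = b"index\tquestion\n0\tWhat is shown?\n"
expected = hashlib.md5(payload).hexdigest()
calls = []

def fake_fetch(path):
    # Stand-in for an HTTP download so the sketch runs offline.
    calls.append(str(path))
    Path(path).write_bytes(payload)

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "toy.tsv"
    print(ensure_cached(target, expected, fake_fetch))  # first use
    print(ensure_cached(target, expected, fake_fetch))  # second use
    print(len(calls))
```

A corrupted or truncated cache file fails the checksum comparison and is fetched again, so a bad file is never silently reused.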

Related Pages

Implementation:Open_compass_VLMEvalKit_Build_Dataset
