Principle: OpenGVLab InternVL Benchmark Dispatch
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Benchmarking, Vision_Language |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
A unified evaluation dispatcher that routes model checkpoints to benchmark-specific evaluation scripts based on a dataset identifier.
Description
Comprehensive evaluation of vision-language models requires running many benchmarks, each with different data formats, evaluation protocols, and scoring metrics. The benchmark dispatch pattern provides a single entry point that:
- Accepts a model checkpoint path and benchmark name
- Routes to the appropriate evaluation script (VQA, multi-image, hallucination, etc.)
- Configures distributed inference via torchrun
- Passes through additional benchmark-specific arguments
This replaces the need to memorize per-benchmark commands with a single uniform interface.
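The distributed-inference step above amounts to composing a `torchrun` command line around the chosen evaluation script. A minimal sketch of that composition, assuming the script paths from the pseudocode later in this note (the real scripts' flag names beyond `--checkpoint`/`--datasets` are not shown here):

```python
# Sketch: composing a torchrun command for one benchmark run.
# Flag names beyond --checkpoint/--datasets are assumptions for illustration.
import shlex


def build_command(checkpoint: str, script: str, gpus: int = 8, extra_args=()):
    """Compose a distributed-inference command for a benchmark script."""
    cmd = [
        "torchrun", f"--nproc_per_node={gpus}",   # one worker per GPU
        script, "--checkpoint", checkpoint,
        *extra_args,                               # benchmark-specific passthrough
    ]
    return shlex.join(cmd)


print(build_command(
    "work_dirs/internvl_chat",
    "eval/vqa/evaluate_vqa.py",
    extra_args=["--datasets", "textvqa_val"],
))
# → torchrun --nproc_per_node=8 eval/vqa/evaluate_vqa.py --checkpoint work_dirs/internvl_chat --datasets textvqa_val
```

Returning the command as a string (rather than executing it) keeps the dispatcher easy to log and test; the caller can hand it to a shell or a job scheduler.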
Usage
Use this pattern when evaluating InternVL models across multiple benchmarks. Call the dispatcher script with a checkpoint path and benchmark name.
Theoretical Basis
The dispatch pattern is a simple conditional router:
# Pseudo-code: benchmark dispatch
def dispatch(checkpoint, dataset, gpus=8):
    if dataset in VQA_BENCHMARKS:
        torchrun(f'eval/vqa/evaluate_vqa.py --checkpoint {checkpoint} --datasets {dataset}', nproc=gpus)
    elif dataset == 'mantis':
        torchrun(f'eval/mantis_eval/evaluate_mantis.py --checkpoint {checkpoint}', nproc=gpus)
    elif dataset == 'mmhal':
        torchrun(f'eval/mmhal/evaluate_mmhal.py --checkpoint {checkpoint}', nproc=gpus)
    elif dataset == 'mmvet':
        # MM-Vet is scored via a single-process script, not torchrun
        python(f'eval/mmvet/evaluate_mmvet.py --checkpoint {checkpoint}')
    # ... 40+ benchmark routes
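With 40+ routes, a long if/elif chain becomes hard to maintain; a dict-based routing table is a common refactoring. A runnable sketch of that alternative, reusing the script paths from the pseudocode above (the VQA dataset identifiers in `VQA_BENCHMARKS` are illustrative, not the repo's full list):

```python
# Sketch: dict-based routing table replacing the if/elif chain.
# Dataset names in VQA_BENCHMARKS are illustrative examples.
VQA_BENCHMARKS = {"textvqa_val", "docvqa_val", "chartqa_test", "gqa_testdev"}

ROUTES = {
    "mantis": ("torchrun", "eval/mantis_eval/evaluate_mantis.py"),
    "mmhal":  ("torchrun", "eval/mmhal/evaluate_mmhal.py"),
    "mmvet":  ("python",   "eval/mmvet/evaluate_mmvet.py"),  # single-process
}


def dispatch(checkpoint: str, dataset: str, gpus: int = 8) -> list[str]:
    """Return the command to run as an argv list, without executing it."""
    if dataset in VQA_BENCHMARKS:
        launcher, script = "torchrun", "eval/vqa/evaluate_vqa.py"
        extra = ["--datasets", dataset]  # shared script, per-dataset flag
    elif dataset in ROUTES:
        launcher, script = ROUTES[dataset]
        extra = []
    else:
        raise ValueError(f"unknown benchmark: {dataset}")
    if launcher == "torchrun":
        prefix = ["torchrun", f"--nproc_per_node={gpus}"]
    else:
        prefix = [launcher]
    return prefix + [script, "--checkpoint", checkpoint] + extra
```

Returning an argv list keeps the router side-effect free, so adding a new benchmark is a one-line table entry that can be unit-tested without launching any processes.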
Supported benchmark categories:
- VQA: TextVQA, DocVQA, ChartQA, InfographicVQA, AI2D, GQA, OKVQA, VizWiz, VQAv2, OCR-VQA
- Multi-image: Mantis, MMIU, MIRB
- Hallucination: MMHal
- General: MM-Vet, POPE, ScienceQA, MathVista
- MMMU: MMMU-val, MMMU-test, MMMU-CoT
- Video: MVBench
- Others: SEED, LLaVA-Bench, TinyLVLM, MMVP
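A full evaluation sweep typically iterates a whole category through the dispatcher. A minimal sketch, assuming lower-cased benchmark identifiers (the repo's actual dataset IDs may differ):

```python
# Sketch: sweeping every benchmark in a category through the dispatcher.
# The lower-case identifiers below are assumptions, not verified dataset IDs.
CATEGORIES = {
    "multi_image":   ["mantis", "mmiu", "mirb"],
    "hallucination": ["mmhal"],
    "mmmu":          ["mmmu-val", "mmmu-test"],
}


def sweep(checkpoint, category):
    """Yield (checkpoint, dataset) pairs to feed to the dispatcher one by one."""
    for name in CATEGORIES[category]:
        yield (checkpoint, name)
```

Driving the sweep from the same category table keeps "run everything in group X" as cheap as a single dispatcher call.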