Principle: OpenCompass VLMEvalKit Image Inference Orchestration
| Field | Value |
|---|---|
| source | [VLMEvalKit](https://github.com/open-compass/VLMEvalKit) |
| domain | Vision, Evaluation, Distributed_Computing |
| last_updated | 2026-02-14 00:00 GMT |
Overview
An orchestration pattern that manages the lifecycle of VLM inference across distributed GPU ranks with checkpoint-based fault tolerance.
Description
Image inference in VLMEvalKit follows a distributed data-parallel pattern. The infer_data_job() function splits data across GPU ranks, delegates per-rank inference to infer_data(), merges results on rank 0, and handles thinking/reasoning content separation (SPLIT_THINK mode). Each rank processes its shard independently, saves intermediate results to pickle files, and supports resume from checkpoints. The pipeline also routes API models to infer_data_api() transparently.
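The per-rank checkpointing described above can be sketched as follows. This is a minimal, hypothetical simplification of what `infer_data()` does, not the real implementation: the function name `infer_shard`, the `(index, item)` shard layout, and the per-item callback are all illustrative.

```python
import os
import pickle

def infer_shard(shard, out_file, run_model):
    """Run inference over one rank's shard with pickle checkpointing.

    shard:     list of (index, item) pairs assigned to this rank
    out_file:  per-rank pickle file holding intermediate results
    run_model: callable mapping one item to one prediction
    """
    # Resume: load any predictions saved by a previous (interrupted) run.
    results = {}
    if os.path.exists(out_file):
        with open(out_file, "rb") as f:
            results = pickle.load(f)

    for idx, item in shard:
        if idx in results:  # already answered before the crash/restart
            continue
        results[idx] = run_model(item)
        # Checkpoint after each item so a failure loses at most one answer.
        with open(out_file, "wb") as f:
            pickle.dump(results, f)
    return results
```

Because completed indices are skipped on reload, re-running the same job after a failure only pays for the unfinished items.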
Usage
Use for any image-based benchmark evaluation. This is the primary inference entry point called by run.py for each model × dataset combination. Supports both local VLMs (run via torchrun for multi-GPU) and API models (queried via parallel HTTP requests).
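The transparent API-model routing mentioned above amounts to a small dispatch step. A minimal sketch, assuming the two inference paths are passed in as callables; the `is_api` attribute mirrors VLMEvalKit's model convention, but the function name and signature here are illustrative:

```python
def route_inference(model, api_path, local_path):
    """Dispatch a model to the right inference path.

    api_path:   callable for API models (parallel HTTP requests)
    local_path: callable for local VLMs (per-rank sharded inference)
    """
    # API model wrappers flag themselves; anything else is a local VLM.
    if getattr(model, "is_api", False):
        return api_path(model)
    return local_path(model)
```

Keeping the dispatch at the entry point is what lets run.py treat every model uniformly, regardless of backend.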
Theoretical Basis
Data parallelism pattern: divide the dataset across N workers, let each process its shard independently, then merge the results. Checkpoint/resume provides fault tolerance, so a failed worker can restart without redoing completed work.
Pseudocode:
- Split data by rank
- Infer per shard
- Barrier sync
- Rank 0 merges
- Save final result file
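The steps above can be sketched as a single-process simulation of the split/merge cycle. This is illustrative only: the real pipeline launches ranks via torchrun and synchronizes with a distributed barrier, while here all "ranks" run in one loop and the barrier is a no-op comment.

```python
import os
import pickle

def infer_data_job_sim(data, world_size, tmp_dir, run_model):
    """Simulate rank-sharded inference and the rank-0 merge.

    data: list of (index, item) pairs; tmp_dir: dir for per-rank pickles.
    """
    # 1. Split data by rank: interleaved slicing keeps shards balanced.
    for rank in range(world_size):
        shard = data[rank::world_size]
        preds = {idx: run_model(item) for idx, item in shard}
        with open(os.path.join(tmp_dir, f"{rank}.pkl"), "wb") as f:
            pickle.dump(preds, f)

    # 2. Barrier sync would go here in the real multi-process setting.

    # 3. Rank 0 merges every per-rank pickle into the final result.
    merged = {}
    for rank in range(world_size):
        with open(os.path.join(tmp_dir, f"{rank}.pkl"), "rb") as f:
            merged.update(pickle.load(f))

    # 4. Return in original dataset order (the final file is saved from this).
    return dict(sorted(merged.items()))
```

The interleaved `data[rank::world_size]` split is one common sharding choice; contiguous chunking works equally well as long as the merge re-sorts by index.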