Principle: OpenCompass VLMEvalKit API Model Inference
| Field | Value |
|---|---|
| source | Repo |
| domain | Vision, Evaluation, API_Integration |
Overview
A parallel execution pattern for running inference against commercial VLM APIs, with progress tracking, retries, and checkpoint-based fault tolerance.
Description
API model inference in VLMEvalKit uses thread-based parallelism to make concurrent HTTP requests to commercial VLM endpoints (GPT-4o, Claude, Gemini, etc.). The infer_data_api() function builds prompts for all dataset samples, then dispatches them to track_progress_rich(), which manages a ThreadPoolExecutor. Results are saved incrementally to a pickle file for fault tolerance, so the system can resume from partial results and filter out failed API responses.
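The resume-and-filter behavior can be sketched as follows. This is an illustration, not VLMEvalKit's actual code: the function names and the `FAIL_MARKER` sentinel are assumptions; only the general pattern (a pickle keyed by sample index, with missing or failed entries re-submitted) comes from the description above.

```python
import os
import pickle

# Assumed sentinel marking a failed API response in the saved results.
FAIL_MARKER = "Failed to obtain answer via API."

def load_partial(pkl_path):
    """Load previously saved results, or an empty dict on a first run."""
    if os.path.exists(pkl_path):
        with open(pkl_path, "rb") as f:
            return pickle.load(f)
    return {}

def pending_indices(all_indices, results, rerun_failed=True):
    """Indices still needing an API call: missing or (optionally) failed."""
    pending = []
    for idx in all_indices:
        ans = results.get(idx)
        if ans is None or (rerun_failed and ans == FAIL_MARKER):
            pending.append(idx)
    return pending
```

On restart, only `pending_indices(...)` is dispatched to the thread pool, so completed work is never repeated.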
Usage
Use when evaluating API-based models. Parallelism is controlled by api_nproc (default: 4 threads). Higher values increase throughput but may trigger provider rate limits.
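When raising api_nproc, rate-limit errors become more likely, so each worker typically wraps its request in a retry loop. A minimal sketch of that idea, assuming a generic zero-argument callable (the exception type and backoff schedule are illustrative, not VLMEvalKit's exact policy):

```python
import random
import time

def call_with_retry(api_call, max_retries=3, base_wait=1.0):
    """Retry a flaky API call with jittered exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return api_call()
        except Exception:
            if attempt == max_retries:
                raise
            # Exponential backoff with jitter eases pressure on rate limits.
            time.sleep(base_wait * (2 ** attempt) * random.uniform(0.5, 1.5))
```

Backoff matters because with N threads a single 429 response usually means N requests are about to fail; spreading the retries avoids a synchronized thundering herd.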
Theoretical Basis
Thread-pool parallelism suits I/O-bound tasks: each API call is independent, making the workload embarrassingly parallel. Progress tracking with incremental saves provides fault tolerance.
Pseudocode:
- Build all prompts
- Filter already-completed samples
- Submit to thread pool
- Save results incrementally
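The four steps above can be sketched end to end with Python's standard ThreadPoolExecutor. The function and parameter names here are illustrative stand-ins for infer_data_api() and track_progress_rich(), not the library's actual signatures:

```python
import pickle
from concurrent.futures import ThreadPoolExecutor, as_completed

def infer_data_parallel(samples, generate, out_pkl, nproc=4):
    """Sketch: build prompts, skip completed samples, fan out to a
    thread pool, and checkpoint each result as it arrives."""
    # 1. Build all prompts (here, trivially from each sample dict).
    prompts = {idx: s["prompt"] for idx, s in samples.items()}

    # 2. Filter already-completed samples from a previous partial run.
    try:
        with open(out_pkl, "rb") as f:
            results = pickle.load(f)
    except FileNotFoundError:
        results = {}
    todo = {idx: p for idx, p in prompts.items() if idx not in results}

    # 3. Submit remaining prompts to the thread pool; the work is
    # I/O-bound, so threads give real concurrency despite the GIL.
    with ThreadPoolExecutor(max_workers=nproc) as pool:
        futures = {pool.submit(generate, p): idx for idx, p in todo.items()}
        # 4. Save results incrementally so a crash loses little work.
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
            with open(out_pkl, "wb") as f:
                pickle.dump(results, f)
    return results
```

Rerunning with the same out_pkl path reuses every saved answer and only dispatches the missing ones, which is the resume behavior described above.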