Principle:Open compass VLMEvalKit Generate Inner Interface

Field	Value
source	VLMEvalKit (https://github.com/open-compass/VLMEvalKit)
domain	Vision, Model_Architecture, Software_Design
last_updated	2026-02-14 00:00 GMT

Overview

An abstract method interface that model adapters must implement to perform single-turn VLM inference on preprocessed multimodal input.

Description

The generate_inner() method is the core extension point for VLM adapters in VLMEvalKit. For local models (BaseModel subclasses), it receives a list of message dicts, already validated and preprocessed by generate(), and must return a prediction string. For API models (BaseAPI subclasses), it receives the same input but must return a (ret_code, answer, log) tuple. Because of this separation of concerns, adapter authors only need to write the model-specific inference logic; input validation, retry logic, and output processing are handled by the base classes.

Usage

Implement this method in every new VLM adapter. A BaseModel subclass returns a string. A BaseAPI subclass returns a (ret_code, answer, log) tuple of types (int, str, str), where a ret_code of 0 means success.
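
The two return contracts can be illustrated with a minimal sketch. It does not import VLMEvalKit: StubLocalAdapter and StubAPIAdapter are hypothetical stand-ins for BaseModel and BaseAPI subclasses, and the message layout (dicts with "type" and "value" keys) and the method signatures are assumptions made for illustration, so check them against the actual base classes.

```python
from typing import Dict, List, Tuple

# Assumed message layout: a list of dicts with "type" and "value" keys.
Message = List[Dict[str, str]]


class StubLocalAdapter:
    """Hypothetical stand-in for a BaseModel subclass (no vlmeval import)."""

    def generate_inner(self, message: Message, dataset=None) -> str:
        # A real adapter would run the model here; this stub only echoes.
        texts = [m["value"] for m in message if m["type"] == "text"]
        n_images = sum(1 for m in message if m["type"] == "image")
        return f"[{n_images} image(s)] " + " ".join(texts)


class StubAPIAdapter:
    """Hypothetical stand-in for a BaseAPI subclass (no vlmeval import)."""

    def generate_inner(self, inputs: Message, **kwargs) -> Tuple[int, str, str]:
        prompt = " ".join(m["value"] for m in inputs if m["type"] == "text")
        if not prompt:
            # Non-zero ret_code signals failure to the caller.
            return 1, "", "empty prompt"
        # ret_code 0 means success; the third element is a log entry.
        return 0, f"echo: {prompt}", "ok"


message = [
    {"type": "image", "value": "cat.jpg"},
    {"type": "text", "value": "What animal is this?"},
]
local_out = StubLocalAdapter().generate_inner(message)
api_out = StubAPIAdapter().generate_inner(message)
```

The local adapter hands back only the prediction, while the API adapter also reports whether the call succeeded, which is what lets a base class decide whether to retry.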

Theoretical Basis

This interface follows the Template Method pattern: the base class generate() defines the overall algorithm and delegates the variable step to generate_inner(). This ensures consistent preprocessing and error handling across all adapters.
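
The pattern can be sketched in a few lines of self-contained Python. SketchBaseModel and SketchBaseAPI are hypothetical simplifications, not the VLMEvalKit classes; the validation rule, the retry count, and the fail_msg default are assumptions chosen only to show where the fixed steps end and the delegated step begins.

```python
from abc import ABC, abstractmethod


class SketchBaseModel(ABC):
    """Hypothetical local-model base class: generate() is the template."""

    def generate(self, message):
        # Fixed step: normalise a bare string and validate each item.
        if isinstance(message, str):
            message = [{"type": "text", "value": message}]
        for item in message:
            if not isinstance(item, dict) or "type" not in item or "value" not in item:
                raise ValueError(f"malformed message item: {item!r}")
        # Variable step: delegated to the adapter.
        return self.generate_inner(message)

    @abstractmethod
    def generate_inner(self, message):
        """Return a prediction string."""


class SketchBaseAPI(ABC):
    """Hypothetical API base class: generate() adds a retry loop."""

    def __init__(self, retry=3, fail_msg="Failed to obtain answer."):
        self.retry = retry
        self.fail_msg = fail_msg

    def generate(self, message):
        for _ in range(self.retry):
            ret_code, answer, log = self.generate_inner(message)
            if ret_code == 0:  # 0 means success
                return answer
        return self.fail_msg  # every attempt failed

    @abstractmethod
    def generate_inner(self, message):
        """Return a (ret_code, answer, log) tuple."""


class ShoutingModel(SketchBaseModel):
    def generate_inner(self, message):
        texts = [m["value"] for m in message if m["type"] == "text"]
        return " ".join(texts).upper()


class FlakyAPI(SketchBaseAPI):
    """Fails a fixed number of times before succeeding."""

    def __init__(self, failures, **kwargs):
        super().__init__(**kwargs)
        self.failures = failures

    def generate_inner(self, message):
        if self.failures > 0:
            self.failures -= 1
            return 1, "", "simulated transient error"
        return 0, "a cat", "ok"


question = [{"type": "text", "value": "what is this?"}]
local_answer = ShoutingModel().generate("what is this?")
recovered = FlakyAPI(failures=2).generate(question)
gave_up = FlakyAPI(failures=5, retry=3).generate(question)
```

Neither concrete class contains validation or retry code, yet both get it through the inherited generate(), which is the consistency the pattern is meant to provide.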

Related Pages

Implementation:Open_compass_VLMEvalKit_Generate_Inner
