Implementation:Open compass VLMEvalKit Supported VLM

Field	Value
Source	VLMEvalKit (https://github.com/open-compass/VLMEvalKit)
Domain	Vision, Model_Architecture
Last Updated	2026-02-14 00:00 GMT

Overview

Concrete tool for looking up and instantiating Vision-Language Models by name using the VLMEvalKit model registry.

Description

supported_VLM is a Python dictionary defined in vlmeval/config.py that aggregates 60+ model group dictionaries. Each entry maps a model name string to a functools.partial object wrapping the model class with default arguments.
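
The aggregation pattern described above can be pictured with a minimal, self-contained sketch. The classes `LocalStub` and `APIStub`, the group names `local_group` and `api_group`, and every model name in it are hypothetical stand-ins, not identifiers from vlmeval/config.py; only the overall shape (group dictionaries of partials merged into one flat dictionary) is taken from the description.

```python
from functools import partial
from typing import Dict


class LocalStub:
    """Hypothetical stand-in for a local model class."""

    def __init__(self, model_path: str, max_new_tokens: int = 512):
        self.model_path = model_path
        self.max_new_tokens = max_new_tokens


class APIStub:
    """Hypothetical stand-in for an API wrapper class."""

    def __init__(self, model: str, temperature: float = 0.0):
        self.model = model
        self.temperature = temperature


# Each group maps a model name to a partial carrying pre-set defaults.
local_group = {
    "local-small": partial(LocalStub, model_path="org/local-small"),
    "local-large": partial(LocalStub, model_path="org/local-large", max_new_tokens=1024),
}
api_group = {
    "api-default": partial(APIStub, model="api-default", temperature=0.0),
}

# Merge every group into one flat registry (assumed aggregation pattern).
registry: Dict[str, partial] = {}
for group in (local_group, api_group):
    registry.update(group)

# Looking up a name and calling the partial runs the constructor.
model = registry["local-large"]()
```

With `dict.update`, a name that appears in two groups silently resolves to the later group's entry, so a merged registry of this shape relies on model names being unique across groups.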

The registry includes models across several categories:

Local VLMs — InternVL, InternVL2, LLaVA, LLaVA-OneVision, Qwen-VL, Qwen2-VL, MiniCPM-V, Phi3-Vision, Cambrian, Eagle, DeepSeek-VL, Molmo, and many more.
API Models — GPT-4o, GPT-4V, Claude 3/3.5, Gemini Pro/Ultra, Qwen-VL-Plus/Max, Step-1V, and other commercial VLM APIs.
Video Models — Video-LLaVA, VideoChat2, Chat-UniVi, LLaVA-NeXT-Video, and other video-capable VLMs.

The pre-configured defaults are typically arguments such as model_path, temperature, and max_tokens, along with other model-specific parameters. Calling the partial invokes the class constructor with those defaults and returns an initialized model instance ready for inference.
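
Because a `functools.partial` exposes its wrapped callable and stored keyword arguments as the attributes `func` and `keywords`, the defaults of a registry entry can be read without constructing the model. The sketch below illustrates this on a hypothetical `DemoModel` class; the helper `describe` and the model path are assumptions made for illustration and are not part of VLMEvalKit.

```python
from functools import partial


class DemoModel:
    """Hypothetical model class used only for illustration."""

    def __init__(self, model_path: str, temperature: float = 0.0, max_tokens: int = 256):
        self.model_path = model_path
        self.temperature = temperature
        self.max_tokens = max_tokens


# A registry-style entry: the class plus its pre-configured defaults.
entry = partial(DemoModel, model_path="org/demo-7b", temperature=0.2)


def describe(entry: partial) -> dict:
    """Summarize a partial without calling it (nothing is instantiated)."""
    return {"class": entry.func.__name__, "defaults": dict(entry.keywords)}


info = describe(entry)
```

Only arguments fixed in the partial show up under `defaults`; parameters left at the constructor's own default values (here `max_tokens`) are not listed.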

Usage

Use supported_VLM to instantiate any supported model by name. The returned functools.partial can be called with additional keyword arguments to override defaults:

Basic instantiation — supported_VLM[model_name]() uses all defaults.
With overrides — supported_VLM[model_name](temperature=0.5) overrides the default temperature.
Listing models — list(supported_VLM.keys()) returns all supported model names.
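
The override behaviour listed above follows from how `functools.partial` merges arguments: keyword arguments given at call time take precedence over the ones stored in the partial, and the stored defaults are left unchanged. A minimal sketch, using a hypothetical `DemoAPI` class and a hypothetical `demo_registry` in place of the real registry:

```python
from functools import partial


class DemoAPI:
    """Hypothetical API wrapper class used only for illustration."""

    def __init__(self, model: str, temperature: float = 0.0, max_tokens: int = 1024):
        self.model = model
        self.temperature = temperature
        self.max_tokens = max_tokens


demo_registry = {
    "demo-api": partial(DemoAPI, model="demo-api", temperature=0.0),
}

# Basic instantiation: every stored default is used.
default_model = demo_registry["demo-api"]()

# Override: the call-time keyword wins over the stored default.
warm_model = demo_registry["demo-api"](temperature=0.5)

# The registry entry itself is not modified by the override.
stored_temperature = demo_registry["demo-api"].keywords["temperature"]
```

Whether a given keyword is accepted still depends on the wrapped constructor: in this sketch, passing a name that `DemoAPI.__init__` does not declare would raise a `TypeError`.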

Code Reference

Source Location

Field	Value
Repository	VLMEvalKit
File	`vlmeval/config.py`
Lines (aggregation)	L1999-2019
Lines (video_models)	L20-48
Lines (api_models)	L126-658

Signature

supported_VLM: Dict[str, functools.partial]

Dictionary lookup and instantiation:

# Look up and call the partial to get a model instance
model_instance = supported_VLM[model_name]()

Import

from vlmeval.config import supported_VLM

I/O Contract

Direction	Type	Description
Input	`model_name` (str)	Key into the `supported_VLM` dictionary (e.g., `"InternVL2-8B"`, `"GPT4o"`)
Output	Model instance	BaseModel subclass for local models; BaseAPI subclass for API-based models
Raises	`KeyError`	If `model_name` is not found in the registry
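
Since an unknown name surfaces as a bare `KeyError`, callers may want to wrap the lookup. The sketch below shows one possible wrapper that suggests close matches using the standard library's `difflib.get_close_matches`. The helper `build_from_registry` is hypothetical and not part of VLMEvalKit, and the `demo_registry` factories are stubs standing in for real registry entries.

```python
import difflib
from typing import Any, Callable, Mapping


def build_from_registry(registry: Mapping[str, Callable[..., Any]],
                        model_name: str, **overrides: Any) -> Any:
    """Look up model_name and call its factory, with a friendlier KeyError."""
    try:
        factory = registry[model_name]
    except KeyError:
        # Suggest up to three similarly spelled registry keys, if any.
        close = difflib.get_close_matches(model_name, list(registry), n=3)
        hint = f" Did you mean: {', '.join(close)}?" if close else ""
        raise KeyError(f"Unknown model name {model_name!r}.{hint}") from None
    return factory(**overrides)


# Stub factories; a real registry would hold functools.partial objects.
demo_registry = {
    "InternVL2-8B": lambda **kw: ("InternVL2-8B", kw),
    "GPT4o": lambda **kw: ("GPT4o", kw),
}

result = build_from_registry(demo_registry, "GPT4o")
```

With the real registry, the same helper would be called as `build_from_registry(supported_VLM, model_name)`; registry keys are case-sensitive, so a lookup such as `"InternVL2-8b"` fails but is reported with `InternVL2-8B` as a suggestion.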

Usage Examples

Instantiating a Local VLM

from vlmeval.config import supported_VLM

# Instantiate InternVL2-8B with default configuration
model = supported_VLM["InternVL2-8B"]()

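# The model is now ready for inference.
# generate() takes a single interleaved message: image path(s) followed by the text prompt
response = model.generate(["path/to/image.jpg", "Describe this image."])
print(response)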

Instantiating an API Model

from vlmeval.smp import load_env
from vlmeval.config import supported_VLM

# Load API keys first (required for API models)
load_env()

# Instantiate GPT-4o with default configuration
model = supported_VLM["GPT4o"]()

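# Use the model for inference.
# generate() takes a single interleaved message: image path(s) followed by the text prompt
response = model.generate(["path/to/image.jpg", "What is shown in this image?"])
print(response)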

Listing All Supported Models

from vlmeval.config import supported_VLM

# Print all available model names
for name in sorted(supported_VLM.keys()):
    print(name)

Related Pages

Principle:Open_compass_VLMEvalKit_Model_Registry
