Implementation:Open compass VLMEvalKit Supported VLM
| Field | Value |
|---|---|
| Source | [VLMEvalKit](https://github.com/open-compass/VLMEvalKit) |
| Domain | Vision, Model_Architecture |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Concrete tool for looking up and instantiating Vision-Language Models by name using the VLMEvalKit model registry.
Description
supported_VLM is a Python dictionary defined in vlmeval/config.py that aggregates 60+ model group dictionaries. Each entry maps a model name string to a functools.partial object wrapping the model class with default arguments.
The registry includes models across several categories:
- Local VLMs — InternVL, InternVL2, LLaVA, LLaVA-OneVision, Qwen-VL, Qwen2-VL, MiniCPM-V, Phi3-Vision, Cambrian, Eagle, DeepSeek-VL, Molmo, and many more.
- API Models — GPT-4o, GPT-4V, Claude 3/3.5, Gemini Pro/Ultra, Qwen-VL-Plus/Max, Step-1V, and other commercial VLM APIs.
- Video Models — Video-LLaVA, VideoChat2, Chat-UniVi, LLaVA-NeXT-Video, and other video-capable VLMs.
Each functools.partial object wraps the model class constructor with pre-configured default arguments such as model_path, temperature, max_tokens, and other model-specific parameters. When called, the partial produces a fully initialized model instance ready for inference.
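The registry pattern described above can be sketched in isolation. The snippet below is a minimal illustration and not VLMEvalKit's actual code: DummyVLM, the demo_* dictionaries, and the model names are hypothetical stand-ins chosen so the example runs without any model weights or API keys.

```python
from functools import partial


class DummyVLM:
    """Hypothetical stand-in for a model class; not part of VLMEvalKit."""

    def __init__(self, model_path, temperature=0.0, max_tokens=512):
        self.model_path = model_path
        self.temperature = temperature
        self.max_tokens = max_tokens


# Each group dictionary maps a model name to a partial with pre-bound defaults.
demo_local_models = {
    "dummy-7b": partial(DummyVLM, model_path="org/dummy-7b"),
}
demo_api_models = {
    "dummy-api": partial(DummyVLM, model_path="dummy-api", temperature=0.2),
}

# Aggregate the groups into one flat registry keyed by model name.
demo_registry = {}
for group in (demo_local_models, demo_api_models):
    demo_registry.update(group)

# Calling the partial runs the constructor with the stored defaults.
model = demo_registry["dummy-7b"]()
```

Because construction is deferred until the partial is called, importing the registry itself stays cheap; nothing is loaded until a specific entry is invoked.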
Usage
Use supported_VLM to instantiate any supported model by name. The returned functools.partial can be called with additional keyword arguments to override defaults:
- Basic instantiation: supported_VLM[model_name]() uses all defaults.
- With overrides: supported_VLM[model_name](temperature=0.5) overrides the default temperature.
- Listing models: list(supported_VLM.keys()) returns all supported model names.
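The override behavior follows from how functools.partial merges keyword arguments: values passed at call time take precedence over the stored ones. The sketch below demonstrates this with a hypothetical DummyVLM class (an assumption for illustration, not a VLMEvalKit class) and also shows the standard func and keywords attributes, which can be used to inspect an entry's defaults without instantiating it.

```python
from functools import partial


class DummyVLM:
    """Hypothetical stand-in for a model class; not part of VLMEvalKit."""

    def __init__(self, model_path, temperature=0.0):
        self.model_path = model_path
        self.temperature = temperature


entry = partial(DummyVLM, model_path="org/dummy-7b", temperature=0.0)

# No arguments: every stored default is used.
default_model = entry()

# A call-time keyword replaces the stored value for that call only.
warm_model = entry(temperature=0.5)

# The partial exposes the wrapped class and its stored keyword defaults.
wrapped_class = entry.func
stored_defaults = dict(entry.keywords)
```

The override does not mutate the partial, so later calls to the same entry still see the original defaults.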
Code Reference
Source Location
| Field | Value |
|---|---|
| Repository | VLMEvalKit |
| File | vlmeval/config.py |
| Lines (aggregation) | L1999-2019 |
| Lines (video_models) | L20-48 |
| Lines (api_models) | L126-658 |
Signature
supported_VLM: Dict[str, functools.partial]
Dictionary lookup and instantiation:
# Look up and call the partial to get a model instance
model_instance = supported_VLM[model_name]()
Import
from vlmeval.config import supported_VLM
I/O Contract
| Direction | Type | Description |
|---|---|---|
| Input | model_name (str) | Key into the supported_VLM dictionary (e.g., "InternVL2-8B", "GPT4o") |
| Output | Model instance | BaseModel subclass for local models; BaseAPI subclass for API-based models |
| Raises | KeyError | If model_name is not found in the registry |
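Since an unknown name surfaces as a bare KeyError, callers may want a friendlier failure. The following is one possible wrapper, offered as a sketch under stated assumptions: build_model is a hypothetical helper (not part of VLMEvalKit), and demo_registry uses dict-producing partials as stand-ins for real model classes so the example runs on its own.

```python
import difflib
from functools import partial


def build_model(registry, model_name, **overrides):
    """Hypothetical helper: look up a factory by name and call it with overrides."""
    try:
        factory = registry[model_name]
    except KeyError:
        # Suggest near-miss names to help catch typos and casing mistakes.
        close = difflib.get_close_matches(model_name, list(registry), n=3)
        hint = f" Did you mean: {', '.join(close)}?" if close else ""
        raise KeyError(f"Unknown model {model_name!r}.{hint}") from None
    return factory(**overrides)


# Stand-in registry: each partial returns a plain dict instead of a model.
demo_registry = {
    "InternVL2-8B": partial(dict, kind="local"),
    "GPT4o": partial(dict, kind="api"),
}

instance = build_model(demo_registry, "GPT4o", temperature=0.5)

try:
    build_model(demo_registry, "InternVL2-8b")  # wrong casing on purpose
except KeyError as err:
    error_message = str(err)
```

With the real registry, the same helper would be called as build_model(supported_VLM, model_name), assuming supported_VLM has been imported from vlmeval.config.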
Usage Examples
Instantiating a Local VLM
from vlmeval.config import supported_VLM
# Instantiate InternVL2-8B with default configuration
model = supported_VLM["InternVL2-8B"]()
# The model is now ready for inference; generate takes a single list interleaving image paths and text
response = model.generate(["path/to/image.jpg", "Describe this image."])
Instantiating an API Model
from vlmeval.smp import load_env
from vlmeval.config import supported_VLM
# Load API keys first (required for API models)
load_env()
# Instantiate GPT-4o with default configuration
model = supported_VLM["GPT4o"]()
# Use the model for inference; generate takes a single list interleaving image paths and text
response = model.generate(["path/to/image.jpg", "What is shown in this image?"])
Listing All Supported Models
from vlmeval.config import supported_VLM
# Print all available model names
for name in sorted(supported_VLM.keys()):
    print(name)