Implementation:Open compass VLMEvalKit Supported VLM

From Leeroopedia
Revision as of 13:32, 16 February 2026 by Admin (auto-imported from implementations/Open_compass_VLMEvalKit_Supported_VLM.md)
Field | Value
Source | VLMEvalKit (https://github.com/open-compass/VLMEvalKit)
Domain | Vision, Model_Architecture
Last Updated | 2026-02-14 00:00 GMT

Overview

Concrete tool for looking up and instantiating Vision-Language Models by name using the VLMEvalKit model registry.

Description

supported_VLM is a Python dictionary defined in vlmeval/config.py that aggregates 60+ model group dictionaries. Each entry maps a model name string to a functools.partial object wrapping the model class with default arguments.
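
The aggregation pattern described above can be pictured with a minimal, self-contained sketch. The class DummyVLM and the names group_a, group_b, and registry are hypothetical stand-ins, not identifiers from vlmeval/config.py; only the overall shape (group dictionaries of functools.partial objects merged into one flat dictionary) is taken from the description.

```python
from functools import partial
from typing import Dict


class DummyVLM:
    """Hypothetical stand-in for a real model class."""

    def __init__(self, model_path: str, temperature: float = 0.0):
        self.model_path = model_path
        self.temperature = temperature


# Hypothetical model groups: each maps a model name to a partial with defaults.
group_a = {"dummy-small": partial(DummyVLM, model_path="org/dummy-small")}
group_b = {"dummy-large": partial(DummyVLM, model_path="org/dummy-large", temperature=0.2)}

# Merge every group into a single flat registry keyed by model name.
registry: Dict[str, partial] = {}
for group in (group_a, group_b):
    registry.update(group)

# Looking up a name returns the partial; calling it builds the instance.
model = registry["dummy-large"]()
print(model.model_path, model.temperature)
```

Because dict.update overwrites existing keys, a name that appeared in two groups would silently resolve to the later group in this sketch.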

The registry includes models across several categories:

  • Local VLMs — InternVL, InternVL2, LLaVA, LLaVA-OneVision, Qwen-VL, Qwen2-VL, MiniCPM-V, Phi3-Vision, Cambrian, Eagle, DeepSeek-VL, Molmo, and many more.
  • API Models — GPT-4o, GPT-4V, Claude 3/3.5, Gemini Pro/Ultra, Qwen-VL-Plus/Max, Step-1V, and other commercial VLM APIs.
  • Video Models — Video-LLaVA, VideoChat2, Chat-UniVi, LLaVA-NeXT-Video, and other video-capable VLMs.

Each functools.partial object wraps the model class constructor with pre-configured default arguments such as model_path, temperature, max_tokens, and other model-specific parameters. When called, the partial produces a fully initialized model instance ready for inference.

Usage

Use supported_VLM to instantiate any supported model by name. The returned functools.partial can be called with additional keyword arguments to override defaults:

  • Basic instantiation: supported_VLM[model_name]() uses all defaults.
  • With overrides: supported_VLM[model_name](temperature=0.5) overrides the default temperature.
  • Listing models: list(supported_VLM.keys()) returns all supported model names.
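
The override behaviour comes from standard functools.partial semantics: a keyword passed at call time replaces a keyword stored in the partial. The sketch below illustrates this with a hypothetical DummyVLM class and a hypothetical entry shaped like one value of supported_VLM, so it runs without VLMEvalKit installed.

```python
from functools import partial


class DummyVLM:
    """Hypothetical stand-in for a model class in the registry."""

    def __init__(self, model_path: str, temperature: float = 0.0, max_tokens: int = 512):
        self.model_path = model_path
        self.temperature = temperature
        self.max_tokens = max_tokens


# Hypothetical entry, shaped like one value of supported_VLM.
entry = partial(DummyVLM, model_path="org/dummy", temperature=0.0)

default_model = entry()               # uses the stored defaults
tuned_model = entry(temperature=0.5)  # call-time keyword replaces the stored one

# The wrapped class and stored defaults can be read without building the model.
print(entry.func.__name__, entry.keywords)
```

Reading entry.func and entry.keywords is a cheap way to check what a registry entry would do, since calling the partial for a real local model typically loads its weights.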

Code Reference

Source Location

Field | Value
Repository | VLMEvalKit
File | vlmeval/config.py
Lines (aggregation) | L1999-2019
Lines (video_models) | L20-48
Lines (api_models) | L126-658

Signature

supported_VLM: Dict[str, functools.partial]

Dictionary lookup and instantiation:

# Look up and call the partial to get a model instance
model_instance = supported_VLM[model_name]()

Import

from vlmeval.config import supported_VLM

I/O Contract

Direction | Type | Description
Input | model_name (str) | Key into the supported_VLM dictionary (e.g., "InternVL2-8B", "GPT4o")
Output | Model instance | BaseModel subclass for local models; BaseAPI subclass for API-based models
Raises | KeyError | If model_name is not found in the registry
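
Since a plain dictionary lookup raises a bare KeyError for a misspelled name, callers may want a friendlier failure. The following is one possible sketch, not part of VLMEvalKit: build_model is a hypothetical helper, and demo_registry is a stand-in whose values are partials over dict so the example runs on its own. The suggestions come from the standard-library function difflib.get_close_matches.

```python
from difflib import get_close_matches
from functools import partial
from typing import Dict


def build_model(registry: Dict[str, partial], model_name: str, **overrides):
    """Hypothetical helper: look up a name and instantiate, with a clearer error."""
    try:
        factory = registry[model_name]
    except KeyError:
        hints = get_close_matches(model_name, list(registry), n=3)
        raise KeyError(f"Unknown model {model_name!r}; close matches: {hints}") from None
    return factory(**overrides)


# Stand-in registry so the sketch runs without vlmeval installed.
demo_registry = {
    "InternVL2-8B": partial(dict, kind="local"),
    "GPT4o": partial(dict, kind="api"),
}

print(build_model(demo_registry, "GPT4o"))

try:
    build_model(demo_registry, "GPT-4o")  # misspelled on purpose
except KeyError as err:
    print(err)
```

With the real registry, the same helper would be called as build_model(supported_VLM, name), assuming every entry is callable as described in the signature above.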

Usage Examples

Instantiating a Local VLM

from vlmeval.config import supported_VLM

# Instantiate InternVL2-8B with default configuration
model = supported_VLM["InternVL2-8B"]()

# The model is now ready for inference.
# generate() takes a single message list that interleaves image paths and text.
response = model.generate(["path/to/image.jpg", "Describe this image."])

Instantiating an API Model

from vlmeval.smp import load_env
from vlmeval.config import supported_VLM

# Load API keys first (required for API models)
load_env()

# Instantiate GPT-4o with default configuration
model = supported_VLM["GPT4o"]()

# Use the model for inference.
# generate() takes a single message list that interleaves image paths and text.
response = model.generate(["path/to/image.jpg", "What is shown in this image?"])

Listing All Supported Models

from vlmeval.config import supported_VLM

# Print all available model names
for name in sorted(supported_VLM.keys()):
    print(name)
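
A flat list of names can get long, so it may help to group names by the class each entry wraps. The sketch below does this through the func attribute of functools.partial. The function group_by_class, the stub classes, and demo_registry are hypothetical and exist only to keep the example self-contained; passing supported_VLM instead is assumed to work as long as its values are partials or plain classes.

```python
from collections import defaultdict
from functools import partial


def group_by_class(registry):
    """Group model names by the class each partial wraps (via partial.func)."""
    groups = defaultdict(list)
    for name, factory in registry.items():
        # Fall back to the entry itself if it is not a partial.
        target = getattr(factory, "func", factory)
        groups[target.__name__].append(name)
    return {cls: sorted(names) for cls, names in groups.items()}


class LocalStub:
    """Hypothetical stand-in for a local model class."""


class APIStub:
    """Hypothetical stand-in for an API model class."""


demo_registry = {
    "local-b": partial(LocalStub),
    "local-a": partial(LocalStub),
    "api-a": partial(APIStub),
}

for cls_name, names in group_by_class(demo_registry).items():
    print(cls_name, names)
```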
