Principle:Open compass VLMEvalKit Model Registry

From Leeroopedia
Field Value
Source https://github.com/open-compass/VLMEvalKit
Domain Vision, Model_Architecture
Last Updated 2026-02-14 00:00 GMT

Overview

A registry pattern that maps human-readable model names to pre-configured constructor functions for instantiating Vision-Language Models.

Description

VLMEvalKit uses a centralized dictionary called supported_VLM that maps string model names (e.g., "InternVL2-8B", "GPT4o") to functools.partial objects wrapping model class constructors with default arguments (model_path, temperature, etc.). This allows uniform model instantiation via supported_VLM[model_name]() throughout the framework.
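
The mapping described above can be illustrated with a minimal, self-contained sketch. ToyVLM, toy_registry, and the "Toy-VLM-8B" name and path are hypothetical stand-ins invented for illustration; they are not part of VLMEvalKit, and the real supported_VLM dictionary holds actual model classes.

```python
from functools import partial


class ToyVLM:
    """Hypothetical stand-in for a model class; not a VLMEvalKit class."""

    def __init__(self, model_path, temperature=0.0):
        self.model_path = model_path
        self.temperature = temperature


# A one-entry toy registry: the name maps to a partial, not to an instance,
# so nothing is constructed until the entry is called.
toy_registry = {
    "Toy-VLM-8B": partial(ToyVLM, model_path="example-org/toy-vlm-8b", temperature=0.0),
}

constructor = toy_registry["Toy-VLM-8B"]  # look up by name
model = constructor()                     # instantiate with the stored defaults
print(type(model).__name__, model.model_path, model.temperature)
```

Because the registry stores callables rather than instances, importing it does not construct any model; construction happens only for the entry that is actually called.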

Key characteristics of this registry pattern in VLMEvalKit:

  • Centralized aggregation — The registry aggregates entries from 60+ model group dictionaries covering local VLMs, API models, and video models.
  • Pre-configured defaults — Each entry is a functools.partial object with default constructor arguments baked in. For example, a model entry might specify model_path, max_tokens, or temperature as defaults.
  • Uniform interface — All models, whether local (InternVL, LLaVA, Qwen2-VL, MiniCPM) or API-based (GPT-4o, Claude, Gemini), are accessed through the same dictionary lookup and instantiation call.
  • Extensible — New model families are added by defining a new group dictionary and merging it into supported_VLM.
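
The "pre-configured defaults" characteristic relies on standard functools.partial behaviour: stored keyword arguments can be inspected, and a keyword passed at call time overrides the stored one. The sketch below shows this with a hypothetical ToyAPIModel class and made-up default values. Whether a given VLMEvalKit entry is meant to be overridden this way is an assumption, not something the source states.

```python
from functools import partial


class ToyAPIModel:
    """Hypothetical stand-in for an API-backed model wrapper."""

    def __init__(self, model, temperature=0.0, max_tokens=1024):
        self.model = model
        self.temperature = temperature
        self.max_tokens = max_tokens


# Registry-style entry with defaults baked in (values are illustrative only).
entry = partial(ToyAPIModel, model="toy-api-v1", temperature=0.0, max_tokens=512)

# The baked-in defaults can be inspected without constructing anything.
print(entry.func.__name__, entry.keywords)

default_model = entry()              # uses the stored defaults
warm_model = entry(temperature=0.7)  # call-time keyword overrides the stored one
print(default_model.temperature, warm_model.temperature, warm_model.max_tokens)
```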

Usage

Use this pattern when selecting and instantiating a model for evaluation:

  1. The user provides a model name string (e.g., via command-line argument --model).
  2. The framework looks up the name in supported_VLM.
  3. The returned functools.partial object is called, producing a fully configured model instance.
  4. The model instance is then used for inference on benchmark datasets.
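
The four steps above can be sketched end to end as follows. This is an illustrative sketch, not VLMEvalKit's actual entry point: EchoModel, toy_registry, and build_model are hypothetical names, the registry stands in for supported_VLM, and the list-shaped message passed to generate is an assumption about the input format.

```python
import argparse
from functools import partial


class EchoModel:
    """Hypothetical stand-in model; a real model would run inference here."""

    def __init__(self, tag):
        self.tag = tag

    def generate(self, message):
        # Return a canned string so the sketch runs without any model weights.
        return f"[{self.tag}] received {len(message)} message parts"


# Toy registry standing in for supported_VLM.
toy_registry = {
    "Toy-Local-7B": partial(EchoModel, tag="local"),
    "Toy-API": partial(EchoModel, tag="api"),
}


def build_model(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", required=True)  # step 1: name from the CLI
    args = parser.parse_args(argv)
    if args.model not in toy_registry:             # step 2: registry lookup
        raise KeyError(
            f"Unknown model {args.model!r}; choose from {sorted(toy_registry)}"
        )
    return toy_registry[args.model]()              # step 3: call the partial


model = build_model(["--model", "Toy-API"])
reply = model.generate(["image.jpg", "What is in this image?"])  # step 4: inference
print(reply)
```

Checking membership before indexing lets the error message list the valid names instead of surfacing a bare KeyError.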

This pattern eliminates the need for users to know which class to import or what constructor arguments to provide — the registry handles all of that.

Theoretical Basis

The registry and factory patterns are well-established software design patterns that decouple object creation from object use. In VLMEvalKit, they appear as:

  • Registry — A central dictionary that maps identifiers (model name strings) to constructors.
  • Factory — The functools.partial mechanism, which wraps constructors with default arguments, acting as a factory for model instances.

The pseudocode for this pattern is:

1. For each model family (InternVL, LLaVA, GPT, Claude, ...):
   a. Define a group dictionary:
      model_group = {
          "ModelName-Variant": partial(ModelClass, model_path=..., **default_args),
          ...
      }
2. Aggregate all group dictionaries:
   supported_VLM = {}
   supported_VLM.update(intern_vl_models)
   supported_VLM.update(llava_models)
   supported_VLM.update(api_models)
   ... (60+ groups)
3. At runtime:
   a. Look up: constructor = supported_VLM[model_name]
   b. Instantiate: model = constructor()  # calls partial with defaults
   c. Use: model.generate([image_path, prompt])  # one list of image paths and text
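
A runnable rendering of this pseudocode, under stated assumptions, might look like the sketch below. The classes, group dictionaries, model names, and paths are all hypothetical placeholders. The duplicate-name check is an addition for illustration, since a plain dict.update silently overwrites an existing key; the source does not say whether VLMEvalKit performs such a check.

```python
from functools import partial


class ToyLocalVLM:
    """Hypothetical local model class."""

    def __init__(self, model_path, **kwargs):
        self.model_path = model_path
        self.kwargs = kwargs

    def generate(self, message):
        return f"{self.model_path}: {len(message)} parts"


class ToyAPIVLM:
    """Hypothetical API model class."""

    def __init__(self, model, temperature=0.0):
        self.model = model
        self.temperature = temperature

    def generate(self, message):
        return f"{self.model}: {len(message)} parts"


# Step 1: one group dictionary per model family.
toy_local_models = {
    "Toy-Local-8B": partial(ToyLocalVLM, model_path="example-org/toy-local-8b"),
}
toy_api_models = {
    "Toy-API": partial(ToyAPIVLM, model="toy-api-v1", temperature=0.0),
}

# Step 2: aggregate the groups, refusing duplicate names (illustrative check).
registry = {}
for group in (toy_local_models, toy_api_models):
    overlap = registry.keys() & group.keys()
    if overlap:
        raise ValueError(f"Duplicate model names: {sorted(overlap)}")
    registry.update(group)

# Step 3: look up, instantiate, use.
constructor = registry["Toy-Local-8B"]
model = constructor()
answer = model.generate(["image.jpg", "Describe the image."])
print(answer)
```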

Benefits of this approach:

  • Discoverability — All supported models are enumerable via supported_VLM.keys().
  • Consistency — Every model is instantiated the same way, regardless of its underlying implementation.
  • Maintainability — Adding a new model requires only adding an entry to the appropriate group dictionary.
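
The discoverability benefit can be sketched with ordinary dictionary operations. The toy registry below reuses two names mentioned earlier ("InternVL2-8B", "GPT4o") plus two hypothetical ones, with placeholder values because only the keys matter. The suggest helper is an illustrative addition built on the standard library's difflib, not a VLMEvalKit function.

```python
import difflib

# Toy registry; values are placeholders because only the keys are used here.
toy_registry = {
    "InternVL2-8B": None,
    "GPT4o": None,
    "Toy-VLM-7B": None,
    "Toy-VLM-13B": None,
}

# Enumerate every registered name, or only one family by prefix.
all_names = sorted(toy_registry.keys())
toy_family = [name for name in all_names if name.startswith("Toy-VLM")]


def suggest(name, n=3):
    """Return registered names that closely resemble a mistyped one."""
    return difflib.get_close_matches(name, list(toy_registry), n=n)


suggestions = suggest("InternVL2-8b")  # wrong case on the final letter
print(all_names)
print(toy_family)
print(suggestions)
```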
