Implementation:Open compass VLMEvalKit LongVITA

From Leeroopedia
Revision as of 13:29, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Open_compass_VLMEvalKit_LongVITA.md)
Field | Value
source | VLMEvalKit
domain | Vision, Model_Architecture

Overview

VLM adapter for the Long-VITA model enabling benchmark evaluation in VLMEvalKit.

Description

LongVITA inherits from BaseModel and wraps the Long-VITA model for use within the VLMEvalKit evaluation framework. It initializes the model and its tokenizer/processor from a HuggingFace model path (default: VITA-MLLM/Long-VITA-16K_HF) and exposes the generate_inner method for inference. It extends LongVITAWrapper, which provides the core image and video processing logic with dynamic resolution support.

Usage

Register the model in vlmeval/config.py by adding an entry to supported_VLM, then invoke it through the standard evaluation pipeline.
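The registration step can be sketched as follows. This is a minimal illustration, assuming supported_VLM behaves like a dict that maps a model name to a zero-argument factory built with functools.partial. The key 'Long-VITA-16K' is a hypothetical name chosen for the example, and the LongVITA class below is a stand-in stub so the snippet runs without VLMEvalKit or model weights.

```python
from functools import partial


# Hypothetical stand-in so the sketch runs without VLMEvalKit installed.
# In a real setup this would be: from vlmeval.vlm.long_vita import LongVITA
class LongVITA:
    def __init__(self, model_path='VITA-MLLM/Long-VITA-16K_HF', **kwargs):
        self.model_path = model_path
        self.kwargs = kwargs


# Assumed registry shape: model name -> factory that takes no arguments.
# The key 'Long-VITA-16K' is illustrative, not a confirmed registry name.
supported_VLM = {}
supported_VLM['Long-VITA-16K'] = partial(
    LongVITA, model_path='VITA-MLLM/Long-VITA-16K_HF'
)

# A pipeline can then construct the model lazily, by name.
model = supported_VLM['Long-VITA-16K']()
print(model.model_path)
```

Storing a factory rather than an instance means no weights are loaded until the entry is actually called.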

Code Reference

  • Source: vlmeval/vlm/long_vita.py, Lines: L1-817
  • Import: from vlmeval.vlm.long_vita import LongVITA

Signature:

class LongVITA(BaseModel):
    INSTALL_REQ = False
    INTERLEAVE = True
    def __init__(self, model_path='VITA-MLLM/Long-VITA-16K_HF', **kwargs): ...
    def generate_inner(self, message, dataset=None): ...

I/O Contract

Direction | Description
Inputs | message: list of dicts with type (text/image) and value; dataset: optional dataset name for custom prompting
Outputs | generate_inner() returns str (model response text)
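To make the input layout concrete, here is a small sketch that assembles a message in the list-of-dicts form described in the contract. The helper build_message is hypothetical (it is not part of VLMEvalKit), and the image path is a placeholder.

```python
def build_message(items):
    """Turn (type, value) pairs into a list of {'type', 'value'} dicts.

    Hypothetical helper for illustration; only the 'text' and 'image'
    types named in the I/O contract are accepted.
    """
    allowed = {'text', 'image'}
    message = []
    for kind, value in items:
        if kind not in allowed:
            raise ValueError(f'unsupported type: {kind!r}')
        message.append({'type': kind, 'value': value})
    return message


message = build_message([
    ('image', 'path/to/image.jpg'),  # placeholder path
    ('text', 'What is shown in this image?'),
])
print(message)
```

A list built this way is what generate_inner would receive as its message argument.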

Usage Examples

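from vlmeval.vlm.long_vita import LongVITA
model = LongVITA(model_path='path/to/model')
# message: list of dicts with 'type' (text/image) and 'value'; the paths are placeholders
message = [dict(type='image', value='path/to/image.jpg'), dict(type='text', value='Describe this image.')]
response = model.generate_inner(message)  # returns the model response text as str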
