Implementation:Open compass VLMEvalKit VideoChatOnlineV2Wrapper
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, API_Integration |
Overview
VideoChatOnlineV2Wrapper provides a VLMEvalKit API adapter for the VideoChat Online V2 vision-language model.
Description
VideoChatOnlineV2Wrapper inherits from BaseAPI and communicates with a configurable API endpoint for the VideoChat Online V2 model. It builds custom prompts for MCQ and VQA dataset types (excluding MMMU), dumps images and encodes them as base64, and exposes the endpoint URL and API key as the constructor parameters url and key. The INTERLEAVE flag is set to False, which in VLMEvalKit marks the model as not accepting interleaved image-text input.
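The base64 encoding step mentioned above can be illustrated with a minimal sketch. It is not taken from the wrapper's source: the helper names encode_image_to_base64 and build_image_data_url are hypothetical, and the data-URL layout is an assumption about what an image-accepting endpoint might expect.

```python
import base64
from pathlib import Path


def encode_image_to_base64(path: str) -> str:
    """Read a file from disk and return its bytes as a base64 ASCII string."""
    return base64.b64encode(Path(path).read_bytes()).decode("utf-8")


def build_image_data_url(path: str, mime: str = "image/jpeg") -> str:
    """Wrap the base64 string in a data URL (assumed payload format, hypothetical)."""
    return f"data:{mime};base64,{encode_image_to_base64(path)}"
```

Encoding the image inline keeps the request self-contained, at the cost of a payload roughly one third larger than the raw file.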
Usage
Use this adapter when evaluating VideoChat Online V2 models through their API endpoint.
Code Reference
- Source: vlmeval/api/video_chat_online_v2.py, Lines: L1-282
- Import: from vlmeval.api.video_chat_online_v2 import VideoChatOnlineV2Wrapper
Signature:
class VideoChatOnlineV2Wrapper(BaseAPI):
    def __init__(self, model='VideoChatOnlineV2', retry=5, wait=5,
                 url='', key='', verbose=True, system_prompt=None,
                 temperature=0.7, max_tokens=2048, proxy=None,
                 **kwargs): ...
    def generate_inner(self, inputs, **kwargs): ...
I/O Contract
| Direction | Description |
|---|---|
| Inputs | message — text/image/video content list; model-specific params via kwargs |
| Outputs | generate() returns the prediction as a str; generate_inner() returns an (int, str, str) tuple of return code, answer, and log |
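To make the relationship between the two return types concrete, here is a minimal sketch of how a caller might turn the (int, str, str) tuple into a single string with retries. It assumes that a return code of 0 signals success; generate_with_retry, FAIL_MSG, and the inner callable are hypothetical names for illustration and do not reproduce the BaseAPI implementation.

```python
import time
from typing import Any, Callable, List, Tuple

# Hypothetical fallback text returned when every attempt fails.
FAIL_MSG = "Failed to obtain answer."


def generate_with_retry(
    inner: Callable[[List[Any]], Tuple[int, str, str]],
    message: List[Any],
    retry: int = 5,
    wait: float = 5.0,
) -> str:
    """Call `inner` up to `retry` times and return the first successful answer."""
    for _ in range(retry):
        ret_code, answer, log = inner(message)
        # Assumption: ret_code == 0 means the request succeeded.
        if ret_code == 0 and answer:
            return answer
        time.sleep(wait)  # back off before the next attempt
    return FAIL_MSG
```

The retry and wait arguments mirror the constructor parameters of the same names in the signature above.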
Usage Examples
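# Example instantiation; the image path and question are placeholders
from vlmeval.api.video_chat_online_v2 import VideoChatOnlineV2Wrapper
model = VideoChatOnlineV2Wrapper(model='VideoChatOnlineV2')
message = [dict(type='image', value='example.jpg'), dict(type='text', value='Describe the image.')]
response = model.generate(message)  # prediction returned as a str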