Implementation:Open compass VLMEvalKit JTVLChatWrapper
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, API_Integration |
Overview
JTVLChatWrapper provides a VLMEvalKit API adapter for the JT-VL-Chat vision-language model from China Mobile.
Description
JTVLChatWrapper inherits from BaseAPI and communicates with the Jiutian (Nine Heavens) platform API endpoint at hl.jiutian.10086.cn. It supports custom prompt construction for MCQ datasets, handles image dumping and base64 encoding, and uses a predefined app code for authentication. The INTERLEAVE flag is set to False, meaning the adapter does not accept interleaved image-text input.
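The two preprocessing steps named above (base64 encoding of images and MCQ prompt construction) can be illustrated with a minimal sketch. The helper names `encode_image_bytes` and `build_mcq_prompt`, the prompt layout, and the closing instruction sentence are assumptions made for illustration; they are not taken from vlmeval/api/jt_vl_chat.py.

```python
import base64
from typing import Dict, Optional


def encode_image_bytes(image_bytes: bytes) -> str:
    """Return raw image bytes as base64 text, suitable for a JSON payload."""
    return base64.b64encode(image_bytes).decode('utf-8')


def build_mcq_prompt(question: str,
                     options: Dict[str, str],
                     hint: Optional[str] = None) -> str:
    """Assemble a multiple-choice prompt (hypothetical layout, not the wrapper's own)."""
    lines = []
    if hint:
        lines.append(f'Hint: {hint}')
    lines.append(f'Question: {question}')
    if options:
        lines.append('Options:')
        # Emit options in letter order so the prompt is deterministic.
        for key in sorted(options):
            lines.append(f'{key}. {options[key]}')
        lines.append("Answer with the option's letter from the given choices directly.")
    return '\n'.join(lines)
```

In practice the image bytes would come from the file that the adapter dumps to disk, for example via `open(path, 'rb').read()`.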
Usage
Use this adapter when evaluating JT-VL-Chat models through the China Mobile Jiutian AI platform API.
Code Reference
- Source: vlmeval/api/jt_vl_chat.py, Lines: L1-281
- Import: from vlmeval.api.jt_vl_chat import JTVLChatWrapper
Signature:
class JTVLChatWrapper(BaseAPI):
    def __init__(self, model='jt-vl-chat', retry=5, wait=5, api_base='',
                 app_code='', verbose=True, system_prompt=None,
                 temperature=0.7, max_tokens=2048, proxy=None,
                 **kwargs): ...
    def generate_inner(self, inputs, **kwargs): ...
I/O Contract
| Direction | Description |
|---|---|
| Inputs | message: a list of text/image/video content items; model-specific parameters via kwargs |
| Outputs | generate() returns the prediction as a str; generate_inner() returns an (int, str, str) tuple of (return code, answer, log) |
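The relationship between the two outputs can be sketched as a retry loop that calls `generate_inner()` until it reports success and then returns only the answer string. This is a simplified illustration, not the BaseAPI code: the function `generate_with_retry`, the `FAIL_MSG` text, and the convention that a return code of 0 means success are assumptions here.

```python
import time
from typing import Callable, Tuple

# Placeholder text returned when every attempt fails (assumed, not the library's constant).
FAIL_MSG = 'Failed to obtain answer via API.'


def generate_with_retry(generate_inner: Callable[..., Tuple[int, str, str]],
                        inputs,
                        retry: int = 5,
                        wait: float = 5.0,
                        **kwargs) -> str:
    """Call generate_inner up to `retry` times and return the first successful answer."""
    for attempt in range(retry):
        try:
            ret_code, answer, log = generate_inner(inputs, **kwargs)
            # Assumption: ret_code == 0 signals success.
            if ret_code == 0 and answer:
                return answer
        except Exception:
            # Treat any exception as a failed attempt and retry.
            pass
        if attempt < retry - 1:
            time.sleep(wait)  # pause between attempts
    return FAIL_MSG
```

The `retry` and `wait` defaults mirror the constructor defaults shown in the signature; any callable that returns the same three-element tuple can be passed in, which makes the loop easy to exercise without network access.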
Usage Examples
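# Example instantiation (calling generate() requires network access to the Jiutian API)
from vlmeval.api.jt_vl_chat import JTVLChatWrapper
model = JTVLChatWrapper(model='jt-vl-chat')
# A message is a list of typed items; the image path below is a placeholder
message = [dict(type='image', value='example.jpg'), dict(type='text', value='Describe this image.')]
response = model.generate(message)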
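Because INTERLEAVE is False, a mixed message presumably has to be reduced to a single text prompt plus an image before the request is built. The sketch below shows one way to do that, assuming the message is a list of dicts with `type` and `value` keys; the function `split_message` is hypothetical and the wrapper's actual handling may differ.

```python
from typing import Dict, List, Optional, Tuple


def split_message(message: List[Dict[str, str]]) -> Tuple[str, Optional[str]]:
    """Join all text items into one prompt and keep the first image path, if any."""
    texts = [item['value'] for item in message if item['type'] == 'text']
    images = [item['value'] for item in message if item['type'] == 'image']
    prompt = '\n'.join(texts)
    first_image = images[0] if images else None
    return prompt, first_image
```

Keeping only the first image is a design choice of this sketch; a dataset with several images per sample would need a different policy, such as concatenating the images beforehand.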