Implementation:Open compass VLMEvalKit JTVLChatWrapper

Field	Value
source	VLMEvalKit
domain	Vision, API_Integration

Overview

JTVLChatWrapper provides a VLMEvalKit API adapter for the JT-VL-Chat vision-language model from China Mobile.

Description

JTVLChatWrapper inherits from BaseAPI and communicates with the Jiutian (Nine Heavens) platform API endpoint at hl.jiutian.10086.cn. It supports custom prompt construction for MCQ datasets, dumps dataset images to disk and base64-encodes them for the request, and authenticates with a predefined app code. The INTERLEAVE flag is set to False, meaning the wrapper does not accept interleaved image-text input.
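The two preprocessing steps mentioned above, base64 image encoding and MCQ prompt construction, can be sketched roughly as follows. This is a minimal illustration, not the wrapper's actual code: the helper names `encode_image_base64` and `build_mcq_prompt`, the prompt layout, and the closing instruction sentence are all assumptions made for the example.

```python
import base64
import string


def encode_image_base64(image_path):
    """Read an image file and return its contents as a base64 text string."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


def build_mcq_prompt(question, options, hint=None):
    """Assemble a multiple-choice prompt (hypothetical layout).

    `options` maps option letters ("A", "B", ...) to option text.
    """
    parts = []
    if hint:
        parts.append(f"Hint: {hint}")
    parts.append(f"Question: {question}")
    # Emit options in alphabetical order, skipping letters that are absent.
    for letter in string.ascii_uppercase:
        if letter in options:
            parts.append(f"{letter}. {options[letter]}")
    if options:
        # Assumed instruction text; the real wrapper may word this differently.
        parts.append("Answer with the option's letter from the given choices directly.")
    return "\n".join(parts)
```

A call such as `build_mcq_prompt("What is shown?", {"A": "cat", "B": "dog"})` yields a four-line prompt: the question, the two lettered options, and the answer instruction.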

Usage

Use this adapter when evaluating JT-VL-Chat models through the China Mobile Jiutian AI platform API.

Code Reference

Source: vlmeval/api/jt_vl_chat.py, Lines: L1-281
Import: from vlmeval.api.jt_vl_chat import JTVLChatWrapper

Signature:

class JTVLChatWrapper(BaseAPI):
    def __init__(self, model='jt-vl-chat', retry=5, wait=5, api_base='',
                 app_code='', verbose=True, system_prompt=None,
                 temperature=0.7, max_tokens=2048, proxy=None,
                 **kwargs): ...
    def generate_inner(self, inputs, **kwargs): ...

I/O Contract

Direction	Description
Inputs	message: a list of text/image/video content items; model-specific parameters via kwargs
Outputs	generate() returns the prediction as a str; generate_inner() returns an (int, str, str) tuple
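To make the contract above concrete, here is a small self-contained sketch that does not import VLMEvalKit. It assumes the message is a list of dicts with `type` and `value` keys, and that the tuple is (return code, answer, log) with 0 signalling success. The names `split_message`, `call_with_retry`, and `stub_generate_inner`, as well as the failure string, are hypothetical stand-ins; the real BaseAPI retry logic may differ (for example, it may wait between attempts).

```python
def split_message(message):
    """Flatten a list of {'type', 'value'} items into (prompt_text, image_paths)."""
    texts = [item["value"] for item in message if item["type"] == "text"]
    images = [item["value"] for item in message if item["type"] == "image"]
    return "\n".join(texts), images


def call_with_retry(generate_inner, message, retry=5, fail_msg="Failed to obtain answer."):
    """Call generate_inner until it reports success (assumed: ret_code == 0)."""
    for _ in range(retry):
        ret_code, answer, log = generate_inner(message)
        if ret_code == 0 and answer != "":
            return answer
    # All attempts failed: return the placeholder failure message.
    return fail_msg


def stub_generate_inner(message):
    """Stand-in for a real API call; returns an (int, str, str) tuple."""
    prompt, images = split_message(message)
    return 0, f"{len(images)} image(s); prompt: {prompt}", "stub log"


message = [
    {"type": "image", "value": "demo.jpg"},  # placeholder path
    {"type": "text", "value": "Describe the image."},
]
print(call_with_retry(stub_generate_inner, message))
```

Separating the tuple-returning inner call from the string-returning outer call keeps retry and failure handling in one place, which appears to be the purpose of the generate()/generate_inner() split.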

Usage Examples

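# Example instantiation
from vlmeval.api.jt_vl_chat import JTVLChatWrapper
model = JTVLChatWrapper(model='jt-vl-chat')
# message: list of dicts with 'type' and 'value' keys ('demo.jpg' is a placeholder path)
message = [dict(type='image', value='demo.jpg'), dict(type='text', value='Describe this image.')]
response = model.generate(message)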
