Implementation:Open compass VLMEvalKit TaiyiWrapper
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, API_Integration |
Overview
TaiyiWrapper provides a VLMEvalKit API adapter for Megvii's Taiyi vision-language model.
Description
TaiyiWrapper inherits from BaseAPI and communicates with the Taiyi chat completions endpoint at taiyi.megvii.com. It supports custom prompt construction for Y/N, MCQ, and VQA dataset types, handles image-first reordering for single-image inputs, and encodes images to base64 for API transmission. Authentication uses the TAIYI_API_KEY environment variable.
Usage
Use this adapter when evaluating Megvii Taiyi vision-language models through the Taiyi API.
Code Reference
- Source:
vlmeval/api/taiyi.py, Lines: L1-185 - Import:
from vlmeval.api.taiyi import TaiyiWrapper
Signature:
class TaiyiWrapper(BaseAPI):
def __init__(self, model='taiyi', retry=5, key=None, verbose=False,
system_prompt=None, temperature=0, timeout=60,
url='https://taiyi.megvii.com/v1/chat/completions',
max_tokens=1024, **kwargs): ...
def generate_inner(self, inputs, **kwargs): ...
I/O Contract
| Direction | Description |
|---|---|
| Inputs | message — text/image/video content list; model-specific params via kwargs |
| Outputs | generate() returns str prediction; generate_inner() returns (int, str, str) tuple |
Usage Examples
# Example instantiation
model = TaiyiWrapper(model='taiyi')
response = model.generate(message)