Implementation:Open compass VLMEvalKit Qwen2VLAPI
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, API_Integration |
Overview
Qwen2VLAPI provides a VLMEvalKit API adapter for Alibaba Qwen2-VL vision-language models via DashScope.
Description
Qwen2VLAPI inherits from both Qwen2VLPromptMixin and BaseAPI, combining dataset-specific prompt construction with API communication. It calls the models through the DashScope SDK, accepts configurable pixel bounds (min_pixels, max_pixels) for image inputs, and exposes fine-grained generation parameters including top_p, top_k, repetition_penalty, and seed. The API key is taken from the key argument or, when that is None, from the DASHSCOPE_API_KEY environment variable.
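The composition and key lookup described above can be illustrated with a minimal, self-contained sketch. The classes `PromptMixin`, `ApiBase`, and `SketchAPI` are hypothetical stand-ins, not VLMEvalKit's actual classes, and the `max_pixels` value is only illustrative. Only the `DASHSCOPE_API_KEY` variable name and the parameter names and defaults come from this page.

```python
import os


class PromptMixin:
    """Hypothetical stand-in for a prompt-building mixin."""

    def build_prompt(self, question: str) -> str:
        return f"Question: {question}\nAnswer:"


class ApiBase:
    """Hypothetical stand-in for an API base class."""

    def __init__(self, retry: int = 3, **kwargs):
        self.retry = retry


class SketchAPI(PromptMixin, ApiBase):
    """Combines prompt construction with API settings via multiple inheritance."""

    def __init__(self, key=None, min_pixels=None, max_pixels=None,
                 top_p=0.001, top_k=1, repetition_penalty=1.0, seed=3407, **kwargs):
        # Assumed behavior: fall back to the environment variable when no key is passed.
        self.key = key if key is not None else os.environ.get("DASHSCOPE_API_KEY")
        if self.key is None:
            raise ValueError("Set DASHSCOPE_API_KEY or pass key=...")
        self.min_pixels = min_pixels
        self.max_pixels = max_pixels
        # Generation parameters gathered in one place for the request.
        self.generate_kwargs = dict(top_p=top_p, top_k=top_k,
                                    repetition_penalty=repetition_penalty, seed=seed)
        super().__init__(**kwargs)


# Illustrative values only; the key is a placeholder, not a real credential.
api = SketchAPI(key="sk-placeholder", max_pixels=1280 * 28 * 28)
```

Because the mixin defines no `__init__`, `super().__init__(**kwargs)` resolves to the base class, so the mixin contributes only prompt-building methods while the base class owns connection settings such as the retry count.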
Usage
Use this adapter when evaluating Qwen2-VL vision-language models (such as qwen-vl-max-0809) through the Alibaba DashScope API.
Code Reference
- Source: vlmeval/api/qwen_vl_api.py, Lines: L1-218
- Import: from vlmeval.api.qwen_vl_api import Qwen2VLAPI
Signature:
class Qwen2VLAPI(Qwen2VLPromptMixin, BaseAPI):
    def __init__(self, model='qwen-vl-max-0809', key=None,
                 min_pixels=None, max_pixels=None, max_length=1024,
                 top_p=0.001, top_k=1, temperature=0.01,
                 repetition_penalty=1.0, presence_penalty=0.0,
                 seed=3407, use_custom_prompt=True, **kwargs): ...
    def generate_inner(self, inputs, **kwargs): ...
I/O Contract
| Direction | Description |
|---|---|
| Inputs | message — text/image/video content list; model-specific params via kwargs |
| Outputs | generate() returns str prediction; generate_inner() returns (int, str, str) tuple |
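To make the contract concrete, here is a minimal sketch of how a wrapper might unwrap the tuple from `generate_inner()` into the string that `generate()` returns. `FakeAPI`, its retry loop, its failure message, and the reading of the tuple as (return code, answer, log) with 0 meaning success are assumptions for illustration and make no network calls. The list-of-dicts message with `type` and `value` keys follows VLMEvalKit's usual convention, and the image path is a placeholder.

```python
class FakeAPI:
    """Hypothetical stand-in; no network calls are made."""

    fail_msg = "Failed to obtain answer via API."

    def __init__(self, retry: int = 3):
        self.retry = retry

    def generate_inner(self, inputs, **kwargs):
        # Assumed tuple layout: (return code, answer, log); 0 signals success.
        texts = [m["value"] for m in inputs if m["type"] == "text"]
        if not texts:
            return -1, "", "no text content in message"
        return 0, f"echo: {' '.join(texts)}", "Succeeded!"

    def generate(self, message, **kwargs) -> str:
        # Retry until generate_inner reports success, then return only the answer.
        for _ in range(self.retry):
            ret_code, answer, log = self.generate_inner(message, **kwargs)
            if ret_code == 0 and answer:
                return answer
        return self.fail_msg


message = [
    {"type": "image", "value": "example.jpg"},  # placeholder path
    {"type": "text", "value": "Describe this image."},
]
prediction = FakeAPI().generate(message)
```

Callers see only the answer string; the return code and log stay inside the wrapper, where they can drive retries and diagnostics.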
Usage Examples
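# Example instantiation (DASHSCOPE_API_KEY must be set in the environment)
from vlmeval.api.qwen_vl_api import Qwen2VLAPI
model = Qwen2VLAPI(model='qwen-vl-max-0809')
# Interleaved image/text message; the image path is a placeholder
message = [dict(type='image', value='example.jpg'), dict(type='text', value='Describe this image.')]
response = model.generate(message)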