Implementation:Open compass VLMEvalKit Qwen2VLAPI

Field	Value
source	VLMEvalKit
domain	Vision, API_Integration

Overview

Qwen2VLAPI is the VLMEvalKit API adapter for Alibaba's Qwen2-VL vision-language models, accessed through DashScope.

Description

Qwen2VLAPI inherits from both Qwen2VLPromptMixin and BaseAPI, combining dataset-specific prompt construction with API communication. It calls the model through the DashScope SDK, accepts optional pixel resolution bounds (min_pixels, max_pixels) for image processing, and exposes fine-grained generation parameters: max_length, top_p, top_k, temperature, repetition_penalty, presence_penalty, and seed. The defaults (top_p=0.001, top_k=1, temperature=0.01, seed=3407) favor near-deterministic output. Authentication uses the DASHSCOPE_API_KEY environment variable; a key can also be passed through the key argument.
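The configuration described above can be sketched without the SDK. This is a minimal illustration, not the adapter's actual code: the helper names resolve_api_key, build_generation_config, and build_image_item are hypothetical, the fallback from key to the environment variable is an assumption based on key=None in the signature, and the dict keys simply mirror the constructor argument names rather than confirmed DashScope request fields.

```python
import os


def resolve_api_key(key=None, env_var="DASHSCOPE_API_KEY"):
    """Hypothetical helper: prefer an explicit key, else read the environment."""
    resolved = key if key is not None else os.environ.get(env_var)
    if not resolved:
        raise ValueError(f"No API key found: pass key=... or set {env_var}")
    return resolved


def build_generation_config(max_length=1024, top_p=0.001, top_k=1,
                            temperature=0.01, repetition_penalty=1.0,
                            presence_penalty=0.0, seed=3407):
    """Hypothetical helper: collect the generation parameters in one dict.

    Defaults are copied from the documented constructor signature.
    """
    return dict(
        max_length=max_length,
        top_p=top_p,
        top_k=top_k,
        temperature=temperature,
        repetition_penalty=repetition_penalty,
        presence_penalty=presence_penalty,
        seed=seed,
    )


def build_image_item(image_url, min_pixels=None, max_pixels=None):
    """Hypothetical helper: attach pixel bounds only when they are set.

    The item layout is an assumption made for illustration.
    """
    item = {"type": "image", "image": image_url}
    if min_pixels is not None:
        item["min_pixels"] = min_pixels
    if max_pixels is not None:
        item["max_pixels"] = max_pixels
    return item
```

Leaving min_pixels and max_pixels as None keeps them out of the item entirely, so the service-side defaults would apply in that case.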

Usage

Use this adapter when evaluating Qwen2-VL vision-language models (such as qwen-vl-max-0809) through the Alibaba DashScope API.

Code Reference

Source: vlmeval/api/qwen_vl_api.py, Lines: L1-218
Import: from vlmeval.api.qwen_vl_api import Qwen2VLAPI

Signature:

class Qwen2VLAPI(Qwen2VLPromptMixin, BaseAPI):
    def __init__(self, model='qwen-vl-max-0809', key=None,
                 min_pixels=None, max_pixels=None, max_length=1024,
                 top_p=0.001, top_k=1, temperature=0.01,
                 repetition_penalty=1.0, presence_penalty=0.0,
                 seed=3407, use_custom_prompt=True, **kwargs): ...
    def generate_inner(self, inputs, **kwargs): ...

I/O Contract

Direction	Description
Inputs	message: a list of text/image/video content items; model-specific parameters via kwargs
Outputs	generate() returns the prediction as a str; generate_inner() returns an (int, str, str) tuple of return code, answer, and log
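The relationship between the two outputs can be sketched as a retry loop around generate_inner(). This is a simplified illustration under assumptions, not BaseAPI's actual implementation: generate_with_retry and make_flaky_backend are hypothetical names, the fake backend stands in for a real API call, a return code of 0 is assumed to mean success, and the message uses a list of dicts with type and value keys.

```python
import time


def generate_with_retry(generate_inner, message, retry=3, wait=0.0,
                        fail_msg="Failed to obtain answer via API."):
    """Call generate_inner until it reports success, then return the answer."""
    for _ in range(retry):
        ret_code, answer, log = generate_inner(message)
        if ret_code == 0 and answer:
            return answer
        time.sleep(wait)  # back off before the next attempt
    return fail_msg


def make_flaky_backend(failures):
    """Build a fake generate_inner that fails a fixed number of times first."""
    state = {"calls": 0}

    def generate_inner(message):
        state["calls"] += 1
        if state["calls"] <= failures:
            return -1, "", "simulated error"
        text = " ".join(m["value"] for m in message if m["type"] == "text")
        return 0, f"echo: {text}", "ok"

    return generate_inner


message = [
    dict(type="image", value="cat.jpg"),  # placeholder path, never opened here
    dict(type="text", value="What is this?"),
]
print(generate_with_retry(make_flaky_backend(1), message))
```

The caller only ever sees a string: either the answer from the first successful attempt or the failure message once the retries run out.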

Usage Examples

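# Example instantiation; requires DASHSCOPE_API_KEY to be set, and 'example.jpg' is a placeholder path
from vlmeval.api.qwen_vl_api import Qwen2VLAPI
model = Qwen2VLAPI(model='qwen-vl-max-0809')
message = [dict(type='image', value='example.jpg'), dict(type='text', value='Describe this image.')]
response = model.generate(message)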
