Implementation:Open compass VLMEvalKit GLMVisionWrapper

From Leeroopedia
Field    Value
source   VLMEvalKit
domain   Vision, API_Integration

Overview

GLMVisionWrapper provides a VLMEvalKit API adapter for Zhipu AI GLM vision-language models.

Description

GLMVisionWrapper inherits from BaseAPI and uses the ZhipuAI Python SDK to communicate with Zhipu's chat completion API. It encodes images as base64 for transmission, appends dataset-specific prompts (e.g., yes/no guidance for HallusionBench and POPE), and lets the caller cap the response length through the max_tokens parameter (default 4096). Authentication uses the GLMV_API_KEY environment variable.
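The two preprocessing steps described above (base64 image encoding and dataset-specific prompt guidance) can be sketched as follows. This is a minimal illustration using only the standard library, not the wrapper's actual code: the helper names encode_image_bytes and build_prompt, the YES_NO_DATASETS set, and the guidance wording are all assumptions made for the example.

```python
import base64
from typing import Optional

# Datasets named on this page as receiving yes/no guidance.
# The matching logic and the set itself are illustrative assumptions.
YES_NO_DATASETS = {"HallusionBench", "POPE"}


def encode_image_bytes(data: bytes) -> str:
    """Return the base64 text form of raw image bytes."""
    return base64.b64encode(data).decode("utf-8")


def build_prompt(question: str, dataset: Optional[str] = None) -> str:
    """Append yes/no guidance for datasets that expect a binary answer."""
    if dataset in YES_NO_DATASETS:
        # Hypothetical wording; the wrapper's exact text is not given here.
        return question + " Please answer yes or no."
    return question


if __name__ == "__main__":
    # Any bytes work for the demo; a real call would read an image file.
    print(encode_image_bytes(b"abc"))
    print(build_prompt("Is there a cat in the image?", dataset="POPE"))
```

Keeping the prompt adjustment separate from the encoding step makes it easy to test the dataset-specific behaviour without touching any image data.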

Usage

Use this adapter when evaluating Zhipu GLM vision models through the ZhipuAI API (API keys are obtainable at bigmodel.cn).

Code Reference

  • Source: vlmeval/api/glm_vision.py, Lines: L1-77
  • Import: from vlmeval.api.glm_vision import GLMVisionWrapper

Signature:

class GLMVisionWrapper(BaseAPI):
    def __init__(self, model, retry=5, key=None, verbose=True,
                 system_prompt=None, max_tokens=4096, proxy=None,
                 **kwargs): ...
    def generate_inner(self, inputs, **kwargs): ...
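The key=None default in the signature, together with the GLMV_API_KEY environment variable, suggests a fallback lookup along the following lines. The helper resolve_api_key is hypothetical, and the precedence of an explicit key over the environment variable is an assumption rather than something this page states.

```python
import os
from typing import Optional


def resolve_api_key(key: Optional[str] = None,
                    env_var: str = "GLMV_API_KEY") -> str:
    """Return the explicit key if given, else fall back to the environment."""
    if key is None:
        key = os.environ.get(env_var)
    if not key:
        # Fail early with a clear message instead of at request time.
        raise ValueError(
            f"No API key found: pass key=... or set {env_var} "
            "(keys are obtainable at bigmodel.cn)."
        )
    return key
```

Failing at construction time rather than on the first request keeps a missing key from being mistaken for a model or network error during a long evaluation run.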

I/O Contract

Direction  Description
Inputs     message: text/image/video content list; model-specific params via kwargs
Outputs    generate() returns a str prediction; generate_inner() returns an (int, str, str) tuple
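The relationship between generate() and generate_inner() in the contract above can be sketched as a retry loop. This sketch assumes the tuple is (status code, answer, log message) with 0 meaning success, and that generate() retries up to retry times before returning a failure message; neither detail is stated on this page. The names generate_with_retry, make_flaky, and FAIL_MSG are hypothetical.

```python
from typing import Callable, Tuple

# Hypothetical failure text returned when every attempt fails.
FAIL_MSG = "Failed to obtain answer via API."


def generate_with_retry(
    generate_inner: Callable[[list], Tuple[int, str, str]],
    inputs: list,
    retry: int = 5,
) -> str:
    """Call generate_inner until it reports success or retries run out."""
    for _ in range(retry):
        ret_code, answer, log = generate_inner(inputs)
        # Assumed convention: 0 means success and the answer is non-empty.
        if ret_code == 0 and answer:
            return answer
    return FAIL_MSG


def make_flaky(fail_times: int) -> Callable[[list], Tuple[int, str, str]]:
    """Build a stand-in generate_inner that fails a fixed number of times."""
    calls = {"n": 0}

    def inner(inputs: list) -> Tuple[int, str, str]:
        calls["n"] += 1
        if calls["n"] <= fail_times:
            return -1, "", "simulated API error"
        return 0, "yes", "ok"

    return inner


if __name__ == "__main__":
    print(generate_with_retry(make_flaky(2), inputs=[], retry=5))
```

Returning a status code and log alongside the answer lets the caller decide whether to retry without parsing the answer text itself.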

Usage Examples

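# Example instantiation (requires GLMV_API_KEY to be set in the environment)
from vlmeval.api.glm_vision import GLMVisionWrapper
model = GLMVisionWrapper(model='glm-4v')
# Assumed message format: a list of dicts with 'type' and 'value'; the image path is a placeholder
message = [dict(type='image', value='example.jpg'), dict(type='text', value='Is there a cat in the image?')]
response = model.generate(message)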
