Implementation:InternLM Lmdeploy VisionConfig

Knowledge Sources	LMDeploy VLM Pipeline
Domains	Vision_Language_Models, Configuration
Last Updated	2026-02-07 15:00 GMT

Overview

A concrete configuration class, provided by the LMDeploy library, for setting vision model processing parameters in multimodal inference pipelines.

Description

The VisionConfig dataclass controls how images are processed in vision-language model inference. It is a lightweight configuration with two key parameters: max_batch_size for controlling image processing throughput and thread_safe for multi-threaded deployments.
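To illustrate what a per-batch image cap means in practice, the following is a minimal sketch, not LMDeploy's actual implementation. VisionConfigSketch is a local stand-in that mirrors only the two documented fields, and split_into_batches is a hypothetical helper introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, TypeVar

T = TypeVar("T")


@dataclass
class VisionConfigSketch:
    """Local stand-in mirroring the two documented fields (assumption, not the lmdeploy class)."""
    max_batch_size: int = 1
    thread_safe: bool = False


def split_into_batches(images: Sequence[T], config: VisionConfigSketch) -> List[List[T]]:
    """Hypothetical helper: group images so that no batch exceeds max_batch_size."""
    if config.max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    step = config.max_batch_size
    return [list(images[i:i + step]) for i in range(0, len(images), step)]


image_paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
batches = split_into_batches(image_paths, VisionConfigSketch(max_batch_size=2))
single_batches = split_into_batches(image_paths, VisionConfigSketch())
```

With the default max_batch_size of 1, each image lands in its own batch; raising the value groups more images per batch, which is the throughput lever the description refers to.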

Usage

Import VisionConfig when deploying a VLM if you need to control the image-processing batch size or enable thread-safe mode for multi-threaded serving.

Code Reference

Source Location

Repository: lmdeploy
File: lmdeploy/messages.py
Lines: L631-644

Signature

@dataclass
class VisionConfig:
    max_batch_size: int = 1       # Max images per processing batch
    thread_safe: bool = False     # Enable thread-safe mode

Import

from lmdeploy.messages import VisionConfig

I/O Contract

Inputs

Name	Type	Required	Description
max_batch_size	int	No	Maximum images per batch (default: 1)
thread_safe	bool	No	Thread-safe mode for multi-threaded usage (default: False)

Outputs

Name	Type	Description
VisionConfig	dataclass	Configuration instance for VLM pipeline initialization
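Because the configuration is a plain dataclass, the standard dataclasses utilities can inspect and copy it. The sketch below assumes a local stand-in, VisionConfigSketch, that mirrors the documented signature so it runs without LMDeploy installed; the same calls should apply to any ordinary dataclass.

```python
from dataclasses import asdict, dataclass, fields, replace


@dataclass
class VisionConfigSketch:
    """Local stand-in mirroring the documented signature (assumption, not the lmdeploy class)."""
    max_batch_size: int = 1       # Max images per processing batch
    thread_safe: bool = False     # Enable thread-safe mode


# Defaults match the I/O contract table above.
default_cfg = VisionConfigSketch()

# Derive a serving variant without mutating the default instance.
serving_cfg = replace(default_cfg, max_batch_size=8, thread_safe=True)

# Field names and values are available for logging or serialization.
field_names = [f.name for f in fields(VisionConfigSketch)]
default_values = asdict(default_cfg)
```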

Usage Examples

from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.messages import VisionConfig

# Configure for VLM with larger session for image tokens
backend_config = TurbomindEngineConfig(session_len=8192, tp=1)

# Allow up to 8 images per vision-processing batch (default is 1)
vision_config = VisionConfig(max_batch_size=8)

pipe = pipeline('OpenGVLab/InternVL2-8B',
                backend_config=backend_config,
                vision_config=vision_config)

Related Pages

Implements Principle

Principle:InternLM_Lmdeploy_VLM_Configuration
