Implementation:InternLM Lmdeploy VisionConfig
| Knowledge Sources | |
|---|---|
| Domains | Vision_Language_Models, Configuration |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
A concrete configuration class, provided by the LMDeploy library, for setting vision model processing parameters in multimodal inference pipelines.
Description
The VisionConfig dataclass controls how images are processed in vision-language model (VLM) inference. It is a lightweight configuration with two parameters: max_batch_size, which caps the number of images passed to the vision model per processing batch, and thread_safe, which enables thread-safe mode for multi-threaded deployments.
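As a rough illustration of what a per-batch image cap means, the sketch below splits a list of images into batches no larger than max_batch_size. The VisionConfigSketch class is a local stand-in that mirrors the two documented fields, and chunk_images is a hypothetical helper; neither is part of LMDeploy, and LMDeploy's internal batching may differ.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class VisionConfigSketch:
    """Local stand-in mirroring the two documented fields (not the LMDeploy class)."""
    max_batch_size: int = 1
    thread_safe: bool = False


def chunk_images(images: Sequence[str], config: VisionConfigSketch) -> List[List[str]]:
    """Hypothetical helper: split images into batches of at most max_batch_size."""
    if config.max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    step = config.max_batch_size
    return [list(images[i:i + step]) for i in range(0, len(images), step)]


batches = chunk_images(["a.jpg", "b.jpg", "c.jpg"], VisionConfigSketch(max_batch_size=2))
print(batches)  # [['a.jpg', 'b.jpg'], ['c.jpg']]
```

With the default max_batch_size of 1, the same helper would yield one image per batch, which is why raising the value can help throughput when requests carry many images.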
Usage
Import VisionConfig when you deploy a VLM and need to control the image processing batch size or enable thread-safe mode for multi-threaded serving.
Code Reference
Source Location
- Repository: lmdeploy
- File: lmdeploy/messages.py
- Lines: L631-644
Signature
@dataclass
class VisionConfig:
    max_batch_size: int = 1  # Max images per processing batch
    thread_safe: bool = False  # Enable thread-safe mode
Import
from lmdeploy.messages import VisionConfig
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| max_batch_size | int | No | Maximum images per batch (default: 1) |
| thread_safe | bool | No | Thread-safe mode for multi-threaded usage (default: False) |
Outputs
| Name | Type | Description |
|---|---|---|
| VisionConfig | dataclass | Configuration instance for VLM pipeline initialization |
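The contract above can be checked with a small sketch: because both fields carry defaults, the constructor needs no arguments, and a single field can be overridden while the other keeps its default. VisionConfigSketch is again a local stand-in with the documented fields and defaults, used here so the example runs without LMDeploy installed; it is not the library class.

```python
from dataclasses import dataclass, fields, replace


@dataclass
class VisionConfigSketch:
    """Local stand-in with the two documented fields (not the LMDeploy class)."""
    max_batch_size: int = 1
    thread_safe: bool = False


# Every field has a default, so the constructor takes no required arguments.
default_config = VisionConfigSketch()
defaults = {f.name: f.default for f in fields(VisionConfigSketch)}

# Override one field; the other keeps its default.
serving_config = replace(default_config, thread_safe=True)

print(defaults)
print(serving_config)
```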
Usage Examples
from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.messages import VisionConfig

# Use a larger session length to leave room for image tokens
backend_config = TurbomindEngineConfig(session_len=8192, tp=1)
# Let the vision model process up to 8 images per batch
vision_config = VisionConfig(max_batch_size=8)
pipe = pipeline('OpenGVLab/InternVL2-8B',
                backend_config=backend_config,
                vision_config=vision_config)