Principle: InternLM LMDeploy Backend Auto Selection
| Knowledge Sources | |
|---|---|
| Domains | LLM_Inference, Architecture_Detection |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
An automatic detection mechanism that selects the optimal inference backend (TurboMind or PyTorch) based on model architecture, quantization format, and hardware constraints.
Description
Backend Auto Selection solves the problem of routing models to the correct inference engine without requiring users to know the internal capabilities of each backend. The decision logic considers:
- Model architecture: TurboMind supports a curated list of architectures (LLaMA, InternLM, Qwen, Mistral, etc.); unsupported models fall back to PyTorch
- Quantization format: AWQ/GPTQ models use TurboMind; SmoothQuant models require PyTorch
- Hardware platform: Non-CUDA platforms (Ascend, Cambricon) must use PyTorch
- Vision-language models: VLMs are detected via architecture class names and use VLAsyncEngine
The system reads the model's HuggingFace config to extract the architecture class, then looks up a mapping to determine backend support.
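A minimal sketch of that config-reading step, assuming a HuggingFace-style `config.json` on disk (the helper name is hypothetical, not LMDeploy's actual internals):

```python
import json
import os

def read_architecture(model_dir: str) -> str:
    """Return the first architecture class name (e.g. 'LlamaForCausalLM')
    from a HuggingFace-style config.json.

    Illustrative helper only; LMDeploy's real implementation differs.
    """
    with open(os.path.join(model_dir, 'config.json')) as f:
        cfg = json.load(f)
    # 'architectures' is a list; the first entry names the model class
    return cfg.get('architectures', ['unknown'])[0]
```

The returned class name (for example `InternLM2ForCausalLM`) is then matched against the table of TurboMind-supported architectures.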
Usage
Backend selection runs automatically during pipeline initialization. To override it, pass a backend_config of the desired type (TurbomindEngineConfig or PytorchEngineConfig) explicitly.
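For example, forcing the PyTorch engine looks roughly like this (the model name is illustrative; requires lmdeploy to be installed):

```python
from lmdeploy import pipeline, PytorchEngineConfig

# Passing an explicit engine config bypasses auto-selection entirely;
# here the PyTorch engine is used even if TurboMind supports the model.
pipe = pipeline('internlm/internlm2-chat-7b',
                backend_config=PytorchEngineConfig())
```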
Theoretical Basis
Backend selection uses a Strategy Pattern with architecture-based dispatch:
```python
# Abstract selection algorithm (simplified)
def select_backend(model_config, user_config):
    arch = model_config.architectures[0]
    # An explicit engine config always overrides auto-detection
    if isinstance(user_config, TurbomindEngineConfig):
        return 'turbomind'
    if isinstance(user_config, PytorchEngineConfig):
        return 'pytorch'
    if arch in TURBOMIND_SUPPORTED:
        return 'turbomind'
    return 'pytorch'  # fallback for unsupported architectures
```
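The abstract dispatch above omits the quantization and hardware rules listed in the Description. A fuller sketch that folds them in (all names are stand-ins for illustration, not LMDeploy's actual internals):

```python
# Illustrative supported-architecture set; the real list is longer.
TURBOMIND_SUPPORTED = {'LlamaForCausalLM', 'InternLM2ForCausalLM',
                       'Qwen2ForCausalLM', 'MistralForCausalLM'}

def select_backend_full(arch, quant_method=None, device='cuda'):
    """Hypothetical sketch combining the architecture, quantization,
    and hardware rules from the Description."""
    if device != 'cuda':
        # Non-CUDA platforms (Ascend, Cambricon, ...) must use PyTorch
        return 'pytorch'
    if quant_method == 'smooth_quant':
        # SmoothQuant models require the PyTorch engine
        return 'pytorch'
    # AWQ/GPTQ and unquantized models use TurboMind when the
    # architecture is supported; otherwise fall back to PyTorch
    return 'turbomind' if arch in TURBOMIND_SUPPORTED else 'pytorch'
```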