Implementation:InternLM Lmdeploy Autotest Config
| Knowledge Sources | |
|---|---|
| Domains | Testing, Configuration, CI |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
A YAML configuration file that defines the complete test matrix for lmdeploy automated testing, specifying model paths, tensor parallelism settings, backend-specific model lists, quantization exclusions, and benchmark/evaluation model selections.
Description
The autotest/config.yml file serves as the central configuration for the lmdeploy autotest framework. It organizes the test matrix across multiple dimensions:
- Global paths: Defines paths for models (/nvme/qa_test_models), resources, logs, server logs, evaluation reports, benchmark reports, and datasets (ShareGPT).
- Tensor parallelism (TP) config: Maps model identifiers to their required TP values (e.g., Qwen3-235B-A22B requires TP=8, InternVL3-38B requires TP=2).
- TurboMind chat models: Lists all models tested with the TurboMind backend, including Llama, InternLM, InternVL, Qwen, Mistral, DeepSeek, CodeLlama, GLM, MiniCPM, and others.
- PyTorch chat models: Lists all models tested with the PyTorch backend, including additional models like Gemma, Phi, and deepseek-moe variants.
- VL (Vision-Language) models: Separate lists for TurboMind and PyTorch backends covering multimodal models (InternVL, Qwen-VL, LLaVA, CogVLM, MiniCPM-V, Phi-3-vision).
- Base models: Lists for completion-only models without chat templates.
- Quantization config: Specifies per-method model lists for quantization tests:
  - no_awq: Models that cannot be AWQ-quantized
  - gptq: Models tested with GPTQ
  - no_kvint4/no_kvint8: Models excluded from KV-cache quantization
- Benchmark models: Models selected for performance benchmarking.
- Evaluate models: Models selected for accuracy evaluation with OpenCompass.
- MLLM evaluate models: Multimodal models selected for vision-language evaluation.
The configuration targets the A100 environment (env_tag: a100).
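The TP mapping and the quantization lists described above can be read with a couple of small helpers. The following is a minimal sketch, not the framework's own code: the dictionary stands in for what yaml.safe_load would return for config.yml, the helper names get_tp and awq_candidates are hypothetical, and the membership of no_awq is invented purely for illustration.

```python
# Hypothetical excerpt shaped like the parsed config.yml.
# The no_awq entry below is illustrative, not taken from the real file.
config = {
    "config": {"tp": {"Qwen/Qwen3-235B-A22B": 8, "internlm/Intern-S1": 8}},
    "turbomind_chat_model": {
        "tp": [
            "internlm/internlm3-8b-instruct",
            "Qwen/Qwen3-235B-A22B",
        ]
    },
    "turbomind_quantization": {
        "no_awq": ["Qwen/Qwen3-235B-A22B"],
        "gptq": [],
        "no_kvint4": [],
    },
}


def get_tp(config, model, default=1):
    """Return the TP value for a model, assuming unlisted models run with TP=1."""
    return config.get("config", {}).get("tp", {}).get(model, default)


def awq_candidates(config):
    """Return TurboMind chat models that are not listed under no_awq."""
    excluded = set(config["turbomind_quantization"].get("no_awq") or [])
    return [m for m in config["turbomind_chat_model"]["tp"] if m not in excluded]


print(get_tp(config, "Qwen/Qwen3-235B-A22B"))
print(awq_candidates(config))
```

Treating a missing TP entry as 1 is an assumption of this sketch; the mapping itself only lists models that need more than the default.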
Usage
The file is consumed by the pytest-based autotest framework, which dynamically generates test cases from model-backend-quantization combinations.
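One way such a test matrix could be expanded is sketched below for the TurboMind backend only. This is an illustration under stated assumptions rather than the framework's actual logic: build_turbomind_cases is a hypothetical helper, the quantization labels ("none", "awq", "kvint4", "kvint8") are made up for the example, and the sample dictionary (including its no_awq entry) is invented.

```python
# Hypothetical parsed config; the real file is much larger.
sample_config = {
    "config": {"tp": {"Qwen/Qwen3-235B-A22B": 8}},
    "turbomind_chat_model": {
        "tp": [
            "internlm/internlm3-8b-instruct",
            "Qwen/Qwen3-235B-A22B",
        ]
    },
    "turbomind_quantization": {
        "no_awq": ["Qwen/Qwen3-235B-A22B"],  # illustrative entry
        "no_kvint4": [],
    },
}


def build_turbomind_cases(config):
    """Expand the model list into (model, quant, tp) tuples, honoring exclusions."""
    quant_cfg = config.get("turbomind_quantization", {})
    # Map each hypothetical quantization label to the set of models to skip.
    skip = {
        "awq": set(quant_cfg.get("no_awq") or []),
        "kvint4": set(quant_cfg.get("no_kvint4") or []),
        "kvint8": set(quant_cfg.get("no_kvint8") or []),
    }
    tp_map = config.get("config", {}).get("tp", {})

    cases = []
    for model in config["turbomind_chat_model"]["tp"]:
        tp = tp_map.get(model, 1)  # assume TP=1 when the model is not listed
        cases.append((model, "none", tp))  # unquantized baseline
        for quant, excluded in skip.items():
            if model not in excluded:
                cases.append((model, quant, tp))
    return cases


cases = build_turbomind_cases(sample_config)
for case in cases:
    print(case)
```

A list like this could be handed to pytest.mark.parametrize("model,quant,tp", cases) so that each tuple becomes its own test case.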
Code Reference
Source Location
- Repository: InternLM_Lmdeploy
- File: autotest/config.yml
- Lines: 1-435
Signature
model_path: /nvme/qa_test_models
resource_path: /nvme/qa_test_models/resource
log_path: /nvme/qa_test_models/autotest_log
env_tag: a100
device: cuda

config:
    tp:
        meta-llama/Meta-Llama-3-1-70B-Instruct: 4
        internlm/Intern-S1: 8
        Qwen/Qwen3-235B-A22B: 8
        # ...

turbomind_chat_model:
    tp:
        - meta-llama/Meta-Llama-3-1-8B-Instruct
        - internlm/internlm3-8b-instruct
        # ...

pytorch_chat_model:
    tp:
        - meta-llama/Llama-4-Scout-17B-16E-Instruct
        # ...

turbomind_quantization:
    no_awq: [...]
    gptq: [...]
    no_kvint4: [...]
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| YAML config file | file | Yes | The config.yml file itself, loaded by the autotest framework |
Outputs
| Name | Type | Description |
|---|---|---|
| Test matrix | dict | Parsed configuration consumed by pytest fixtures to generate test cases |
| Model lists | lists | Per-backend lists of models to test |
| TP mappings | dict | Tensor parallelism requirements per model |
| Quantization exclusions | dict | Models to skip for specific quantization methods |
Usage Examples
import yaml

# Load the autotest configuration (path is relative to the repository root)
with open('autotest/config.yml', 'r') as f:
    config = yaml.safe_load(f)

# Get all turbomind chat models
turbomind_models = config['turbomind_chat_model']['tp']

# Get TP setting for a specific model (falls back to 1 if the model is not listed)
tp = config['config']['tp'].get('Qwen/Qwen3-235B-A22B', 1)

# Get models excluded from AWQ quantization
no_awq = config['turbomind_quantization']['no_awq']