
Implementation:InternLM Lmdeploy Autotest Config

From Leeroopedia


Knowledge Sources
Domains: Testing, Configuration, CI
Last Updated: 2026-02-07 15:00 GMT

Overview

A YAML configuration file that defines the complete test matrix for lmdeploy automated testing, specifying model paths, tensor parallelism settings, backend-specific model lists, quantization exclusions, and benchmark/evaluation model selections.

Description

The autotest/config.yml file serves as the central configuration for the lmdeploy autotest framework. It organizes the test matrix across multiple dimensions:

  • Global paths: Defines paths for models (/nvme/qa_test_models), resources, logs, server logs, evaluation reports, benchmark reports, and datasets (ShareGPT).
  • Tensor parallelism (TP) config: Maps model identifiers to their required TP values (e.g., Qwen3-235B-A22B requires TP=8, InternVL3-38B requires TP=2).
  • TurboMind chat models: Lists all models tested with the TurboMind backend, including Llama, InternLM, InternVL, Qwen, Mistral, DeepSeek, CodeLlama, GLM, MiniCPM, and others.
  • PyTorch chat models: Lists all models tested with the PyTorch backend, including additional models like Gemma, Phi, and deepseek-moe variants.
  • VL (Vision-Language) models: Separate lists for TurboMind and PyTorch backends covering multimodal models (InternVL, Qwen-VL, LLaVA, CogVLM, MiniCPM-V, Phi-3-vision).
  • Base models: Lists for completion-only models without chat templates.
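  • Quantization config: Specifies inclusion and exclusion lists for quantization tests:
    • no_awq: Models that cannot be AWQ-quantized and are skipped in AWQ tests
    • gptq: Models tested with GPTQ (an inclusion list, unlike the others)
    • no_kvint4/no_kvint8: Models excluded from 4-bit and 8-bit KV-cache quantization tests, respectively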
  • Benchmark models: Models selected for performance benchmarking.
  • Evaluate models: Models selected for accuracy evaluation with OpenCompass.
  • MLLM evaluate models: Multimodal models selected for vision-language evaluation.
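The exclusion-list semantics described above can be illustrated with a minimal sketch. It assumes the parsed configuration is a plain dict shaped like the Signature section; the helper name filter_excluded and the example/... model names are hypothetical, since the real list contents are elided in the source.

```python
from typing import Iterable, List


def filter_excluded(models: Iterable[str], excluded: Iterable[str]) -> List[str]:
    """Return the models that are not in the exclusion list, preserving order."""
    blocked = set(excluded)
    return [m for m in models if m not in blocked]


# Illustrative data only: model names are placeholders, not entries
# from the real autotest/config.yml.
config = {
    "turbomind_chat_model": {
        "tp": ["example/model-a", "example/model-b", "example/model-c"],
    },
    "turbomind_quantization": {
        "no_awq": ["example/model-b"],  # hypothetical exclusion
    },
}

# Models that would remain candidates for AWQ tests under this reading.
awq_candidates = filter_excluded(
    config["turbomind_chat_model"]["tp"],
    config["turbomind_quantization"]["no_awq"],
)
print(awq_candidates)
```

Whether the framework applies the lists in exactly this way is an assumption; the sketch only shows how a "no_" list narrows a backend's model list.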

The configuration targets the A100 environment (env_tag: a100).

Usage

The file is consumed by the pytest-based autotest framework, which dynamically generates test cases from model-backend-quantization combinations.
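As a rough sketch of how such combinations might be enumerated, the function below expands the two chat-model lists into (backend, model, tp) tuples. The function name build_chat_matrix, the backend labels, and the example/... model names are assumptions made for illustration; the framework's actual generation code is not shown on this page. The default TP of 1 mirrors the lookup in the Usage Examples section.

```python
from typing import Any, Dict, List, Tuple

# Backend label -> top-level config key (both keys appear in the Signature section).
BACKEND_KEYS = {
    "turbomind": "turbomind_chat_model",
    "pytorch": "pytorch_chat_model",
}


def build_chat_matrix(config: Dict[str, Any]) -> List[Tuple[str, str, int]]:
    """Expand the per-backend chat model lists into (backend, model, tp) tuples."""
    tp_map = (config.get("config") or {}).get("tp") or {}
    matrix = []
    for backend, key in BACKEND_KEYS.items():
        for model in (config.get(key) or {}).get("tp") or []:
            # Models without an explicit TP entry fall back to TP=1.
            matrix.append((backend, model, tp_map.get(model, 1)))
    return matrix


# Placeholder data; real model lists and TP values live in autotest/config.yml.
sample = {
    "config": {"tp": {"example/large-model": 8}},
    "turbomind_chat_model": {"tp": ["example/small-model", "example/large-model"]},
    "pytorch_chat_model": {"tp": ["example/large-model"]},
}

for case in build_chat_matrix(sample):
    print(case)
```

A list like this could be passed to pytest.mark.parametrize so that each tuple becomes one test case, though the real fixtures may structure the matrix differently.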

Code Reference

Source Location

Signature

model_path: /nvme/qa_test_models
resource_path: /nvme/qa_test_models/resource
log_path: /nvme/qa_test_models/autotest_log
env_tag: a100
device: cuda

config:
    tp:
        meta-llama/Meta-Llama-3-1-70B-Instruct: 4
        internlm/Intern-S1: 8
        Qwen/Qwen3-235B-A22B: 8
        # ...

turbomind_chat_model:
    tp:
        - meta-llama/Meta-Llama-3-1-8B-Instruct
        - internlm/internlm3-8b-instruct
        # ...

pytorch_chat_model:
    tp:
        - meta-llama/Llama-4-Scout-17B-16E-Instruct
        # ...

turbomind_quantization:
    no_awq: [...]
    gptq: [...]
    no_kvint4: [...]

I/O Contract

Inputs

Name | Type | Required | Description
YAML config file | file | Yes | The config.yml file itself, loaded by the autotest framework

Outputs

Name | Type | Description
Test matrix | dict | Parsed configuration consumed by pytest fixtures to generate test cases
Model lists | lists | Per-backend lists of models to test
TP mappings | dict | Tensor parallelism requirements per model
Quantization exclusions | dict | Models to skip for specific quantization methods
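The output types listed above suggest a simple shape check that could run before test generation. The following is a minimal sketch under that assumption; check_config_shape is a hypothetical helper, not part of lmdeploy, and it covers only the keys shown in the Signature section.

```python
from typing import Any, Dict, List


def check_config_shape(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the shape looks right."""
    problems = []

    section = config.get("config")
    tp_map = section.get("tp") if isinstance(section, dict) else None
    if not isinstance(tp_map, dict):
        problems.append("config.tp should map model names to TP values")
    else:
        for model, tp in tp_map.items():
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(tp, bool) or not isinstance(tp, int) or tp < 1:
                problems.append(f"config.tp[{model!r}] should be a positive integer")

    for key in ("turbomind_chat_model", "pytorch_chat_model"):
        section = config.get(key)
        models = section.get("tp") if isinstance(section, dict) else None
        if not isinstance(models, list):
            problems.append(f"{key}.tp should be a list of model names")

    if not isinstance(config.get("turbomind_quantization"), dict):
        problems.append("turbomind_quantization should be a mapping of list names to model lists")

    return problems


# Values taken from the Signature section; the empty lists are placeholders
# for contents that are elided there.
good = {
    "config": {"tp": {"Qwen/Qwen3-235B-A22B": 8}},
    "turbomind_chat_model": {"tp": ["internlm/internlm3-8b-instruct"]},
    "pytorch_chat_model": {"tp": ["meta-llama/Llama-4-Scout-17B-16E-Instruct"]},
    "turbomind_quantization": {"no_awq": [], "gptq": [], "no_kvint4": []},
}
print(check_config_shape(good))
```

Returning a list of problems rather than raising on the first one makes it easier to report every malformed section in a single CI run.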

Usage Examples

import yaml

with open('autotest/config.yml', 'r') as f:
    config = yaml.safe_load(f)

# Get all turbomind chat models
turbomind_models = config['turbomind_chat_model']['tp']

# Get TP setting for a specific model
tp = config['config']['tp'].get('Qwen/Qwen3-235B-A22B', 1)

# Get models excluded from AWQ quantization
no_awq = config['turbomind_quantization']['no_awq']
