Implementation:InternLM Lmdeploy Daily Ete Test H800
| Knowledge Sources | |
|---|---|
| Domains | CI, Testing, GPU |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
A GitHub Actions workflow that runs daily end-to-end tests for lmdeploy on NVIDIA H800 GPUs, targeting large-scale model testing with multi-GPU configurations.
Description
The daily_ete_test_h800.yml workflow is a hardware-specific variant designed for H800 GPUs. It runs on self-hosted runners labeled h800-r1 and uses the Docker image m.daocloud.io/docker.io/openmmlab/lmdeploy:latest-cu12.8 (pulled from a Chinese mirror registry).
The workflow includes:
- linux-build: Builds the lmdeploy wheel on ubuntu-latest with CUDA 12.8.
- download_pkgs: Downloads and stages build artifacts on the H800 runner. Mounts multiple NVMe volumes for model storage.
- test_tools: Matrix job covering backends x models x functions. Runs multi-GPU tests including 1, 2, 4, and 8 GPU configurations with the "not other" marker filter. This is the only daily test variant that routinely exercises 8-GPU parallelism.
- test_restful: Tests the RESTful API with the Intern-S1 model at TP=8 (8 GPUs), validating large model serving with a 15-minute startup wait.
- get_coverage_report: Combines coverage data from all test jobs.
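The "backends x models x functions" matrix used by test_tools can be illustrated with a small Python sketch. It assumes the list-valued inputs are parsed as Python-style list literals (the defaults mix single and double quotes, so `ast.literal_eval` is used here instead of a strict JSON parser) and that the matrix is a plain Cartesian product. The real workflow may add include/exclude rules that are not shown in this page, and the helper names below are hypothetical.

```python
import ast
from itertools import product


def parse_list_input(raw: str) -> list[str]:
    """Parse a list-valued workflow input such as "['llm','mllm']".

    Assumption: the value is a list literal of strings in either quote style.
    """
    value = ast.literal_eval(raw)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError(f"expected a list of strings, got: {raw!r}")
    return value


def expand_matrix(backend: str, model: str, function: str) -> list[dict[str, str]]:
    """Return every backend/model/function combination as a dict."""
    return [
        {"backend": b, "model": m, "function": f}
        for b, m, f in product(
            parse_list_input(backend),
            parse_list_input(model),
            parse_list_input(function),
        )
    ]


# Defaults copied from the workflow_dispatch inputs in the Signature section.
combos = expand_matrix(
    "['turbomind', 'pytorch']",
    "['llm','mllm']",
    '["pipeline", "restful", "chat"]',
)
print(len(combos))  # 2 backends * 2 model types * 3 functions = 12
```

With the default inputs this yields 12 combinations; narrowing any one input shrinks the matrix proportionally.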
Key differences from other variants: no quantization test job, uses Chinese Docker registry mirror, tests large models requiring 8 GPUs (e.g., Intern-S1), and the default regression functions are limited to ['tools','restful'].
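The 15-minute startup wait in test_restful amounts to polling until the server is ready or a deadline passes. The following is only a sketch of that pattern, not the workflow's actual wait logic: the `probe` callable, the 5-second interval, and the function name are assumptions made for illustration.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 15 * 60,   # 15 minutes, matching the wait described above
    interval_s: float = 5.0,      # hypothetical polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() repeatedly until it returns True or the timeout elapses.

    Returns True if the probe succeeded, False if the deadline was reached.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

In practice `probe` would be something like an HTTP request against the served model's endpoint; that part is left out because the endpoint and port are not stated here.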
Usage
Triggered automatically at 14:00 UTC, Sunday through Thursday (cron '00 14 * * 0-4'), or manually via GitHub Actions workflow_dispatch. Primarily used to validate large-model support on high-end GPU hardware.
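Cron's day-of-week field counts from Sunday as 0, so the 0-4 range in the schedule '00 14 * * 0-4' covers Sunday through Thursday. The sketch below checks whether a given moment matches that expression; it hard-codes this one schedule instead of implementing a general cron parser, and the function name is hypothetical.

```python
from datetime import datetime, timezone

# Cron numbering: 0 = Sunday ... 6 = Saturday.
CRON_DOW_NAMES = [
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday",
]


def matches_schedule(moment: datetime) -> bool:
    """Return True if a timezone-aware moment matches cron '00 14 * * 0-4' in UTC."""
    utc = moment.astimezone(timezone.utc)
    # datetime.weekday() uses Monday = 0, so shift to cron's Sunday = 0.
    cron_dow = (utc.weekday() + 1) % 7
    return utc.minute == 0 and utc.hour == 14 and 0 <= cron_dow <= 4


print([CRON_DOW_NAMES[d] for d in range(0, 5)])
# ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday']
```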
Code Reference
Source Location
- Repository: InternLM_Lmdeploy
- File: .github/workflows/daily_ete_test_h800.yml
- Lines: 1-355
Signature
name: daily_ete_test_h800
on:
  workflow_dispatch:
    inputs:
      repo_org: { type: string, default: 'InternLM/lmdeploy' }
      repo_ref: { type: string, default: 'main' }
      backend: { type: string, default: "['turbomind', 'pytorch']" }
      model: { type: string, default: "['llm','mllm']" }
      function: { type: string, default: '["pipeline", "restful", "chat"]' }
      offline_mode: { type: boolean, default: false }
      regression_func: { type: string, default: "['tools','restful']" }
  schedule:
    - cron: '00 14 * * 0-4'
jobs:
  linux-build: ...
  download_pkgs: ...
  test_tools: ...
  test_restful: ...
  get_coverage_report: ...
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| repo_org | string | No | Repository organization/name (default: InternLM/lmdeploy) |
| repo_ref | string | No | Branch, tag, or commit ID (default: main) |
| backend | string | Yes | JSON list of backends to test |
| model | string | Yes | JSON list of model types: llm, mllm |
| function | string | Yes | JSON list of test functions |
| offline_mode | boolean | Yes | Whether to use pre-prepared offline packages |
| regression_func | string | Yes | JSON list: tools, restful |
Outputs
| Name | Type | Description |
|---|---|---|
| Build artifacts | wheel file | lmdeploy wheel package for CUDA 12.8 |
| Test reports | Allure reports | Test results on H800 hardware |
| Coverage report | XML/text | Combined code coverage report |
Usage Examples
# Manual trigger to test large model restful API on H800:
# Go to Actions > daily_ete_test_h800 > Run workflow
# Set regression_func: "['restful']"
# This will test Intern-S1 at TP=8 on 8 H800 GPUs
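The same manual trigger can also be sketched programmatically against GitHub's REST endpoint for creating a workflow dispatch event. The example below only builds the request and does not send it. The token is a placeholder, the function name is hypothetical, and it assumes the workflow lives in InternLM/lmdeploy on the main branch. Note that `ref` selects the branch the workflow file is read from, which is separate from the repo_ref input.

```python
import json
import urllib.request


def build_dispatch_request(
    token: str,
    ref: str = "main",
    regression_func: str = "['restful']",
    repo: str = "InternLM/lmdeploy",                 # assumed repository
    workflow_file: str = "daily_ete_test_h800.yml",
) -> urllib.request.Request:
    """Build (but do not send) a workflow_dispatch request for the H800 workflow."""
    url = (
        f"https://api.github.com/repos/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    # Inputs that are omitted fall back to the defaults declared in the workflow.
    body = {"ref": ref, "inputs": {"regression_func": regression_func}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


req = build_dispatch_request(token="<your-token>")
print(req.get_method(), req.full_url)
# To actually trigger the run: urllib.request.urlopen(req)
```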