Implementation:InternLM Lmdeploy Daily Ete Test 5080
| Knowledge Sources | |
|---|---|
| Domains | CI, Testing, GPU |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
A GitHub Actions workflow that runs daily end-to-end tests for lmdeploy on NVIDIA RTX 5080 GPUs, covering quantization, tool functionality, and RESTful API interfaces.
Description
The daily_ete_test_5080.yml workflow is a hardware-specific variant targeting RTX 5080 GPUs. It runs on self-hosted runners labeled 5080-r1 and uses the openmmlab/lmdeploy:latest-cu12.8 Docker image.
The workflow includes:
- linux-build: Builds the lmdeploy wheel on ubuntu-latest with CUDA 12.8.
- download_pkgs: Downloads and stages build artifacts on the 5080 runner.
- test_quantization: Tests AWQ and W8A8 quantization. Includes a CUDA availability retry loop (up to 10 attempts) to handle GPU initialization issues.
- test_tools: Matrix job for backends x models x functions with single-GPU tests (gpu_num_1 and test_3090 markers).
- test_restful: Tests RESTful API with Llama-3.2-3B-Instruct and Qwen3-4B models at TP=1, including log probability mode support.
- get_coverage_report: Combines coverage data from all test jobs.
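The "backends x models x functions" matrix used by test_tools can be pictured as a Cartesian product over the three list-valued inputs. The sketch below is illustrative only: it uses the default strings from the workflow_dispatch inputs, and the helper name `expand_matrix` is hypothetical. The real workflow may add include/exclude rules that change the final set of jobs.

```python
import ast
from itertools import product

# Default input strings copied from the workflow_dispatch inputs.
DEFAULTS = {
    "backend": "['turbomind', 'pytorch']",
    "model": "['llm','mllm']",
    "function": '["pipeline", "restful", "chat"]',
}

def expand_matrix(inputs):
    """Return one dict per (backend, model, function) combination.

    Hypothetical helper for illustration; GitHub Actions performs the
    expansion itself when it evaluates the job's matrix.
    """
    # literal_eval accepts both single- and double-quoted list literals.
    axes = {name: ast.literal_eval(raw) for name, raw in inputs.items()}
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*(axes[n] for n in names))]

combos = expand_matrix(DEFAULTS)
print(len(combos))  # 2 backends * 2 models * 3 functions = 12
```

Narrowing any one input (for example, backend set to "['pytorch']") shrinks the product accordingly, which is the main lever for shortening a manual run.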
A notable feature of this workflow is the CUDA availability check with retry logic in the Check env step, which polls lmdeploy check_env up to 10 times before proceeding, addressing potential GPU initialization delays on 5080 hardware.
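The retry behaviour described above can be sketched as a bounded polling loop. This is not the workflow's actual step, which is presumably a shell script. The function names, the 5-second delay, and the use of a zero exit code from `lmdeploy check_env` as the success criterion are all assumptions made for illustration.

```python
import subprocess
import time

def cuda_ready():
    """Hypothetical probe: True if `lmdeploy check_env` exits with status 0.

    The real step may inspect the command's output instead; that detail
    is not stated in this page.
    """
    try:
        result = subprocess.run(
            ["lmdeploy", "check_env"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return False
    return result.returncode == 0

def wait_until_ready(probe, attempts=10, delay_s=5.0, sleep=time.sleep):
    """Call probe() up to `attempts` times; return the attempt that succeeded."""
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay_s)  # assumed pause between polls
    raise RuntimeError(f"environment not ready after {attempts} attempts")

# Example (requires lmdeploy on PATH): wait_until_ready(cuda_ready)
```

Passing the probe in as a callable keeps the loop testable without a GPU, and raising after the last attempt makes the job fail fast rather than running tests against an uninitialized device.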
Usage
Triggered automatically at 14:00 UTC, Sunday through Thursday (cron '00 14 * * 0-4', where day 0 is Sunday), or manually via GitHub Actions dispatch. Default regression functions include quant, tools, and restful.
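To make the schedule concrete, the cron expression from the workflow signature ('00 14 * * 0-4') can be decoded by hand. The snippet below is a minimal sketch that handles only plain numbers, ranges, commas, and `*` in the day-of-week field; `expand_day_field` is a hypothetical helper, not part of the workflow.

```python
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]

def expand_day_field(field):
    """Expand a cron day-of-week field (0 = Sunday) into sorted day numbers."""
    days = set()
    for part in field.split(","):
        if part == "*":
            days.update(range(7))
        elif "-" in part:
            start, end = (int(x) for x in part.split("-"))
            days.update(range(start, end + 1))
        else:
            days.add(int(part))
    return sorted(days)

cron = "00 14 * * 0-4"
minute, hour, day_of_month, month, day_of_week = cron.split()
run_days = [DAY_NAMES[d] for d in expand_day_field(day_of_week)]
run_time = (int(hour), int(minute))

print(run_days)  # ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday']
print(run_time)  # (14, 0), evaluated in UTC by GitHub Actions
```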
Code Reference
Source Location
- Repository: InternLM_Lmdeploy
- File: .github/workflows/daily_ete_test_5080.yml
- Lines: 1-454
Signature
name: daily_ete_test_5080
on:
  workflow_dispatch:
    inputs:
      repo_org: { type: string, default: 'InternLM/lmdeploy' }
      repo_ref: { type: string, default: 'main' }
      backend: { type: string, default: "['turbomind', 'pytorch']" }
      model: { type: string, default: "['llm','mllm']" }
      function: { type: string, default: '["pipeline", "restful", "chat"]' }
      offline_mode: { type: boolean, default: false }
      regression_func: { type: string, default: "['quant', 'tools', 'restful']" }
  schedule:
    - cron: '00 14 * * 0-4'
jobs:
  linux-build: ...
  download_pkgs: ...
  test_quantization: ...
  test_tools: ...
  test_restful: ...
  get_coverage_report: ...
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| repo_org | string | No | Repository organization/name (default: InternLM/lmdeploy) |
| repo_ref | string | No | Branch, tag, or commit ID (default: main) |
| backend | string | Yes | JSON list of backends to test |
| model | string | Yes | JSON list of model types: llm, mllm |
| function | string | Yes | JSON list of test functions |
| offline_mode | boolean | Yes | Whether to use pre-prepared offline packages |
| regression_func | string | Yes | JSON list: quant, tools, restful |
Outputs
| Name | Type | Description |
|---|---|---|
| Build artifacts | wheel file | lmdeploy wheel package for CUDA 12.8 |
| Test reports | Allure reports | Test results stored in REPORT_DIR |
| Coverage report | XML/text | Combined code coverage report |
Usage Examples
# Manual trigger to test only pytorch backend on 5080:
# Go to Actions > daily_ete_test_5080 > Run workflow
# Set backend: "['pytorch']"
# Set regression_func: "['tools', 'restful']"
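The same manual trigger can in principle be issued through GitHub's REST endpoint for workflow dispatch events instead of the web UI. The sketch below only builds the request URL and JSON body and does not send anything. The owner/repo split, the `main` ref, and the helper name `build_dispatch_request` are assumptions, and an authenticated POST with a suitably scoped token would still be required.

```python
import json

def build_dispatch_request(owner, repo, workflow_file, ref, inputs):
    """Return (url, json_body) for a workflow_dispatch request.

    Hypothetical helper; it performs no network I/O.
    """
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    # Inputs omitted here fall back to the defaults declared in the workflow.
    body = json.dumps({"ref": ref, "inputs": inputs})
    return url, body

url, body = build_dispatch_request(
    owner="InternLM",            # assumed from the repo_org default
    repo="lmdeploy",
    workflow_file="daily_ete_test_5080.yml",
    ref="main",                  # assumed branch holding the workflow file
    inputs={
        "backend": "['pytorch']",
        "regression_func": "['tools', 'restful']",
    },
)
print(url)
print(body)
```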