Implementation:InternLM Lmdeploy Daily Ete Test 3090

Knowledge Sources	InternLM_Lmdeploy
Domains	CI, Testing, GPU
Last Updated	2026-02-07 15:00 GMT

Overview

A GitHub Actions workflow that runs daily end-to-end tests for lmdeploy on NVIDIA RTX 3090 GPUs, covering quantization, the tools test matrix, and the RESTful API.

Description

The daily_ete_test_3090.yml workflow is a hardware-specific variant of the daily end-to-end test suite targeting RTX 3090 GPUs. It runs on self-hosted runners labeled 3090-r1 and uses the openmmlab/lmdeploy:latest-cu12 Docker image (CUDA 12.4).

The workflow includes:

linux-build: Builds the lmdeploy wheel on ubuntu-latest with CUDA 12.4.
download_pkgs: Downloads and stages build artifacts on the 3090 runner.
test_quantization: Tests AWQ (W4A16) and W8A8 quantization with test_3090 pytest markers.
test_tools: Matrix job for backends x models x functions, using single-GPU tests only (gpu_num_1 marker with test_3090).
test_restful: Tests RESTful API with internlm3-8b-instruct and Qwen3-8B models at TP=1, including chat completions, completions, and generate endpoints with log probability support.
get_coverage_report: Combines coverage data from all test jobs.
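To make the "backends x models x functions" matrix in test_tools concrete, the following is a minimal sketch of how the default input strings would expand into job combinations. It assumes a plain cross product with no include/exclude rules, which the source does not confirm; the helper names parse_list and expand_matrix are hypothetical and not part of the workflow.

```python
import ast
from itertools import product

# Default input strings copied from the workflow_dispatch section of the workflow.
DEFAULT_BACKEND = "['turbomind', 'pytorch']"
DEFAULT_MODEL = "['llm','mllm']"
DEFAULT_FUNCTION = '["pipeline", "restful", "chat"]'


def parse_list(raw: str) -> list[str]:
    """Parse a list-valued input string (single- or double-quoted items)."""
    value = ast.literal_eval(raw)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError(f"expected a list of strings, got {raw!r}")
    return value


def expand_matrix(backend: str, model: str, function: str) -> list[dict[str, str]]:
    """Hypothetical helper: full cross product, ignoring any exclude rules."""
    return [
        {"backend": b, "model": m, "function": f}
        for b, m, f in product(
            parse_list(backend), parse_list(model), parse_list(function)
        )
    ]


combos = expand_matrix(DEFAULT_BACKEND, DEFAULT_MODEL, DEFAULT_FUNCTION)
print(len(combos))  # 2 backends * 2 model types * 3 functions = 12
```

Under this assumption the defaults yield 12 combinations; narrowing any one input (for example, a single backend) shrinks the matrix proportionally.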

Key differences from the A100 variant: uses CUDA 12.4, runs only single-GPU tests, has TEST_ENV: 3090, and tests a smaller set of models appropriate for 24GB VRAM.

Usage

Triggered automatically at 14:00 UTC, Sunday through Thursday (the cron day-of-week field is 0-4), or manually via GitHub Actions workflow_dispatch. The default regression_func value selects quant, tools, and restful.
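The schedule can be checked by decoding the day-of-week field of the cron expression, where 0 is Sunday. This is a small illustrative sketch, not part of the workflow; scheduled_days is a hypothetical helper and handles only single values, ranges, comma lists, and `*`.

```python
CRON = "00 14 * * 0-4"  # schedule expression from the workflow
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]


def scheduled_days(cron: str) -> list[str]:
    """Expand the fifth cron field (day of week, 0 = Sunday) into day names."""
    dow_field = cron.split()[4]
    days: set[int] = set()
    for part in dow_field.split(","):
        if part == "*":
            days.update(range(7))
        elif "-" in part:
            low, high = (int(x) for x in part.split("-"))
            days.update(range(low, high + 1))
        else:
            days.add(int(part))
    return [DAY_NAMES[d] for d in sorted(days)]


minute, hour = (int(x) for x in CRON.split()[:2])
print(f"{hour:02d}:{minute:02d} UTC on {', '.join(scheduled_days(CRON))}")
```

GitHub Actions evaluates schedule expressions in UTC, so the run starts at 14:00 UTC on each listed day.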

Code Reference

Source Location

Repository: InternLM_Lmdeploy
File: .github/workflows/daily_ete_test_3090.yml
Lines: 1-424

Signature

name: daily_ete_test_3090

on:
  workflow_dispatch:
    inputs:
      repo_org: { type: string, default: 'InternLM/lmdeploy' }
      repo_ref: { type: string, default: 'main' }
      backend: { type: string, default: "['turbomind', 'pytorch']" }
      model: { type: string, default: "['llm','mllm']" }
      function: { type: string, default: '["pipeline", "restful", "chat"]' }
      offline_mode: { type: boolean, default: false }
      regression_func: { type: string, default: "['quant', 'tools', 'restful']" }
  schedule:
    - cron: '00 14 * * 0-4'

jobs:
  linux-build: ...
  download_pkgs: ...
  test_quantization: ...
  test_tools: ...
  test_restful: ...
  get_coverage_report: ...

I/O Contract

Inputs

Name	Type	Required	Description
repo_org	string	No	Repository organization/name (default: InternLM/lmdeploy)
repo_ref	string	No	Branch, tag, or commit ID (default: main)
backend	string	Yes	JSON list of backends to test
model	string	Yes	JSON list of model types: llm, mllm
function	string	Yes	JSON list of test functions
offline_mode	boolean	Yes	Whether to use pre-prepared offline packages
regression_func	string	Yes	JSON list: quant, tools, restful

Outputs

Name	Type	Description
Build artifacts	wheel file	lmdeploy wheel package for CUDA 12.4
Test reports	Allure reports	Test results stored in REPORT_DIR
Coverage report	XML/text	Combined code coverage report

Usage Examples

# Manual trigger to test only turbomind backend on 3090:
# Go to Actions > daily_ete_test_3090 > Run workflow
# Set backend: "['turbomind']"
# Set regression_func: "['tools']"

Related Pages

Environment:InternLM_Lmdeploy_CUDA_GPU_Runtime
