Implementation:Sgl project Sglang PR Test Workflow

From Leeroopedia


Knowledge Sources
Domains: CI/CD, Testing
Last Updated: 2026-02-10 00:00 GMT

Overview

The main GitHub Actions CI workflow for testing pull requests, running scheduled nightly tests, and managing targeted stage reruns across multiple GPU architectures.

Description

pr-test.yml is a GitHub Actions workflow of more than 1,600 lines that orchestrates the entire test pipeline for the SGLang project. It implements a three-stage pipeline (A, B, C) that runs across diverse GPU hardware, including 5090, H100, H200, H20, B200, and GB200. The stages run sequentially for pull requests and in parallel for scheduled runs.

The workflow supports four trigger modes:

  • pull_request: Runs change-detected tests on PRs targeting main
  • schedule: Runs every 6 hours with full test parallelism (max_parallel=14)
  • workflow_dispatch: Supports targeted stage reruns via /rerun-stage and configurable FlashInfer versions
  • workflow_call: Allows other workflows to invoke it with custom refs
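As a minimal sketch of the workflow_call mode, a caller workflow in the same repository might look like the following. The caller's name, trigger, job id, and the ref value are hypothetical; only the ref and run_all_tests inputs come from the signature documented on this page, and passing secrets through is an assumption about what the called workflow needs.

```yaml
# Hypothetical caller workflow; names and values are illustrative only.
name: Example Caller

on:
  workflow_dispatch:

jobs:
  run-pr-test:
    # Reusable-workflow call; the relative path assumes the same repository.
    uses: ./.github/workflows/pr-test.yml
    with:
      ref: main            # assumed value for the `ref` input
      run_all_tests: true  # run all tests regardless of change detection
    secrets: inherit       # assumption: the called workflow needs repository secrets
```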

Key architectural features include:

  • Change detection via dorny/paths-filter (for PRs) and GitHub API comparison (for workflow_dispatch with target_stage)
  • sgl-kernel wheel builds on both x64 and ARM when kernel code changes are detected
  • Sequential stage execution for PRs using wait jobs that poll the GitHub API
  • Parallel execution for scheduled runs to enable easier individual stage retries
  • Concurrency groups that cancel previous runs on the same branch
  • Continue-on-error mode for scheduled/full test runs
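The concurrency and continue-on-error features listed above could be expressed roughly as follows. This is a sketch under stated assumptions, not an excerpt from pr-test.yml: the group expression, the job id, and the condition are illustrative, and only the force_continue_on_error input name is taken from this page.

```yaml
# Sketch only: the group expression and the job below are assumptions,
# not copied from pr-test.yml.
concurrency:
  # One group per workflow and branch; a newer run cancels the older one.
  group: ${{ github.workflow }}-${{ github.head_ref || github.ref }}
  cancel-in-progress: true

jobs:
  example-stage:  # hypothetical job id
    runs-on: ubuntu-latest
    # Tolerate failures on scheduled runs or when forced via the dispatch input.
    continue-on-error: ${{ github.event_name == 'schedule' || inputs.force_continue_on_error == true }}
    steps:
      - run: echo "tests would run here"
```

Comparing the input against true keeps the expression boolean even when the input is absent, as it is on schedule events.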

Usage

This workflow is automatically triggered on pull requests to the main branch and via a cron schedule every 6 hours. It can also be manually dispatched to target specific stages or test specific FlashInfer versions. Contributors with CI permissions can use /rerun-stage and /tag-and-rerun-ci commands.

Code Reference

Source Location

Signature

name: PR Test

on:
  schedule:
    - cron: '0 */6 * * *'
  pull_request:
    branches: [main]
  workflow_dispatch:
    inputs:
      version:
        description: "FlashInfer version"
        type: choice
        options: [release, nightly]  # values as listed in the Inputs table
        default: "release"
      target_stage:
        description: "Specific stage to run"
        type: string
      force_continue_on_error:
        type: boolean
      pr_head_sha:
        description: "PR head SHA for /rerun-stage on fork PRs"
        type: string
      test_parallel_dispatch:
        type: boolean
  workflow_call:
    inputs:
      ref:
        type: string
      run_all_tests:
        type: boolean

Import

N/A -- This is a GitHub Actions YAML workflow definition.

I/O Contract

Inputs

Name | Type | Required | Description
version | choice (release/nightly) | No | FlashInfer version to test against
target_stage | string | No | Specific stage name to run (e.g., stage-b-test-large-1-gpu)
force_continue_on_error | boolean | No | Force continue-on-error behavior for all stages
pr_head_sha | string | No | PR head SHA for /rerun-stage on fork PRs
test_parallel_dispatch | boolean | No | Simulate scheduled parallel dispatch behavior
ref | string | No | Git ref for workflow_call invocations
run_all_tests | boolean | No | Run all tests regardless of change detection

Outputs

Name | Type | Description
main_package | boolean | Whether main package changes were detected
sgl_kernel | boolean | Whether sgl-kernel changes were detected
jit_kernel | boolean | Whether JIT kernel changes were detected
multimodal_gen | boolean | Whether multimodal gen changes were detected
max_parallel | integer | Maximum parallel job count (3 for PRs, 14 for scheduled)
b200_runner | string | B200 runner tag based on kernel changes
pr-test-finish result | success/failure | Overall CI pass/fail status
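A downstream job might consume these outputs along the following lines. The job id and step are hypothetical, and the sketch assumes that all of the outputs are exposed by the check-changes job (this page confirms that only for max_parallel). GitHub Actions passes job outputs as strings, so the boolean flags are compared against 'true'.

```yaml
jobs:
  example-kernel-test:  # hypothetical job id
    needs: [check-changes]
    # Job outputs are strings, so compare against 'true'.
    if: needs.check-changes.outputs.sgl_kernel == 'true'
    # Assumed usage: pick the runner tag computed during change detection.
    runs-on: ${{ needs.check-changes.outputs.b200_runner }}
    steps:
      - run: echo "sgl-kernel changed; kernel tests would run here"
```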

Usage Examples

Change Detection Filter Paths

filters: |
  main_package:
    - "python/sglang/!(multimodal_gen)/**"
    - "python/pyproject.toml"
    - "scripts/ci/cuda/*"
    - "test/**"
    - ".github/workflows/pr-test.yml"
  sgl_kernel:
    - "sgl-kernel/**"
  jit_kernel:
    - "python/sglang/jit_kernel/**"
  multimodal_gen:
    - "python/sglang/multimodal_gen/**"
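For context, a filter block like this is typically passed to a dorny/paths-filter step whose per-filter results are then re-exported as job outputs. The sketch below shows that wiring for a single filter; the action version tags, the step id, and the runner are assumptions rather than details taken from pr-test.yml.

```yaml
jobs:
  check-changes:
    runs-on: ubuntu-latest  # assumed runner
    outputs:
      # Re-export the step output so later jobs can read it via `needs`.
      sgl_kernel: ${{ steps.filter.outputs.sgl_kernel }}
    steps:
      - uses: actions/checkout@v4      # version tag is an assumption
      - uses: dorny/paths-filter@v3    # version tag is an assumption
        id: filter
        with:
          filters: |
            sgl_kernel:
              - "sgl-kernel/**"
```

Each filter name becomes a step output set to 'true' or 'false', which is why the names in the filter block match the names in the Outputs table.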

Stage B Large 1-GPU Test Job

stage-b-test-large-1-gpu:
  needs: [check-changes, call-gate, wait-for-stage-a, sgl-kernel-build-wheels]
  runs-on: 1-gpu-runner
  timeout-minutes: 240
  strategy:
    fail-fast: false
    max-parallel: ${{ fromJson(needs.check-changes.outputs.max_parallel) }}
    matrix:
      partition: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
