Implementation:Ray project Ray Buildkite ML Pipeline

Knowledge Sources	Ray
Domains	CI, Testing, ML, Train, Tune
Last Updated	2026-02-13 16:00 GMT

Overview

This file defines the Buildkite "ml tests" pipeline group that builds and tests Ray's ML libraries including Ray Train (v1 and v2), Ray Tune, Ray Air, and GPU-accelerated training integrations.

Description

The .buildkite/ml.rayci.yml pipeline first builds four sets of Docker images: a minimal ML build (Python 3.10), ML CPU builds and ML GPU builds (Python 3.10 and 3.12), and a PyTorch Lightning 2.0 GPU build. It then runs test suites for Ray Train v1 (CPU and GPU), Ray Train v2 (CPU and GPU), Ray Tune, Ray Air, Train+Tune and RLlib+Tune combinations, release tests, minimal installation tests, doc tests, GPU Lightning 2.0 tests, and authentication tests (W&B, Comet). Flaky tests are rerun in separate steps for the CPU and GPU configurations. The group as a whole depends on forge, ray-core-build, and ray-dashboard-build.

Usage

Developers modify this file when adding new ML library test suites, updating GPU test configurations, changing the Train v1/v2 test split, adding Python versions to the build matrices, or adjusting authentication test credentials. It also changes when a new ML framework integration, such as a new PyTorch Lightning version, needs CI coverage.
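As a rough sketch of the first case, a new test step could follow the same shape as the existing steps shown under Usage Examples. The label and the tag `my_integration` are hypothetical placeholders, not entries from the actual file; the command flags are only those already used by existing steps.

```yaml
# Hypothetical step: run only tests tagged `my_integration` (placeholder tag)
# under python/ray/train, using the ML CPU image for Python 3.10.
- label: ":train: ml: my integration tests"   # hypothetical label
  tags: train
  instance_type: large
  commands:
    - bazel run //ci/ray_ci:test_in_docker -- //python/ray/train/... ml
      --parallelism-per-worker 3
      --python-version 3.10 --build-name mlbuild-py3.10
      --only-tags my_integration
```

The value passed to --build-name has to match an image produced by one of the build steps in this group, so a new suite that needs a different image would also need a corresponding build step.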

Code Reference

Source Location

Repository: Ray
File: .buildkite/ml.rayci.yml
Lines: 1-345

Signature

group: ml tests
depends_on:
  - forge
  - ray-core-build
  - ray-dashboard-build
steps:
  # builds
  - name: minbuild-ml
    label: "wanda: minbuild-ml-py{{matrix}}"
    wanda: ci/docker/min.build.wanda.yaml
    depends_on: oss-ci-base_test-multipy
    matrix:
      - "3.10"
    env:
      PYTHON_VERSION: "{{matrix}}"
      EXTRA_DEPENDENCY: ml
    tags: cibase

  - name: mlbuild-multipy
    label: "wanda: mlbuild-py{{matrix}}"
    wanda: ci/docker/ml.build.wanda.yaml
    depends_on: oss-ci-base_ml-multipy

Import

This is a configuration file rather than an importable module. RayCI loads it as a pipeline group within the Buildkite CI system, and the group depends on the forge, ray-core-build, and ray-dashboard-build steps.

I/O Contract

Inputs

Name	Type	Required	Description
`forge`	dependency	yes	The forge build environment providing Bazel and tooling
`ray-core-build`	dependency	yes	Core Ray binary build artifacts
`ray-dashboard-build`	dependency	yes	Dashboard build artifacts
`oss-ci-base_ml-multipy`	dependency	yes	ML base CI Docker images
`oss-ci-base_gpu-multipy`	dependency	yes	GPU-enabled base CI Docker images
`oss-ci-base_test-multipy`	dependency	yes	Test base images for minimal builds
`oss-ci-base_build-multipy`	dependency	conditional	Build base images for soft import tests
W&B / Comet API keys	env vars	conditional	Required for authentication tests (WANDB_API_KEY, COMET_API_KEY)

Outputs

Name	Type	Description
`minbuild-ml`	Docker image	Minimal ML build image (Python 3.10)
`mlbuild-multipy`	Docker image	ML CPU build images (Python 3.10, 3.12)
`mlgpubuild-multipy`	Docker image	ML GPU build images (Python 3.10, 3.12)
`mllightning2gpubuild`	Docker image	PyTorch Lightning 2.0 GPU build image
Test results	CI artifacts	Test results from all ML test suites
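
The multi-Python image outputs can be pictured as a single matrix step that Buildkite expands once per Python version. The sketch below mirrors the minbuild-ml step from the Signature section. The matrix values come from the table above, while the env entry is an assumption carried over from minbuild-ml and may not match the actual definition.

```yaml
# Sketch only: the matrix and env entries are assumed, not copied from the file.
- name: mlbuild-multipy
  label: "wanda: mlbuild-py{{matrix}}"
  wanda: ci/docker/ml.build.wanda.yaml
  depends_on: oss-ci-base_ml-multipy
  matrix:
    - "3.10"
    - "3.12"
  env:
    PYTHON_VERSION: "{{matrix}}"   # assumed, by analogy with minbuild-ml
```

Under this shape, each matrix value yields one build job, and test steps would select the resulting image with a flag such as --build-name mlbuild-py3.10.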

Usage Examples

The pipeline covers both CPU and GPU ML test suites:

# Train v2 tests with parallelism
- label: ":bullettrain_front: ml: train v2 tests"
  tags: train
  instance_type: large
  parallelism: 2
  commands:
    - bazel run //ci/ray_ci:test_in_docker -- //python/ray/train/... ml
      --workers "$${BUILDKITE_PARALLEL_JOB_COUNT}"
      --worker-id "$${BUILDKITE_PARALLEL_JOB}"
      --parallelism-per-worker 3
      --python-version 3.10 --build-name mlbuild-py3.10
      --only-tags train_v2

# GPU training tests on gpu-large instances
- label: ":train: ml: train v1 gpu tests"
  tags: [train_gpu, gpu]
  instance_type: gpu-large
  parallelism: 2
  commands:
    - bazel run //ci/ray_ci:test_in_docker -- //python/ray/train/... ml
      --build-name mlgpubuild-py3.10 --python-version 3.10
      --only-tags gpu

# Tune tests
- label: ":train: ml: tune tests"
  tags: tune
  instance_type: large
  commands:
    - bazel run //ci/ray_ci:test_in_docker -- //python/ray/tune/... ml
      --parallelism-per-worker 3
      --python-version 3.10 --build-name mlbuild-py3.10
      --except-tags doctest,soft_imports,rllib
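
To illustrate how sharding fits together, the sketch below applies the Train v2 pattern to the Tune step. The `parallelism` field asks Buildkite for that many parallel jobs, and the --workers and --worker-id flags pass each job's count and index to the test runner. The doubled `$$` keeps Buildkite from interpolating the variables when the pipeline is uploaded, so they are resolved on the agent at run time. Whether the actual Tune step is sharded this way is not stated here, so the parallelism value is an assumption.

```yaml
# Sketch: Tune tests split across two parallel jobs (parallelism value assumed).
- label: ":train: ml: tune tests"
  tags: tune
  instance_type: large
  parallelism: 2
  commands:
    - bazel run //ci/ray_ci:test_in_docker -- //python/ray/tune/... ml
      --workers "$${BUILDKITE_PARALLEL_JOB_COUNT}"
      --worker-id "$${BUILDKITE_PARALLEL_JOB}"
      --parallelism-per-worker 3
      --python-version 3.10 --build-name mlbuild-py3.10
      --except-tags doctest,soft_imports,rllib
```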
