Workflow:Ray project Ray Build and Release Pipeline

From Leeroopedia
Knowledge Sources
Domains Build_Systems, CI_CD, Release_Engineering
Last Updated 2026-02-13 16:00 GMT

Overview

End-to-end process for building Ray from source, producing distributable artifacts (Python wheels, Java JARs, Docker images), running release validation tests, and publishing releases.

Description

This workflow covers the complete build and release pipeline for the Ray framework. It encompasses compiling dependencies, building Python wheels for multiple platforms (Linux x86_64, Linux aarch64, macOS), building Java JARs with multiplatform native binary packaging, constructing Docker images from built wheels, running comprehensive release validation tests with retry logic, and deploying artifacts to package repositories. The pipeline is orchestrated through Buildkite CI with conditional test execution based on changed file paths.

Usage

Execute this workflow when preparing a Ray release, validating a release candidate, or contributing changes that require full build verification. It also applies when building custom Ray distributions from source for specific platform or dependency requirements.

Execution Steps

Step 1: Compile Dependencies

Resolve and pin all Python dependencies using pip-tools. The CI orchestration script processes multiple requirement files (test, cloud, ML, Docker, tune) and generates a fully pinned lockfile for each supported Python version. This ensures reproducible builds across all environments and platforms.

Key considerations:

  • Dependency compilation uses pip-tools v7.4.1 for determinism
  • Separate lockfiles are maintained per Python version (3.10, 3.11, 3.12, 3.13)
  • Compilation is skipped on aarch64 where pip-tools is unavailable
  • Constraint files prevent version conflicts during installation
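
The compilation step can be pictured as building one pip-compile invocation per Python version and skipping the step on aarch64. The following is a minimal sketch under stated assumptions, not the actual CI script: the lockfile naming scheme and the function names are hypothetical, and only the well-known `pip-compile` options `--quiet` and `--output-file` are used.

```python
from typing import List, Optional

# Python versions listed in the considerations above.
PYTHON_VERSIONS = ["3.10", "3.11", "3.12", "3.13"]


def lockfile_name(python_version: str) -> str:
    # Hypothetical naming scheme: one lockfile per Python version.
    return f"requirements_compiled_py{python_version}.txt"


def compile_command(
    python_version: str, machine: str, requirement_files: List[str]
) -> Optional[List[str]]:
    """Return the pip-compile command to run, or None when the step is skipped."""
    if machine == "aarch64":
        # The text states that compilation is skipped on aarch64.
        return None
    return [
        "pip-compile",
        "--quiet",
        "--output-file",
        lockfile_name(python_version),
        *requirement_files,
    ]


if __name__ == "__main__":
    # Hypothetical requirement file names, for illustration only.
    files = ["requirements_test.txt", "requirements_ml.txt"]
    for version in PYTHON_VERSIONS:
        print(compile_command(version, "x86_64", files))
```

Each command would have to run under the matching Python interpreter for the resolved pins to reflect that version.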

Step 2: Build Python Wheels

Produce platform-specific Python wheel packages using the build-wheel pipeline. The build script detects the current platform, downloads the appropriate version of the raymake build tool, and delegates to the Python build script. Wheels are produced for manylinux2014 (x86_64, aarch64) and macOS (arm64) targets.

Key considerations:

  • The raymake tool version is pinned via the .rayciversion file
  • Platform detection maps uname output to wheel platform tags
  • Cython version is pinned to 3.0.12 to prevent ABI breakage
  • Wheel commit strings are validated against the expected Git commit
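
Platform detection and commit validation can be sketched as a lookup table plus a string comparison. This is an illustrative sketch only: the function names are hypothetical, and the minimum macOS version embedded in the macOS tag is an assumption, since the text names only the architecture.

```python
import platform
from typing import Optional

# Maps (uname system, uname machine) to a wheel platform tag.
WHEEL_PLATFORM_TAGS = {
    ("Linux", "x86_64"): "manylinux2014_x86_64",
    ("Linux", "aarch64"): "manylinux2014_aarch64",
    # Assumption: the macOS deployment target (12.0) is illustrative.
    ("Darwin", "arm64"): "macosx_12_0_arm64",
}


def wheel_platform_tag(
    system: Optional[str] = None, machine: Optional[str] = None
) -> str:
    """Return the wheel platform tag for the given or current platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        return WHEEL_PLATFORM_TAGS[(system, machine)]
    except KeyError:
        raise ValueError(f"unsupported platform: {system} {machine}") from None


def commit_matches(wheel_commit: str, expected_commit: str) -> bool:
    """Compare the commit recorded in a wheel with the expected Git commit."""
    return wheel_commit.strip().lower() == expected_commit.strip().lower()
```

Raising on an unknown platform, rather than falling back to a default tag, keeps a mislabelled wheel from being produced silently.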

Step 3: Build Java JARs

Compile Java SDK artifacts (api, runtime, serve modules) using Maven with Bazel-generated dependencies. For each module, the build generates POM files, proto files, and Maven dependency lists via Bazel, then runs Maven package and install. Multiplatform JARs are assembled by downloading platform-specific pre-built JARs from S3, extracting native binaries, and repackaging into a single cross-platform artifact.

Key considerations:

  • Requires Java 1.8 for compilation
  • Maven builds use 16 parallel threads for performance
  • Multiplatform assembly downloads Linux and macOS JARs from S3 with retry logic
  • Native binaries are extracted from platform JARs into the native_dependencies directory
  • Deployment to Maven Central requires OSSRH credentials and occurs only on master branch
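
Because a JAR is a ZIP archive, the native-binary extraction described above can be sketched with the standard `zipfile` module. This is a hedged illustration, not the project's build script: the `native/` entry prefix and the function name are assumptions about the JAR layout.

```python
import zipfile
from pathlib import Path
from typing import List, Union


def extract_native_binaries(
    jar_path: Union[str, Path],
    dest_dir: Union[str, Path],
    prefix: str = "native/",  # Assumption: where native binaries live in the JAR.
) -> List[Path]:
    """Extract entries under `prefix` from a platform JAR into `dest_dir`."""
    dest = Path(dest_dir)
    extracted: List[Path] = []
    with zipfile.ZipFile(jar_path) as jar:
        for info in jar.infolist():
            # Skip directory entries and anything outside the native prefix.
            if info.is_dir() or not info.filename.startswith(prefix):
                continue
            jar.extract(info, dest)
            extracted.append(dest / info.filename)
    return extracted
```

Running this once per downloaded platform JAR against the same `native_dependencies` directory would collect all binaries in one place before repackaging.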

Step 4: Build Docker Images

Construct Docker images in two layers. First, the base-deps image installs all Python dependencies from the compiled lockfiles onto the base OS image (Ubuntu 22.04 for CPU, CUDA 12.8.1 for GPU). Second, the ray image installs the pre-built wheel onto the base-deps image. BuildKit is used for efficient, parallel layer construction.

Key considerations:

  • GPU images use nvidia/cuda base with cuDNN development libraries
  • Wheels are downloaded from S3 rather than built in-container
  • Both ray and ray-cpp wheels are installed
  • Images are tagged as rayproject/base-deps:dev and rayproject/ray:dev
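
The two-layer build can be sketched as two `docker build` commands, where the second image uses the first as its base. This is a minimal sketch under assumptions: the Dockerfile paths and build-argument names (`BASE_IMAGE`, `WHEEL_PATH`) are hypothetical, while the image tags come from the text above.

```python
from typing import Dict, List, Optional


def docker_build_command(
    tag: str,
    dockerfile: str,
    context: str,
    build_args: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Assemble a `docker build` command line as an argument list."""
    cmd = ["docker", "build", "--tag", tag, "--file", dockerfile]
    for key, value in sorted((build_args or {}).items()):
        cmd += ["--build-arg", f"{key}={value}"]
    cmd.append(context)
    return cmd


def two_layer_build(base_image: str, wheel_path: str) -> List[List[str]]:
    """Return the base-deps build followed by the ray build."""
    return [
        docker_build_command(
            "rayproject/base-deps:dev",
            "docker/base-deps/Dockerfile",  # Hypothetical path.
            ".",
            {"BASE_IMAGE": base_image},  # Hypothetical build argument.
        ),
        docker_build_command(
            "rayproject/ray:dev",
            "docker/ray/Dockerfile",  # Hypothetical path.
            ".",
            {
                "BASE_IMAGE": "rayproject/base-deps:dev",
                "WHEEL_PATH": wheel_path,  # Hypothetical build argument.
            },
        ),
    ]
```

To run the commands with BuildKit on older Docker versions, the environment variable `DOCKER_BUILDKIT=1` can be set when invoking them, for example through `subprocess.run(cmd, env=...)`.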

Step 5: Run Release Validation Tests

Execute the comprehensive release test suite against the built artifacts. The release test runner installs the wheel, runs the test script, and retries failed runs after a random backoff of 30-90 minutes between attempts. Exit codes are classified into categories (success, runtime error, infrastructure error, timeout, command error) for diagnostic purposes.

Key considerations:

  • Tests are defined in release_tests.yaml (4877 lines of test definitions)
  • Commit verification ensures test code matches the wheel being tested
  • Maximum retry count is configurable (default: 1 retry)
  • Signal handling propagates SIGTERM/SIGINT to child processes
  • Artifacts are collected to a designated results directory
  • Buildkite retry code (79) enables UI-triggered retries
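
The retry behavior can be sketched as a loop that reruns the test and sleeps for a random 30-90 minute interval between attempts. This is an illustrative sketch, not the actual runner: it assumes that any nonzero exit code triggers a retry, and it does not model the exit-code categories because their numeric values are not given here. The function and parameter names are hypothetical.

```python
import random
import time
from typing import Callable

BUILDKITE_RETRY_CODE = 79  # From the text: enables UI-triggered retries.
MIN_BACKOFF_SECONDS = 30 * 60
MAX_BACKOFF_SECONDS = 90 * 60


def run_with_retries(
    run_once: Callable[[], int],
    max_retries: int = 1,
    sleep: Callable[[float], None] = time.sleep,
    pick_delay: Callable[[float, float], float] = random.uniform,
) -> int:
    """Run `run_once` until it exits 0 or the retry budget is exhausted."""
    retries = 0
    while True:
        exit_code = run_once()
        # Assumption: every nonzero exit code is treated as retryable.
        if exit_code == 0 or retries >= max_retries:
            return exit_code
        retries += 1
        sleep(pick_delay(MIN_BACKOFF_SECONDS, MAX_BACKOFF_SECONDS))
```

Passing `sleep` and `pick_delay` as parameters lets the loop be exercised in tests without waiting for a real backoff.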

Step 6: Publish Release Artifacts

Deploy validated artifacts to their respective package repositories. Python wheels are uploaded to PyPI, Java JARs are deployed to Maven Central via the OSSRH repository, and Docker images are pushed to Docker Hub. This step executes only from the master branch after all validation passes.

Key considerations:

  • Maven deployment uses release profile with optional GPG signing
  • Publishing gates on master branch and non-PR context
  • Docker images require separate tagging for versioned releases
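
The publishing gate can be expressed as a small predicate over the CI environment. The sketch below assumes the gate reads the standard Buildkite variables `BUILDKITE_BRANCH` and `BUILDKITE_PULL_REQUEST` (the latter is `"false"` for non-PR builds); whether the pipeline's scripts check exactly these variables is an assumption, and the function name is hypothetical.

```python
import os
from typing import Mapping


def should_publish(env: Mapping[str, str]) -> bool:
    """Return True only for non-PR builds on the master branch."""
    on_master = env.get("BUILDKITE_BRANCH") == "master"
    # Buildkite sets this to the PR number, or "false" when there is no PR.
    not_pull_request = env.get("BUILDKITE_PULL_REQUEST", "false") == "false"
    return on_master and not_pull_request


if __name__ == "__main__":
    print(should_publish(os.environ))
```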

Execution Diagram

GitHub URL

Workflow Repository