# Environment: Ray Docker GPU Environment (Ray project)
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Deep_Learning, GPU |
| Last Updated | 2026-02-13 16:35 GMT |
## Overview
Docker-based GPU environment using NVIDIA CUDA 12.8.1 with cuDNN on Ubuntu 22.04, Python 3.10, and Miniforge for conda package management.
## Description
This environment defines the Docker container setup for GPU-accelerated Ray workloads. The base CPU image uses Ubuntu 22.04, while the GPU variant builds on `nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04`. It includes Miniforge 24.11.3-0 for conda-based Python management, HAProxy 2.8.12 for HTTP proxy support, and system dependencies including jemalloc, CMake, and various networking tools. The container supports both x86_64 and aarch64 architectures.
## Usage
Use this environment when running Ray workloads in Docker containers, especially workloads that require GPU acceleration. It is the prerequisite for Docker-based CI/CD pipelines, containerized Serve deployments, and GPU-accelerated training or inference.
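The CPU/GPU split can be captured in a small launcher helper. A minimal sketch — the helper name `ray_docker_args` and the exact image tags are assumptions for illustration, not part of the build scripts:

```shell
# Hypothetical helper: compose `docker run` arguments for each image variant.
# Only the GPU variant needs `--gpus all` (NVIDIA Container Toolkit on host).
ray_docker_args() {
  case "$1" in
    gpu) echo "--gpus all rayproject/ray:latest-gpu" ;;
    cpu) echo "rayproject/ray:latest-cpu" ;;
    *)   echo "unknown variant: $1" >&2; return 1 ;;
  esac
}

# Usage: docker run --rm $(ray_docker_args gpu) ray --version
ray_docker_args gpu
```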
## System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Ubuntu 22.04 LTS (inside container) | Base image for both CPU and GPU variants |
| Hardware | NVIDIA GPU (optional) | Required only for GPU variant |
| CUDA | 12.8.1 with cuDNN | GPU variant base image |
| Architecture | x86_64 or aarch64 | Both supported via Miniforge |
| Docker | Docker Engine 19.03+ | For GPU: NVIDIA Container Toolkit required |
## Dependencies
### Base Image
- CPU: `ubuntu:22.04`
- GPU: `nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04`
### System Packages
- `sudo`
- `tzdata`
- `git`
- `libjemalloc-dev` (jemalloc memory allocator)
- `wget`
- `cmake`
- `g++`
- `zlib1g-dev`
- `tmux`, `screen`, `rsync` (used by the Ray autoscaler)
- `netbase`, `openssh-client`, `gnupg`
- `socat`, `liblua5.3-0` (HAProxy dependencies)
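A minimal sketch of installing these packages in a single Dockerfile layer; the apt flags and cleanup step are assumptions, since the actual `docker/base-deps/Dockerfile` layer is not reproduced here:

```shell
# Assumed single apt layer (sketch, not the actual Dockerfile contents)
apt-get update \
  && apt-get install -y --no-install-recommends \
       sudo tzdata git libjemalloc-dev wget cmake g++ zlib1g-dev \
       tmux screen rsync netbase openssh-client gnupg socat liblua5.3-0 \
  && rm -rf /var/lib/apt/lists/*
```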
### Python Environment
- Miniforge 24.11.3-0
- Python 3.10 (default)
- HAProxy 2.8.12
## Credentials
No credentials are embedded in the Docker image. Runtime credentials are injected via environment variables:
- `NVIDIA_VISIBLE_DEVICES`: GPU visibility control (set by NVIDIA Container Toolkit)
## Quick Install
```shell
# Build the CPU Docker image
bash build-docker.sh --base-image ubuntu:22.04

# Build the GPU Docker image
bash build-docker.sh --gpu --base-image nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04

# Run with GPU support
docker run --gpus all rayproject/ray:latest-gpu
```
## Code Evidence
GPU base image selection from `build-docker.sh:7-18`:

```shell
# Default base image (CPU)
BASE_IMAGE="ubuntu:22.04"
# GPU variant
GPU_BASE_IMAGE="nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04"
```
Architecture detection from `docker/base-deps/Dockerfile:122-130`:

```shell
# Dynamic architecture detection via $HOSTTYPE
# Possible values: x86_64, aarch64
```
Miniforge version pinning from `docker/base-deps/Dockerfile:133-143`:

```shell
# Miniforge version: 24.11.3-0
# Supported architectures: x86_64, aarch64
```
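The pinned version and architecture pair translate into a download URL. A sketch assuming the standard Miniforge release URL pattern; the helper name `miniforge_url` is hypothetical and the real Dockerfile logic may differ in detail:

```shell
MINIFORGE_VERSION="24.11.3-0"

# Hypothetical helper: build an architecture-aware Miniforge installer URL.
miniforge_url() {
  # $1: machine type as reported by `uname -m` (x86_64 or aarch64)
  case "$1" in
    x86_64|aarch64) ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
  echo "https://github.com/conda-forge/miniforge/releases/download/${MINIFORGE_VERSION}/Miniforge3-${MINIFORGE_VERSION}-Linux-$1.sh"
}

miniforge_url "$(uname -m)"
```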
HAProxy version from `docker/base-deps/Dockerfile:27`:

```shell
HAPROXY_VERSION="2.8.12"
```
## Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `nvidia-smi: command not found` | NVIDIA Container Toolkit not installed | Install the NVIDIA Container Toolkit (formerly nvidia-docker2) on the host |
| `docker: Error response from daemon: could not select device driver` | Missing GPU runtime | Run `docker run --gpus all` with NVIDIA runtime configured |
| Architecture mismatch in Miniforge download | Wrong HOSTTYPE detected | Verify `uname -m` returns x86_64 or aarch64 |
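For the architecture-mismatch error above, a quick diagnostic sketch (the `supported_arch` helper is hypothetical) that checks `uname -m` before building:

```shell
# Hypothetical check: only x86_64 and aarch64 have Miniforge downloads here.
supported_arch() {
  case "$1" in
    x86_64|aarch64) echo "supported: $1" ;;
    *) echo "unsupported: $1" >&2; return 1 ;;
  esac
}

supported_arch "$(uname -m)"
```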
## Compatibility Notes
- GPU variant: requires the NVIDIA Container Toolkit (formerly nvidia-docker2) on the host.
- aarch64: both CPU and GPU images support ARM64 via architecture-aware Miniforge downloads.
- HAProxy: version 2.8.12 is built from source to support the Ray Serve HTTP proxy.
- jemalloc: included as the default memory allocator for improved performance on Linux.
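To opt a process into jemalloc explicitly, `LD_PRELOAD` can point at the shared object. A sketch in which the helper name `jemalloc_path` and the Debian multiarch library paths are assumptions about the image layout:

```shell
# Hypothetical helper: map machine type to the assumed jemalloc .so path
# (Debian/Ubuntu multiarch layout).
jemalloc_path() {
  case "$1" in
    x86_64)  echo "/usr/lib/x86_64-linux-gnu/libjemalloc.so" ;;
    aarch64) echo "/usr/lib/aarch64-linux-gnu/libjemalloc.so" ;;
    *) echo "no known jemalloc path for: $1" >&2; return 1 ;;
  esac
}

# Usage: LD_PRELOAD="$(jemalloc_path "$(uname -m)")" ray start --head
jemalloc_path x86_64
```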