Environment: Pyro PPL CUDA GPU Acceleration
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, GPU_Computing |
| Last Updated | 2026-02-09 09:00 GMT |
Overview
Optional CUDA GPU environment for accelerating Pyro inference, particularly MCMC sampling and large-scale SVI with neural network guides.
Description
This environment extends the core Python/PyTorch environment with NVIDIA CUDA support. While Pyro runs on CPU by default, GPU acceleration significantly improves performance for MCMC (HMC/NUTS) inference and neural network-based guides (VAEs, amortized inference). The Docker configuration supports both CPU and CUDA builds via configurable arguments.
Usage
Use this environment when running MCMC inference on large models, VAE training with neural network encoders/decoders, or any workflow where PyTorch GPU acceleration provides a speedup. Tests can be directed to run on GPU via the `PYRO_DEVICE` environment variable.
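A user script can mirror the same device-selection convention. A minimal sketch, assuming PyTorch >= 2.0 (the CPU fallback here is our addition, not Pyro behavior):

```python
import os
import torch

# Select the device the same way Pyro's test suite does: read PYRO_DEVICE,
# defaulting to CPU. Fall back to CPU if CUDA is requested but unavailable.
device = os.environ.get("PYRO_DEVICE", "cpu")
if device.startswith("cuda") and not torch.cuda.is_available():
    device = "cpu"  # graceful fallback on CUDA-less machines

torch.set_default_device(device)
x = torch.randn(3)  # allocated on the selected device
print(x.device.type)
```

Running with `PYRO_DEVICE=cuda` on a GPU machine places all subsequently created tensors on the GPU without touching model code.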
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux (Ubuntu 24.04 in Docker) | Docker image uses `ubuntu:24.04` as base |
| Hardware | NVIDIA GPU | Any CUDA-capable GPU |
| Driver | NVIDIA Driver | Compatible with installed CUDA toolkit |
| Software | CUDA Toolkit | Via PyTorch CUDA wheel (e.g., cu118, cu121) |
Dependencies
System Packages
- NVIDIA GPU driver (host system)
- CUDA toolkit (bundled with PyTorch CUDA wheels)
- `magma-cuda` (optional, for Docker source builds)
Python Packages
- `torch` >= 2.0 (CUDA variant)
- `torchvision` >= 0.15.0 (CUDA variant, optional)
- `torchaudio` (CUDA variant, optional)
Credentials
No credentials required for GPU usage.
Quick Install
```shell
# Install PyTorch with CUDA 11.8 support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Install Pyro
pip install pyro-ppl

# Verify GPU is available
python -c "import torch; print(torch.cuda.is_available())"
```
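When `is_available()` returns `False`, or kernels fail to launch, it helps to check which wheel and GPU are actually in play. A small diagnostic sketch:

```python
import torch

# Report the installed wheel variant and GPU details; a mismatch between
# the wheel's CUDA version and the GPU architecture explains the
# "no kernel image is available" error listed under Common Errors.
print("torch:", torch.__version__)  # CUDA wheels carry a suffix like +cu118
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    print("compute capability:", torch.cuda.get_device_capability(0))
```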
Code Evidence
Test configuration environment variable for device selection from `tests/conftest.py:12-14`:
```python
DTYPE = getattr(torch, os.environ.get("PYRO_DTYPE", "float64"))
torch.set_default_dtype(DTYPE)
torch.set_default_device(os.environ.get("PYRO_DEVICE", "cpu"))
```
Docker CUDA build support from `docker/Dockerfile:5-12`:
```dockerfile
ARG base_img=ubuntu:24.04
FROM ${base_img}

# Optional args
ARG python_version=3
ARG pyro_branch=release
ARG pytorch_whl=cpu
ARG pytorch_branch=release
```
Docker install script PyTorch CUDA installation from `docker/install.sh:16`:
```shell
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/${pytorch_whl}
```
CUDA tensor cloning for multiprocessing from `pyro/infer/mcmc/api.py:560`:
```python
# XXX we clone CUDA tensor args to resolve the issue "Invalid device pointer"
args = [arg.detach() if torch.is_tensor(arg) else arg for arg in args]
```
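The same detach pattern can be applied to arbitrary argument lists before handing them to worker processes. A standalone sketch (the function name is ours, not Pyro's):

```python
import torch

def sanitize_args(args):
    # Detach tensor arguments so the parent process's autograd graph (and,
    # on GPU, its CUDA device pointers) is not shared with worker processes.
    # Non-tensor arguments pass through unchanged.
    return [a.detach() if torch.is_tensor(a) else a for a in args]

args = [torch.randn(2, requires_grad=True), "label", 3]
clean = sanitize_args(args)
print(clean[0].requires_grad)  # False: detached from the autograd graph
```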
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `RuntimeError: CUDA error: no kernel image is available` | GPU compute capability mismatch | Install PyTorch built for your GPU architecture |
| `RuntimeError: CUDA out of memory` | Insufficient GPU VRAM | Reduce model size, batch size, or use CPU for that step |
| `Invalid device pointer` in MCMC multiprocessing | CUDA tensor pointer invalidation across processes | Pyro handles this internally by detaching tensors; upgrade to a recent Pyro release |
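For the out-of-memory case, one common mitigation is retrying the failing step on CPU. An illustrative sketch (the helper name and fallback policy are our own, not Pyro API):

```python
import torch

def run_with_cpu_fallback(step, data):
    # Try the GPU first; if allocation fails with CUDA OOM, release cached
    # memory and rerun the step on CPU instead of crashing the whole run.
    if torch.cuda.is_available():
        try:
            return step(data.to("cuda"))
        except torch.cuda.OutOfMemoryError:
            torch.cuda.empty_cache()
    return step(data.to("cpu"))

# Usage with a trivial "step" standing in for a training/inference call:
result = run_with_cpu_fallback(lambda x: x.sum(), torch.ones(4))
print(result.item())  # 4.0
```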
Compatibility Notes
- CPU default: Pyro defaults to CPU (`PYRO_DEVICE=cpu`); set `PYRO_DEVICE=cuda` for GPU testing
- MCMC multiprocessing: Multi-chain MCMC on GPU requires tensor detaching/cloning to avoid pointer invalidation across processes
- Docker: The official Docker image supports both CPU and CUDA via the `pytorch_whl` build argument
- PYRO_DTYPE: Default test dtype is `float64`; set `PYRO_DTYPE=float32` for faster GPU computation at reduced precision
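The `PYRO_DTYPE` convention from `tests/conftest.py` can be reproduced in user code the same way. A minimal sketch:

```python
import os
import torch

# PYRO_DTYPE names a torch dtype attribute ("float32", "float64", ...);
# the test suite defaults to float64 for numerical stability, while
# float32 is typically faster on GPU.
dtype = getattr(torch, os.environ.get("PYRO_DTYPE", "float64"))
torch.set_default_dtype(dtype)
print(torch.randn(2).dtype)  # torch.float64 unless PYRO_DTYPE overrides it
```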