
Environment: Pyro CUDA GPU Acceleration

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, GPU_Computing
Last Updated: 2026-02-09 09:00 GMT

Overview

Optional CUDA GPU environment for accelerating Pyro inference, particularly MCMC sampling and large-scale SVI with neural network guides.

Description

This environment extends the core Python/PyTorch environment with NVIDIA CUDA support. While Pyro runs on CPU by default, GPU acceleration significantly improves performance for MCMC (HMC/NUTS) inference and neural network-based guides (VAEs, amortized inference). The Docker configuration supports both CPU and CUDA builds via configurable arguments.

Usage

Use this environment when running MCMC inference on large models, VAE training with neural network encoders/decoders, or any workflow where PyTorch GPU acceleration provides a speedup. Tests can be directed to run on GPU via the `PYRO_DEVICE` environment variable.

System Requirements

Category | Requirement | Notes
OS | Linux (Ubuntu 24.04 in Docker) | Docker image uses `ubuntu:24.04` as base
Hardware | NVIDIA GPU | Any CUDA-capable GPU
Driver | NVIDIA driver | Must be compatible with the installed CUDA toolkit
Software | CUDA toolkit | Via PyTorch CUDA wheel (e.g., cu118, cu121)

Dependencies

System Packages

  • NVIDIA GPU driver (host system)
  • CUDA toolkit (bundled with PyTorch CUDA wheels)
  • `magma-cuda` (optional, for Docker source builds)

Python Packages

  • `torch` >= 2.0 (CUDA variant)
  • `torchvision` >= 0.15.0 (CUDA variant, optional)
  • `torchaudio` (CUDA variant, optional)

Credentials

No credentials required for GPU usage.

Quick Install

# Install PyTorch with CUDA 11.8 support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Install Pyro
pip install pyro-ppl

# Verify GPU is available
python -c "import torch; print(torch.cuda.is_available())"

Code Evidence

Test configuration environment variable for device selection from `tests/conftest.py:12-14`:

DTYPE = getattr(torch, os.environ.get("PYRO_DTYPE", "float64"))
torch.set_default_dtype(DTYPE)
torch.set_default_device(os.environ.get("PYRO_DEVICE", "cpu"))

Docker CUDA build support from `docker/Dockerfile:5-12`:

ARG base_img=ubuntu:24.04
FROM ${base_img}

# Optional args
ARG python_version=3
ARG pyro_branch=release
ARG pytorch_whl=cpu
ARG pytorch_branch=release

PyTorch CUDA installation in the Docker install script, from `docker/install.sh:16`:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/${pytorch_whl}

CUDA tensor cloning for multiprocessing from `pyro/infer/mcmc/api.py:560`:

# XXX we clone CUDA tensor args to resolve the issue "Invalid device pointer"
args = [arg.detach() if torch.is_tensor(arg) else arg for arg in args]

Common Errors

Error Message | Cause | Solution
`RuntimeError: CUDA error: no kernel image is available` | GPU compute capability mismatch | Install a PyTorch build that targets your GPU architecture
`RuntimeError: CUDA out of memory` | Insufficient GPU VRAM | Reduce model or batch size, or run that step on CPU
`Invalid device pointer` during MCMC multiprocessing | CUDA tensor pointers invalidated across processes | Pyro detaches tensors internally; make sure you are on a recent Pyro version

Compatibility Notes

  • CPU default: Pyro defaults to CPU (`PYRO_DEVICE=cpu`); set `PYRO_DEVICE=cuda` for GPU testing
  • MCMC multiprocessing: Multi-chain MCMC on GPU requires tensor detaching/cloning to avoid pointer invalidation across processes
  • Docker: The official Docker image supports both CPU and CUDA via the `pytorch_whl` build argument
  • PYRO_DTYPE: Default test dtype is `float64`; set `PYRO_DTYPE=float32` for faster GPU computation at reduced precision
