
Environment:AnswerDotAI RAGatouille GPU CUDA Runtime

From Leeroopedia
Knowledge Sources
Domains: Infrastructure, GPU_Computing, Information_Retrieval
Last Updated: 2026-02-12 12:00 GMT

Overview

Optional NVIDIA GPU environment with CUDA support for accelerated indexing, search, and training operations in RAGatouille.

Description

This environment extends the base Python dependencies with GPU acceleration via CUDA. RAGatouille is designed to work on both CPU and GPU, with automatic GPU detection via torch.cuda.is_available() and torch.cuda.device_count(). When a GPU is available, operations like document encoding, index building (KMeans clustering), ColBERT scoring, and model training are dispatched to the GPU for significant speedups. The GPU environment also enables the use of faiss-gpu instead of faiss-cpu for faster FAISS-based indexing on large collections (>75k documents).
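
The detect-then-dispatch pattern described above can be sketched as a small helper. This is illustrative only, not part of RAGatouille's API; it mirrors the library's reliance on `torch.cuda.is_available()` and `torch.cuda.device_count()`, and falls back to CPU when torch is missing entirely:

```python
def pick_device() -> str:
    """Return "cuda" when a usable NVIDIA GPU is visible, else "cpu".

    Mirrors the auto-detection RAGatouille performs via torch.cuda;
    degrades gracefully when torch itself is not installed.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available() and torch.cuda.device_count() > 0:
        return "cuda"
    return "cpu"
```

With a CUDA-enabled PyTorch build and a visible GPU this returns `"cuda"`; in every other configuration it returns `"cpu"`, matching the library's CPU fallback behavior.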

Usage

Use this environment when working with large document collections (>10k documents), when training or fine-tuning ColBERT models, or when low-latency search is required. CPU-only mode works but is substantially slower for indexing and training. The GPU is optional for small-scale search and reranking.

System Requirements

  • OS — Linux (Ubuntu 18.04+). Windows is not supported; WSL2 may work.
  • Hardware — NVIDIA GPU with CUDA support. Any modern NVIDIA GPU works; more VRAM allows larger batch sizes.
  • Driver — NVIDIA driver compatible with the CUDA toolkit. Check with `nvidia-smi`.
  • CUDA — Version compatible with PyTorch >= 1.13. PyTorch handles CUDA version matching.

Dependencies

System Packages

  • NVIDIA GPU driver (compatible with chosen CUDA version)
  • CUDA toolkit (via PyTorch's bundled CUDA)

Python Packages

  • `torch` >= 1.13 (with CUDA support)
  • `faiss-gpu` — optional, for GPU-accelerated FAISS indexing on large collections

Credentials

No additional credentials required beyond the base Python environment.

Quick Install

# Install PyTorch with CUDA support (example for CUDA 11.8)
pip install torch --index-url https://download.pytorch.org/whl/cu118

# Optional: Replace faiss-cpu with faiss-gpu for large collection indexing
pip uninstall -y faiss-cpu && pip install faiss-gpu

# Install RAGatouille
pip install RAGatouille

Code Evidence

Automatic GPU detection and count from `ragatouille/models/colbert.py:39-40`:

if n_gpu == -1:
    n_gpu = 1 if torch.cuda.device_count() == 0 else torch.cuda.device_count()
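
The fallback above can be restated as a pure function (a paraphrase of those two lines for clarity, not library code):

```python
def resolve_n_gpu(requested: int, detected: int) -> int:
    """Resolve the effective GPU count.

    requested == -1 means auto-detect; when zero GPUs are detected,
    the count defaults to 1 so downstream code still sees one device
    (which then runs on CPU).
    """
    if requested == -1:
        return 1 if detected == 0 else detected
    return requested
```

So `n_gpu=-1` on a 4-GPU machine resolves to 4, while the same setting on a CPU-only machine resolves to 1.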

GPU dispatch for ColBERT scoring from `ragatouille/models/colbert.py:457-458`:

if ColBERTConfig().total_visible_gpus > 0:
    Q, D_padded, D_mask = Q.cuda(), D_padded.cuda(), D_mask.cuda()

FAISS GPU check and warning from `ragatouille/models/index.py:223-236`:

if torch.cuda.is_available():
    import faiss

    if not hasattr(faiss, "StandardGpuResources"):
        print(
            "WARNING! You have a GPU available, but only `faiss-cpu` is currently installed.\n",
            "This means that indexing will be slow. To make use of your GPU.\n"
            "Please install `faiss-gpu` by running:\n"
            "pip uninstall --y faiss-cpu & pip install faiss-gpu\n",
        )
        print("Will continue with CPU indexing in 5 seconds...")
        time.sleep(5)
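
The `hasattr(faiss, "StandardGpuResources")` probe is a duck-typing capability check: faiss-gpu wheels expose that class while faiss-cpu wheels do not. The pattern can be demonstrated with stand-in module objects (the stand-ins below are illustrative, not real faiss imports):

```python
from types import SimpleNamespace

def faiss_has_gpu(faiss_module) -> bool:
    """True when the installed faiss build exposes GPU resources.

    The StandardGpuResources attribute exists only in faiss-gpu.
    """
    return hasattr(faiss_module, "StandardGpuResources")

# Stand-ins mimicking the two wheel variants:
faiss_cpu_like = SimpleNamespace(IndexFlatL2=object)
faiss_gpu_like = SimpleNamespace(IndexFlatL2=object, StandardGpuResources=object)
```

Checking for the attribute rather than the package name means the detection keeps working regardless of how the wheel was installed or renamed.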

KMeans GPU/CPU device selection from `ragatouille/models/torch_kmeans.py:35-36`:

device = torch.device("cuda" if use_gpu else "cpu")
sample = sample.to(device)

GPU half-precision optimization for centroids from `ragatouille/models/torch_kmeans.py:16-19`:

if self.use_gpu:
    centroids = centroids.half()
else:
    centroids = centroids.float()
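
The memory saving is easy to quantify: float16 uses 2 bytes per element versus 4 for float32, so centroid storage halves on GPU. A quick back-of-the-envelope helper (illustrative, not library code):

```python
def centroid_bytes(n_centroids: int, dim: int, use_gpu: bool) -> int:
    """Bytes needed for the centroid matrix.

    float16 on GPU (2 bytes/element), float32 on CPU (4 bytes/element),
    matching the precision choice shown above.
    """
    bytes_per_elem = 2 if use_gpu else 4
    return n_centroids * dim * bytes_per_elem
```

For example, 65,536 centroids of dimension 128 take 16 MiB in float16 on GPU versus 32 MiB in float32 on CPU.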

Common Errors

  • `WARNING! You have a GPU available, but only faiss-cpu is currently installed.` — Cause: faiss-gpu not installed while a GPU is available. Solution: `pip uninstall -y faiss-cpu && pip install faiss-gpu`.
  • `torch.cuda.is_available()` returns False — Cause: CUDA drivers not properly installed. Solution: install NVIDIA drivers and a CUDA-enabled PyTorch build.
  • CUDA out of memory during indexing — Cause: insufficient GPU VRAM for the batch size. Solution: reduce the `bsize` parameter or set `use_faiss=False` to use the PyTorch KMeans path.
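
For the out-of-memory case, a common mitigation beyond lowering `bsize` once is to retry with a halved batch size until the work fits. A generic sketch of that pattern (not RAGatouille code; in a real CUDA setup the caught exception would be `torch.cuda.OutOfMemoryError` rather than `MemoryError`):

```python
def run_with_backoff(encode_fn, batch_size: int, min_batch: int = 1):
    """Call encode_fn(batch_size), halving the batch on out-of-memory.

    encode_fn is any callable that raises MemoryError when the batch
    is too large; the loop stops once a batch size fits or min_batch
    is exhausted.
    """
    while batch_size >= min_batch:
        try:
            return encode_fn(batch_size)
        except MemoryError:
            batch_size //= 2
    raise MemoryError("even the minimum batch size does not fit")
```

With an encoder that only tolerates batches of 8 or fewer, a starting batch of 64 backs off through 32 and 16 before succeeding at 8.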

Compatibility Notes

  • CPU-only mode: RAGatouille works fully on CPU. When no GPU is detected, `n_gpu` defaults to 1 (CPU mode) and all tensor operations stay on CPU.
  • Multi-GPU: Setting `n_gpu=-1` (default) auto-detects all available GPUs via `torch.cuda.device_count()`.
  • faiss-gpu vs faiss-cpu: The code detects at runtime whether `faiss.StandardGpuResources` exists. If only faiss-cpu is installed with a GPU present, it warns and continues with CPU FAISS after a 5-second delay.
  • Half-precision centroids: On GPU, KMeans centroids are stored in float16 for memory efficiency. On CPU, they remain in float32.
