Environment:Pola_rs_Polars_GPU_Execution_Environment

Knowledge Sources	Polars GPU Support, RAPIDS cuDF
Domains	Infrastructure, GPU_Acceleration
Last Updated	2026-02-09 10:00 GMT

Overview

NVIDIA GPU environment with CUDA 12 and RAPIDS cuDF for GPU-accelerated query execution in the Polars Lazy API.

Description

This environment provides GPU-accelerated execution for Polars lazy queries using NVIDIA RAPIDS cuDF as the physical execution backend. When invoked via `collect(engine="gpu")`, the optimized query plan is dispatched to the GPU. The GPU engine supports many core expressions and data types but not the full Polars expression API. Unsupported queries transparently fall back to the CPU engine unless `raise_on_fail=True` is set. Results are always returned as standard CPU-backed Polars DataFrames.

Usage

Use this environment when your workflow is dominated by grouped aggregations and joins on datasets that fit in GPU memory. I/O-bound queries typically perform about the same on GPU and CPU, so they gain little from this environment. As a rough guide, raw datasets of 50-100 GiB work well on a GPU with 80 GiB of memory, depending on the workflow.
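The sizing guidance above can be turned into a rough pre-check. The sketch below is an extrapolation, not a documented rule: it assumes the 50-100 GiB per 80 GiB figure scales linearly with GPU memory, and both the function name `fits_rule_of_thumb` and its three labels are hypothetical.

```python
# Ratios derived from the guidance above: 50-100 GiB of raw data per 80 GiB of GPU memory.
LOW_RATIO = 50 / 80    # at or below this, the dataset is on the comfortable side
HIGH_RATIO = 100 / 80  # above this, the dataset exceeds the quoted range


def fits_rule_of_thumb(raw_dataset_gib, gpu_memory_gib):
    """Classify a dataset size against GPU memory.

    Assumption: the quoted range scales linearly with GPU memory. Actual
    behaviour depends on the workflow (joins and group-bys need working memory).
    """
    ratio = raw_dataset_gib / gpu_memory_gib
    if ratio <= LOW_RATIO:
        return "likely"
    if ratio <= HIGH_RATIO:
        return "workload-dependent"
    return "unlikely"


print(fits_rule_of_thumb(40, 80))
print(fits_rule_of_thumb(200, 80))
```

Treat the result as a prompt to measure, not as a guarantee.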

System Requirements

Category	Requirement	Notes
GPU	NVIDIA Volta or higher	Compute capability >= 7.0
GPU VRAM	8GB minimum	80GB recommended for large datasets (A100/H100)
CUDA	CUDA 12	CUDA 11 support ending with RAPIDS v25.06
OS	Linux or WSL2	Native Windows not supported
Python	>= 3.10	Same as core Polars requirement
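The table above can be expressed as a simple preflight check. This is a sketch under stated assumptions: `check_requirements` is a hypothetical helper, the GPU's compute capability and memory are passed in by the caller (for example, read from the driver tooling), and WSL2 is assumed to report itself as `Linux` through `platform.system()`.

```python
import platform
import sys

MIN_COMPUTE_CAPABILITY = (7, 0)  # Volta or newer
MIN_VRAM_GB = 8
MIN_PYTHON = (3, 10)


def check_requirements(compute_capability, vram_gb, python_version=None, system=None):
    """Return a list of problems; an empty list means every check passed."""
    python_version = tuple(python_version or sys.version_info[:2])
    system = system or platform.system()
    problems = []
    if tuple(compute_capability) < MIN_COMPUTE_CAPABILITY:
        problems.append(f"compute capability {compute_capability} is below 7.0")
    if vram_gb < MIN_VRAM_GB:
        problems.append(f"{vram_gb} GB of GPU memory is below the 8 GB minimum")
    if python_version < MIN_PYTHON:
        problems.append(f"Python {python_version} is older than 3.10")
    if system != "Linux":
        # Assumption: WSL2 reports "Linux"; native Windows reports "Windows".
        problems.append(f"{system} is not supported; use Linux or WSL2")
    return problems


# Example: an 80 GB GPU with compute capability 8.0, checked against this interpreter.
for problem in check_requirements((8, 0), 80):
    print(problem)
```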

Dependencies

System Packages

NVIDIA GPU driver (compatible with CUDA 12)
CUDA Toolkit 12.x

Python Packages

`polars` >= 1.38.1
`cudf-polars-cu12` (installed via `pip install polars[gpu]`)

CUDA 11 (Deprecated)

`cudf-polars-cu11` == 25.06 (pinned: 25.06 is the last release with CUDA 11 support, which is dropped in RAPIDS 25.08)

Credentials

No additional credentials required beyond the base Environment:Pola_rs_Polars_Python_Runtime_Environment.

Quick Install

# Standard GPU install (CUDA 12)
pip install polars[gpu]

# For CUDA 11 systems (deprecated, pinned to v25.06)
pip install polars cudf-polars-cu11==25.06
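After installing, one way to confirm that the GPU backend is importable is to look up its module spec. This is a sketch: `gpu_backend_installed` is a hypothetical helper, and the module name `cudf_polars` is taken from the import error listed under Common Errors. The check only shows that the package is present, not that a usable GPU and driver exist.

```python
import importlib.util


def gpu_backend_installed(find_spec=importlib.util.find_spec):
    """Return True if the cudf_polars module can be located by the import system.

    The find_spec parameter exists only so the check can be exercised with a
    stand-in lookup function.
    """
    return find_spec("cudf_polars") is not None


if gpu_backend_installed():
    print("cudf_polars found")
else:
    print("cudf_polars missing; try: pip install polars[gpu]")
```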

Code Evidence

GPU optional dependency from `py-polars/pyproject.toml:87`:

gpu = ["cudf-polars-cu12"]

GPU usage from `docs/source/user-guide/gpu-support.md:42-50`:

result = q.collect(engine="gpu")

# With detailed control
result = q.collect(engine=pl.GPUEngine(device=1))

# Disable fallback to raise on unsupported
q.collect(engine=pl.GPUEngine(raise_on_fail=True))

System requirements from `docs/source/user-guide/gpu-support.md:9-13`:

- NVIDIA Volta or higher GPU with compute capability 7.0+
- CUDA 12 (CUDA 11 support ends with RAPIDS v25.06)
- Linux or Windows Subsystem for Linux 2 (WSL2)

Common Errors

Error Message	Cause	Solution
`PerformanceWarning: Query execution with GPU not supported`	Unsupported operation in query	Check verbose output; operation falls back to CPU
`ComputeError: 'cuda' conversion failed: NotImplementedError`	Operation not implemented on GPU with `raise_on_fail=True`	Remove `raise_on_fail=True` for CPU fallback, or restructure query
`CUDA out of memory`	Dataset too large for GPU VRAM	Reduce dataset size or use CPU engine
`ImportError: cudf_polars not found`	GPU package not installed	`pip install polars[gpu]`
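The first row of the table suggests a way to notice silent fallback: record warnings during collection and look for a `PerformanceWarning`. The helper below is a sketch with a hypothetical name, `collect_and_flag_fallback`. It assumes the warning is actually emitted for the query (verbose mode may need to be enabled), and it matches on the warning class name so that the sketch does not depend on Polars being importable.

```python
import warnings


def collect_and_flag_fallback(lazy_query):
    """Collect with engine="gpu" and report whether a PerformanceWarning was seen.

    Returns (result, fell_back). fell_back is only meaningful if the library
    emits the warning on fallback, which is an assumption here.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is filtered out
        result = lazy_query.collect(engine="gpu")
    fell_back = any(w.category.__name__ == "PerformanceWarning" for w in caught)
    return result, fell_back
```

A query flagged this way can then be restructured, or rerun with `raise_on_fail=True` to see the underlying error.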

Compatibility Notes

Supported: LazyFrame API, SQL API, I/O from CSV/Parquet/NDJSON, numeric/logical/string/datetime types, aggregations, joins, filters, concatenation
Not Supported: Eager DataFrame API, Streaming API, Date/Categorical/Enum/Time/Array/Binary/Object types, time series resampling, folds, UDFs, Excel/Database I/O
Fallback: By default, unsupported queries transparently fall back to the CPU engine; set `raise_on_fail=True` on `GPUEngine` to raise an error instead, which makes unsupported queries visible
Testing: GPU engine passes 99.2% of Polars unit tests with fallback enabled, 88.8% without fallback
Multiprocessing: Avoid using Python multiprocessing together with the GPU engine, because the worker processes and the GPU engine compete for system resources
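The unsupported data types listed above can be screened for before a query is sent to the GPU. This is a sketch, not a Polars API: `unsupported_columns` is a hypothetical helper, the type names are copied from the list above, and it assumes that the string form of a dtype starts with the type name (for example `Enum(...)`), which should be verified against the installed Polars version.

```python
# Type names copied from the "Not Supported" list above.
UNSUPPORTED_DTYPES = {"Date", "Categorical", "Enum", "Time", "Array", "Binary", "Object"}


def unsupported_columns(schema):
    """Map column names to the unsupported dtype they use.

    schema maps column name -> dtype or dtype string. Assumption: str(dtype)
    begins with the type name, optionally followed by "(...)" parameters.
    """
    flagged = {}
    for name, dtype in schema.items():
        base = str(dtype).split("(", 1)[0]  # "Enum(categories=[...])" -> "Enum"
        if base in UNSUPPORTED_DTYPES:
            flagged[name] = base
    return flagged


example_schema = {"id": "Int64", "day": "Date", "ts": "Datetime(time_unit='us')"}
print(unsupported_columns(example_schema))
```

Columns flagged this way could be cast to a supported type beforehand, or the query can be left to fall back to the CPU.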

Related Pages

Implementation:Pola_rs_Polars_LazyFrame_Collect
