Environment: Kornia CUDA GPU Environment
| Knowledge Sources | Details |
|---|---|
| Domains | Infrastructure, GPU_Computing |
| Last Updated | 2026-02-09 15:00 GMT |
Overview
CUDA GPU environment for accelerated Kornia operations including FlashAttention, mixed precision, and device-optimized compute paths.
Description
While Kornia works fully on CPU, many operations have GPU-optimized code paths that provide significant performance improvements. This environment covers NVIDIA CUDA GPU support including FlashAttention for feature matching (LightGlue, LoFTR), mixed-precision (AMP) support for deep models (DeDoDe, LightGlue), and device-specific branching that uses conv2d on GPU vs einsum on CPU for color transformations. The project supports CUDA 12.1 and 12.4 via PyTorch's custom indices.
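The conv2d-on-GPU vs. einsum-on-CPU split described above can be sketched as a device-conditional dispatch. This is a minimal illustration of the pattern, not Kornia's actual implementation; the function name `apply_color_matrix` is hypothetical:

```python
import torch
import torch.nn.functional as F

def apply_color_matrix(image: torch.Tensor, matrix: torch.Tensor) -> torch.Tensor:
    """Apply a 3x3 color transform to a BCHW image via a device-friendly path."""
    if image.device.type == "cuda":
        # On GPU, a 1x1 convolution maps onto fast cuDNN kernels.
        weight = matrix.view(3, 3, 1, 1)
        return F.conv2d(image, weight)
    # On CPU, einsum avoids convolution overhead for this tiny kernel.
    return torch.einsum("oc,bchw->bohw", matrix, image)
```

Both branches compute the same per-pixel matrix multiply; only the backend-optimal formulation differs.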
Usage
Use this environment when running compute-intensive operations such as deep feature matching (LoFTR, LightGlue), deep edge detection (DexiNed), image stitching on large images, or any workflow where GPU acceleration is desired. The GPU environment enables FlashAttention, mixed-precision training, and faster convolution-based compute paths.
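A typical entry point for such workflows is selecting the device once and moving data (and models) onto it; a minimal sketch, runnable on CPU-only machines as well:

```python
import torch

# Prefer CUDA when available; everything below also runs on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A dummy grayscale image batch on the chosen device. A real workflow
# would also move the model, e.g. matcher = kornia.feature.LoFTR().to(device)
image = torch.rand(1, 1, 480, 640, device=device)
```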
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux (recommended), Windows | macOS uses MPS instead of CUDA |
| Hardware | NVIDIA GPU with CUDA support | Minimum compute capability varies by PyTorch version |
| CUDA | 12.1 or 12.4 | Configured via PyTorch custom index URLs |
| Disk | ~2GB additional | For CUDA-enabled PyTorch |
Dependencies
System Packages
- NVIDIA GPU drivers (compatible with CUDA version)
- CUDA runtime 12.1 or 12.4 (bundled with PyTorch CUDA builds; no separate toolkit install required)
Python Packages
- `torch` >= 2.0.0 (CUDA build, e.g., `torch+cu121` or `torch+cu124`)
- All core Kornia dependencies (see PyTorch_Python_Environment)
- `xformers` (optional, for DeDoDe memory-efficient attention)
- `flash-attn` (optional, for LightGlue FlashAttention)
Credentials
No credentials required for GPU usage.
Quick Install
```bash
# Install PyTorch with CUDA 12.1 ...
pip install torch --index-url https://download.pytorch.org/whl/cu121

# ... or with CUDA 12.4
pip install torch --index-url https://download.pytorch.org/whl/cu124

# Install Kornia
pip install kornia

# Optional: xformers for DeDoDe
pip install xformers

# Via pixi (project's recommended approach)
pixi run -e cuda install
```
Code Evidence
CUDA device detection from `kornia/core/utils.py:45-58`:
```python
def get_cuda_device_if_available(index: int = 0) -> torch.device:
    """Try to get cuda device, if fail, return cpu."""
    if torch.cuda.is_available():
        return torch.device(f"cuda:{index}")
    return torch.device("cpu")
```
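The helper degrades gracefully on CPU-only machines; a quick check (redefining the function locally for illustration):

```python
import torch

def get_cuda_device_if_available(index: int = 0) -> torch.device:
    """Local copy of the kornia helper quoted above, for illustration."""
    if torch.cuda.is_available():
        return torch.device(f"cuda:{index}")
    return torch.device("cpu")

# Allocation works regardless of which backend was selected.
device = get_cuda_device_if_available()
x = torch.zeros(2, 2, device=device)
```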
FlashAttention availability check from `kornia/feature/lightglue.py:140-142`:
```python
warnings.warn(
    "FlashAttention is not available. For optimal speed, "
    "consider installing torch >= 2.0 or flash-attn.",
)
```
CUDA-specific FlashAttention activation from `kornia/feature/lightglue.py:149-154`:
```python
torch.backends.cuda.enable_flash_sdp(allow_flash)
# ...
if self.enable_flash and q.device.type == "cuda":
```
Mixed-precision autocast from `kornia/feature/lightglue.py:576`:
```python
with torch.autocast(enabled=self.conf.mp, device_type="cuda"):
```
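The autocast pattern generalizes to any model; a minimal sketch that enables mixed precision only when CUDA is present (so it also runs, in full precision, on CPU):

```python
import torch

device_type = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device_type == "cuda"  # mixed precision pays off mainly on GPU

model = torch.nn.Linear(16, 4)
x = torch.rand(8, 16)

# Inside the context, eligible ops (e.g. matmul) run in float16 on CUDA;
# with enabled=False the context is a no-op and math stays in float32.
with torch.autocast(device_type=device_type, enabled=use_amp):
    y = model(x)
```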
xformers optional dependency for DeDoDe from `kornia/feature/dedode/transformer/layers/attention.py:35-40`:
```python
try:
    from xformers.ops import fmha, memory_efficient_attention, unbind

    XFORMERS_AVAILABLE = True
except ImportError:
    XFORMERS_AVAILABLE = False
```
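The flag gates which attention kernel gets used downstream. A sketch of that optional-dependency pattern; the fallback to torch's built-in `scaled_dot_product_attention` is our illustration, not DeDoDe's exact code, and the two backends expect slightly different tensor layouts in general:

```python
import torch
import torch.nn.functional as F

try:
    from xformers.ops import memory_efficient_attention
    XFORMERS_AVAILABLE = True
except ImportError:
    XFORMERS_AVAILABLE = False

def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Use xformers when present, otherwise torch >= 2.0's built-in SDPA."""
    if XFORMERS_AVAILABLE:
        return memory_efficient_attention(q, k, v)
    return F.scaled_dot_product_attention(q, k, v)
```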
CUDA index URLs from `pyproject.toml:309-317`:
```toml
[[tool.uv.index]]
name = "pytorch-cu121"
url = "https://download.pytorch.org/whl/cu121"
explicit = true

[[tool.uv.index]]
name = "pytorch-cu124"
url = "https://download.pytorch.org/whl/cu124"
explicit = true
```
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `CUDA out of memory` | Insufficient GPU VRAM | Reduce batch size or image resolution; use gradient checkpointing |
| `FlashAttention is not available` | torch < 2.0 or flash-attn not installed | Upgrade to torch >= 2.0 or install flash-attn package |
| `xFormers not available` | xformers not installed | `pip install xformers` (only needed for DeDoDe) |
| `RuntimeError: CUDA error` | CUDA driver/toolkit mismatch | Ensure GPU drivers match the CUDA version used by PyTorch |
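For the `CUDA out of memory` case, a common mitigation is to halve the batch and retry. A minimal retry sketch (our own utility, not part of Kornia; `run_with_oom_fallback` is a hypothetical name):

```python
import torch

def run_with_oom_fallback(fn, batch: torch.Tensor, min_batch: int = 1) -> torch.Tensor:
    """Call fn on the batch, halving the chunk size on CUDA OOM until it fits."""
    bs = batch.shape[0]
    while True:
        try:
            chunks = [fn(chunk) for chunk in batch.split(bs)]
            return torch.cat(chunks)
        except torch.cuda.OutOfMemoryError:
            if bs <= min_batch:
                raise  # cannot shrink further; surface the error
            bs //= 2
            torch.cuda.empty_cache()  # release cached blocks before retrying
```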
Compatibility Notes
- MPS (Apple Silicon): Kornia has MPS-specific workarounds (e.g., `is_mps_tensor_safe()` in `kornia/core/utils.py:40-42`). Some ops like `torch.std_mean` behave differently on MPS and have explicit fallbacks.
- XLA/TPU: Experimental support via `torch_xla`. Detected by `xla_is_available()` in `kornia/core/utils.py:33-37`.
- FP16 ONNX Inference: Requires CUDA device. `KORNIA_CHECK(device.type == "cuda", "FP16 requires CUDA.")` in `kornia/feature/lightglue_onnx/lightglue.py:83`.
- CPU Fallback: All Kornia operations work on CPU. GPU code paths are conditional and gracefully fall back to CPU.