Environment:NVIDIA NeMo Curator Video Codec Stack

Knowledge Sources	NVIDIA NeMo Curator, PyNvVideoCodec
Domains	Infrastructure, Video_Processing, GPU_Computing
Last Updated	2026-02-14 16:45 GMT

Overview

NVIDIA Video Codec SDK stack (PyNvVideoCodec, CV-CUDA, PyCUDA) for GPU-accelerated video decoding and frame processing, plus flash-attn for efficient attention in the vision-language models used for captioning.

Description

This environment provides GPU-accelerated video decoding and frame processing capabilities. PyNvVideoCodec uses NVIDIA's hardware NVDEC/NVENC units for video decode/encode, CV-CUDA provides GPU-accelerated computer vision operations (color conversion, resizing), and PyCUDA provides low-level CUDA stream management. Flash Attention is included for efficient attention computation in vision-language models used for video captioning.

Usage

Required for the Video Curation Pipeline when GPU acceleration is desired. The `VideoBatchDecoder` class uses these libraries for hardware-accelerated video frame extraction. Without this stack, video processing falls back to CPU-based decoding via `av` (PyAV) and OpenCV.
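
The fallback rule above can be sketched as a small availability check. This is a minimal illustration, not NeMo Curator's own selection logic: `missing_modules` and `choose_decode_backend` are hypothetical names introduced here, and only the standard-library `importlib.util.find_spec` is used, so nothing is imported or initialized on the GPU.

```python
from importlib.util import find_spec

# Top-level modules of the GPU stack, as imported in nvcodec_utils.py.
GPU_MODULES = ("PyNvVideoCodec", "cvcuda", "nvcv", "pycuda")


def missing_modules(names=GPU_MODULES, finder=find_spec):
    """Return the module names that cannot be located on the Python path."""
    missing = []
    for name in names:
        try:
            found = finder(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or half-initialized modules.
            found = False
        if not found:
            missing.append(name)
    return missing


def choose_decode_backend(names=GPU_MODULES, finder=find_spec):
    """Hypothetical helper: 'gpu' if every module is found, otherwise 'cpu'."""
    return "cpu" if missing_modules(names, finder) else "gpu"


if __name__ == "__main__":
    print("decode backend:", choose_decode_backend())
    print("missing modules:", missing_modules())
```

Finding a module is a weaker guarantee than importing it: a package can be installed and still raise `RuntimeError` at import time, for example on a driver mismatch, so a result of `gpu` here should be read as "worth trying", not "guaranteed to work".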

System Requirements

Category	Requirement	Notes
OS	Linux	All video codec libs are Linux-only
Architecture	x86_64	PyNvVideoCodec and flash-attn are x86_64-only
Hardware	NVIDIA GPU with NVDEC/NVENC	Hardware video decode/encode units
VRAM	10-20GB per worker	TransNetV2 needs ~10GB, Cosmos-Embed1 needs ~20GB
CUDA	CUDA 12.x	Required by cvcuda_cu12 and PyNvVideoCodec
PyTorch	torch <= 2.9.1	Upper-bounded for video_cuda12 compatibility

Dependencies

Python Packages

`PyNvVideoCodec` == 2.0.2 (x86_64 Linux only)
`cvcuda_cu12` (CV-CUDA for CUDA 12)
`pycuda`
`flash-attn` <= 2.8.3 (x86_64 Linux only)
`torch` <= 2.9.1
`torchaudio`
`av` == 13.1.0 (CPU fallback video I/O)
`opencv-python`
`torchvision`
`einops`
`easydict`
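
A quick way to compare an existing environment against the pins listed above is to read installed versions with the standard-library `importlib.metadata`. The sketch below is an assumption-laden illustration: `release_tuple`, `satisfies`, and `check_pins` are hypothetical helpers, and the comparison only looks at the numeric release segment (it ignores pre-release and local tags such as `+cu128`), so it is looser than a full PEP 440 check.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the dependency list above: (distribution, operator, version).
PINS = [
    ("PyNvVideoCodec", "==", "2.0.2"),
    ("flash-attn", "<=", "2.8.3"),
    ("torch", "<=", "2.9.1"),
    ("av", "==", "13.1.0"),
]


def release_tuple(ver):
    """Parse the leading numeric release, e.g. '2.9.1+cu128' -> (2, 9, 1)."""
    parts = []
    for piece in ver.split("+", 1)[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def satisfies(installed, op, pinned):
    """Simplified check supporting only the '==' and '<=' operators."""
    a, b = release_tuple(installed), release_tuple(pinned)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that 2.9 compares equal to 2.9.0
    b += (0,) * (width - len(b))
    return a == b if op == "==" else a <= b


def check_pins(pins=PINS, get_version=version):
    """Return {name: 'ok' | 'missing' | 'mismatch (<installed>)'}."""
    report = {}
    for name, op, pinned in pins:
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = satisfies(installed, op, pinned)
        report[name] = "ok" if ok else f"mismatch ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_pins().items():
        print(f"{name}: {status}")
```

On a machine where the stack is not installed, every entry is expected to report `missing`, which matches the CPU-fallback case.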

Credentials

No credentials required for the video codec stack itself. Vision-language models for captioning may require `HF_TOKEN` for gated model access.
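
If captioning with a gated model is planned, it can help to check for the token before a long job starts. A minimal sketch, assuming the token is supplied through the `HF_TOKEN` environment variable named above; `hf_token_configured` is a hypothetical helper, not part of NeMo Curator.

```python
import os


def hf_token_configured(env=None):
    """True if HF_TOKEN is set to a non-empty value in the given mapping."""
    env = os.environ if env is None else env
    return bool(env.get("HF_TOKEN", "").strip())


if __name__ == "__main__":
    if not hf_token_configured():
        print("HF_TOKEN is not set; gated captioning models may fail to download.")
```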

Quick Install

# Install NeMo Curator with full video curation support
pip install "nemo-curator[video_cuda12]"

Code Evidence

Optional import with graceful degradation from `nemo_curator/utils/nvcodec_utils.py:23-39`:

try:
    import cvcuda
    import nvcv
    import pycuda.driver as cuda
    import PyNvVideoCodec as Nvc

    pixel_format_to_cvcuda_code = {
        Nvc.Pixel_Format.YUV444: cvcuda.ColorConversion.YUV2RGB,
        Nvc.Pixel_Format.NV12: cvcuda.ColorConversion.YUV2RGB_NV12,
    }
except (ImportError, RuntimeError):
    logger.warning("PyNvVideoCodec is not installed, some features will be disabled.")
    Nvc = None
    cvcuda = None
    nvcv = None
    cuda = None
    pixel_format_to_cvcuda_code = {}
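
Because the names above are set to `None` on failure, code that needs the GPU path has to check them before use. The following sketch shows one way to do that; `optional_import` and `require` are hypothetical helpers written for illustration and are not taken from `nvcodec_utils.py`.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is missing or fails to initialize."""
    try:
        return importlib.import_module(name)
    except (ImportError, RuntimeError):
        return None


def require(module, name):
    """Fail fast with a readable message when a GPU-only code path is reached."""
    if module is None:
        raise RuntimeError(f"{name} is unavailable; use the CPU decoding path instead.")
    return module


# Same sentinel convention as the snippet above: None means "not usable".
Nvc = optional_import("PyNvVideoCodec")

if __name__ == "__main__":
    if Nvc is None:
        print("GPU decode disabled; falling back to CPU decoding.")
    else:
        print("PyNvVideoCodec imported successfully.")
```

Checking the sentinel at the point of use turns a confusing `AttributeError` on `None` into an explicit message about which dependency is missing.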

Platform constraints from `pyproject.toml:150-153`:

"flash-attn<=2.8.3; (platform_machine == 'x86_64' and platform_system != 'Darwin')",
"pycuda",
"PyNvVideoCodec==2.0.2; (platform_machine == 'x86_64' and platform_system != 'Darwin')",
"torch<=2.9.1",

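The environment marker on `flash-attn` and `PyNvVideoCodec` can be reproduced with the standard `platform` module, since `platform_machine` and `platform_system` are defined as `platform.machine()` and `platform.system()`. This is a sketch for reasoning about which hosts receive those packages; `codec_wheels_apply` is a hypothetical name.

```python
import platform


def codec_wheels_apply(machine=None, system=None):
    """Mirror of: platform_machine == 'x86_64' and platform_system != 'Darwin'."""
    machine = platform.machine() if machine is None else machine
    system = platform.system() if system is None else system
    return machine == "x86_64" and system != "Darwin"


if __name__ == "__main__":
    print("machine:", platform.machine(), "system:", platform.system())
    print("marker matches this host:", codec_wheels_apply())
```

The marker itself only excludes macOS by name, but 64-bit Windows typically reports its machine as `AMD64` and ARM Linux reports `aarch64`, so in practice the condition holds only on x86_64 Linux hosts.
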
Common Errors

Error Message	Cause	Solution
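`PyNvVideoCodec is not installed, some features will be disabled.`	An import of `cvcuda`, `nvcv`, `pycuda`, or `PyNvVideoCodec` raised `ImportError` or `RuntimeError`; the warning is logged for any of them, not only PyNvVideoCodec	Install the missing package, e.g. `pip install PyNvVideoCodec==2.0.2` (x86_64 Linux only)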
`ImportError: pycuda`	PyCUDA not installed	`pip install pycuda`
`RuntimeError` during cvcuda import	CUDA driver mismatch	Ensure CUDA 12 driver and toolkit are installed
flash-attn build failure	Missing CUDA dev tools for compilation	Install CUDA toolkit dev packages; use `--no-build-isolation`

Compatibility Notes

CPU fallback: When PyNvVideoCodec is unavailable, video processing falls back to CPU-based decoding via PyAV (`av` library) and OpenCV. This is significantly slower but functional.
ARM (aarch64): PyNvVideoCodec and flash-attn are not available on ARM. CPU fallback is used automatically.
macOS: Not supported. All video codec libraries are Linux-only.
Build isolation: `flash-attn` requires `no-build-isolation` (configured in `pyproject.toml`).
