Environment:NVIDIA NeMo Curator Video Codec Stack

From Leeroopedia
Knowledge Sources
Domains Infrastructure, Video_Processing, GPU_Computing
Last Updated 2026-02-14 16:45 GMT

Overview

A GPU video stack built on PyNvVideoCodec (Python bindings for the NVIDIA Video Codec SDK), CV-CUDA, and PyCUDA, plus flash-attn, for GPU-accelerated video decoding and frame processing.

Description

This environment provides GPU-accelerated video decoding and frame processing capabilities. PyNvVideoCodec uses NVIDIA's hardware NVDEC/NVENC units for video decode/encode, CV-CUDA provides GPU-accelerated computer vision operations (color conversion, resizing), and PyCUDA provides low-level CUDA stream management. Flash Attention is included for efficient attention computation in vision-language models used for video captioning.

Usage

Required for the Video Curation Pipeline when GPU acceleration is desired. The `VideoBatchDecoder` class uses these libraries for hardware-accelerated video frame extraction. Without this stack, video processing falls back to CPU-based decoding via `av` (PyAV) and OpenCV.
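
The GPU-versus-CPU choice described above can be sketched as a pre-flight check. This is a minimal sketch, not NeMo Curator's own logic: the helper names `missing_gpu_modules` and `choose_decode_backend` are hypothetical, and the module names are taken from the import block quoted under Code Evidence. Locating a module does not guarantee that importing it succeeds (a driver problem can still raise `RuntimeError`), so treat a "gpu" result as a necessary condition only.

```python
from importlib.util import find_spec

# Top-level module names from the guarded import quoted under Code Evidence.
GPU_MODULES = ("PyNvVideoCodec", "cvcuda", "nvcv", "pycuda")


def missing_gpu_modules(modules=GPU_MODULES):
    """Return the modules that cannot be located by this interpreter."""
    return [name for name in modules if find_spec(name) is None]


def choose_decode_backend(modules=GPU_MODULES):
    """Hypothetical helper: 'gpu' if every module can be located, else 'cpu'."""
    return "cpu" if missing_gpu_modules(modules) else "gpu"


if __name__ == "__main__":
    missing = missing_gpu_modules()
    print("backend:", choose_decode_backend())
    if missing:
        print("missing GPU modules:", ", ".join(missing))
```

Running the script on a worker image before launching a pipeline shows which path video decoding is likely to take and which packages are absent.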

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| OS | Linux | All video codec libs are Linux-only |
| Architecture | x86_64 | PyNvVideoCodec and flash-attn are x86_64-only |
| Hardware | NVIDIA GPU with NVDEC/NVENC | Hardware video decode/encode units |
| VRAM | 10-20 GB per worker | TransNetV2 needs ~10 GB, Cosmos-Embed1 needs ~20 GB |
| CUDA | CUDA 12.x | Required by cvcuda_cu12 and PyNvVideoCodec |
| PyTorch | torch <= 2.9.1 | Upper-bounded for video_cuda12 compatibility |

Dependencies

Python Packages

  • `PyNvVideoCodec` == 2.0.2 (x86_64 Linux only)
  • `cvcuda_cu12` (CV-CUDA for CUDA 12)
  • `pycuda`
  • `flash-attn` <= 2.8.3 (x86_64 Linux only)
  • `torch` <= 2.9.1
  • `torchaudio`
  • `av` == 13.1.0 (CPU fallback video I/O)
  • `opencv-python`
  • `torchvision`
  • `einops`
  • `easydict`
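
To confirm that an existing environment respects the pins listed above, the installed versions can be compared against them. The following is a rough sketch using only the standard library; `release_tuple`, `satisfies`, and `check_pins` are hypothetical helper names, and the comparison deliberately ignores pre-release and local-version semantics (for strict PEP 440 handling, the third-party `packaging` library is the more robust choice).

```python
from importlib import metadata

# Pins copied from the dependency list: (distribution, operator, version).
PINS = [
    ("PyNvVideoCodec", "==", "2.0.2"),
    ("flash-attn", "<=", "2.8.3"),
    ("torch", "<=", "2.9.1"),
    ("av", "==", "13.1.0"),
]


def release_tuple(version):
    """Keep only the leading numeric release, dropping any '+local' suffix."""
    parts = []
    for piece in version.split("+", 1)[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def satisfies(installed, op, pinned):
    """Simplified check supporting only the '==' and '<=' operators used here."""
    a, b = release_tuple(installed), release_tuple(pinned)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a == b if op == "==" else a <= b


def check_pins(pins=PINS):
    """Map each distribution to its installed version and pin status."""
    report = {}
    for dist, op, pinned in pins:
        try:
            installed = metadata.version(dist)
        except metadata.PackageNotFoundError:
            report[dist] = "not installed"
            continue
        ok = satisfies(installed, op, pinned)
        report[dist] = f"{installed} ({'ok' if ok else 'violates ' + op + pinned})"
    return report


if __name__ == "__main__":
    for dist, status in check_pins().items():
        print(f"{dist}: {status}")
```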

Credentials

No credentials required for the video codec stack itself. Vision-language models for captioning may require `HF_TOKEN` for gated model access.

Quick Install

# Install NeMo Curator with full video curation support
pip install "nemo-curator[video_cuda12]"

Code Evidence

Optional import with graceful degradation from `nemo_curator/utils/nvcodec_utils.py:23-39`:

try:
    import cvcuda
    import nvcv
    import pycuda.driver as cuda
    import PyNvVideoCodec as Nvc

    pixel_format_to_cvcuda_code = {
        Nvc.Pixel_Format.YUV444: cvcuda.ColorConversion.YUV2RGB,
        Nvc.Pixel_Format.NV12: cvcuda.ColorConversion.YUV2RGB_NV12,
    }
except (ImportError, RuntimeError):
    logger.warning("PyNvVideoCodec is not installed, some features will be disabled.")
    Nvc = None
    cvcuda = None
    nvcv = None
    cuda = None
    pixel_format_to_cvcuda_code = {}

Platform constraints from `pyproject.toml:150-153`:

"flash-attn<=2.8.3; (platform_machine == 'x86_64' and platform_system != 'Darwin')",
"pycuda",
"PyNvVideoCodec==2.0.2; (platform_machine == 'x86_64' and platform_system != 'Darwin')",
"torch<=2.9.1",

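The environment markers on `flash-attn` and `PyNvVideoCodec` decide whether pip installs those packages at all. As an illustration, the same condition can be evaluated by hand; `gpu_extras_selected` is a hypothetical name, while `platform.machine()` and `platform.system()` are the values that `platform_machine` and `platform_system` resolve to under PEP 508.

```python
import platform


def gpu_extras_selected(machine=None, system=None):
    """Mirror the marker: platform_machine == 'x86_64' and platform_system != 'Darwin'."""
    machine = platform.machine() if machine is None else machine
    system = platform.system() if system is None else system
    return machine == "x86_64" and system != "Darwin"


if __name__ == "__main__":
    print("marker-gated packages would be installed:", gpu_extras_selected())
```

Note that 64-bit Windows typically reports its machine as `AMD64` rather than `x86_64`, so the marker also evaluates to false there even though only `Darwin` is excluded by name.
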
Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `PyNvVideoCodec is not installed, some features will be disabled.` | Importing `cvcuda`, `nvcv`, `pycuda.driver`, or `PyNvVideoCodec` raised `ImportError` or `RuntimeError`; the warning names only PyNvVideoCodec | Install whichever package is missing, e.g. `pip install PyNvVideoCodec==2.0.2` (x86_64 Linux only) |
| `ModuleNotFoundError: No module named 'pycuda'` | PyCUDA not installed | `pip install pycuda` |
| `RuntimeError` during cvcuda import | CUDA driver mismatch | Ensure CUDA 12 driver and toolkit are installed |
| flash-attn build failure | Missing CUDA dev tools for compilation | Install CUDA toolkit dev packages; use `--no-build-isolation` |
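
Because the guarded import reports a single warning for several different failures, it can help to import each module separately and see which exception it raises. The sketch below is a diagnostic aid under that assumption, not part of NeMo Curator; `probe_import` and `probe_stack` are hypothetical names, and the module list is taken from the import block under Code Evidence.

```python
import importlib


def probe_import(module_name):
    """Import one module and classify the failure, if any."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # Covers ModuleNotFoundError as well as broken shared-library loads.
        return "not installed or broken install", str(exc)
    except RuntimeError as exc:
        # The Common Errors table attributes this case to a CUDA driver mismatch.
        return "runtime failure at import (check driver/toolkit)", str(exc)
    return "ok", ""


def probe_stack(modules=("PyNvVideoCodec", "cvcuda", "nvcv", "pycuda.driver")):
    """Probe every module in the GPU stack and return a per-module report."""
    return {name: probe_import(name) for name in modules}


if __name__ == "__main__":
    for name, (status, detail) in probe_stack().items():
        print(f"{name}: {status} {detail}".rstrip())
```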

Compatibility Notes

  • CPU fallback: When PyNvVideoCodec is unavailable, video processing falls back to CPU-based decoding via PyAV (`av` library) and OpenCV. This is significantly slower but functional.
  • ARM (aarch64): PyNvVideoCodec and flash-attn are not available on ARM. CPU fallback is used automatically.
  • macOS: Not supported. All video codec libraries are Linux-only.
  • Build isolation: `flash-attn` requires `no-build-isolation` (configured in `pyproject.toml`).
