Principle:NVIDIA NeMo Curator Video Ingestion
| Knowledge Sources | |
|---|---|
| Domains | Data_Curation, Video_Processing |
| Last Updated | 2026-02-14 17:00 GMT |
Overview
Video ingestion reads raw video files from storage and extracts the technical metadata (resolution, frame rate, duration, codec) that downstream stages of a video curation pipeline depend on.
Description
Video Ingestion is the entry point of any video curation pipeline. It handles discovering video files in a directory, reading their binary content, and extracting metadata (resolution, frame rate, duration, codec information). The process uses FFmpeg probing for metadata extraction and supports both CPU and GPU-accelerated decoding paths.
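As a rough illustration of FFmpeg-based probing, the sketch below shells out to the `ffprobe` CLI (shipped with FFmpeg) and parses its JSON report. It is not NeMo Curator's own implementation: the `VideoMetadata` dataclass and the `parse_ffprobe_json` and `probe_video` functions are hypothetical names introduced here, and the sketch assumes `ffprobe` is available on `PATH`.

```python
import json
import subprocess
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoMetadata:
    """Hypothetical container for the fields named in the description."""
    width: int
    height: int
    fps: float
    duration_s: float
    codec: str
    bit_rate: Optional[int]


def parse_ffprobe_json(probe: dict) -> VideoMetadata:
    """Pull video metadata out of ffprobe's JSON report."""
    # Use the first video stream; audio and subtitle streams are skipped.
    video = next(s for s in probe["streams"] if s.get("codec_type") == "video")
    fmt = probe.get("format", {})

    # ffprobe reports frame rates as a fraction string such as "30000/1001".
    num, _, den = video.get("avg_frame_rate", "0/1").partition("/")
    den_value = float(den or "1")
    fps = float(num) / den_value if den_value else 0.0

    # Duration and bit rate arrive as strings; the stream-level duration
    # can be missing, so fall back to the container-level value.
    duration = video.get("duration") or fmt.get("duration") or 0.0
    bit_rate = fmt.get("bit_rate")

    return VideoMetadata(
        width=int(video["width"]),
        height=int(video["height"]),
        fps=fps,
        duration_s=float(duration),
        codec=video.get("codec_name", "unknown"),
        bit_rate=int(bit_rate) if bit_rate is not None else None,
    )


def probe_video(path: str) -> VideoMetadata:
    """Run ffprobe on one file and parse the result (requires FFmpeg)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    return parse_ffprobe_json(json.loads(result.stdout))
```

Keeping the parsing step separate from the subprocess call means the metadata logic can be exercised on a saved JSON report without FFmpeg installed.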
Usage
Use this principle as the first step in any video curation pipeline. It should be followed by clipping, filtering, or embedding stages that require the video data and metadata.
Theoretical Basis
Video ingestion follows a two-phase approach:
- File Discovery: Find the video files under the input directory and partition them by size (target ~1 GiB per group) so that distributed workers receive balanced workloads
- Metadata Extraction: Probe each file using FFmpeg to extract duration, FPS, resolution, codec, and bitrate information
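The file discovery phase can be sketched as a greedy grouping by file size. This is a minimal illustration, not the algorithm NeMo Curator necessarily uses: the function names, the extension list, and the rule of closing a group once the next file would push it past the target are all assumptions made for this example.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

GIB = 1024 ** 3
# Assumed set of extensions; the real pipeline may accept a different set.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def discover_videos(root: str) -> List[Tuple[str, int]]:
    """Return (path, size_in_bytes) for every video file under root."""
    return sorted(
        (str(p), p.stat().st_size)
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )


def partition_by_size(
    files: Iterable[Tuple[str, int]], target_bytes: int = GIB
) -> List[List[str]]:
    """Greedily pack files into groups of roughly target_bytes each."""
    groups: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for path, size in files:
        # Close the current group if this file would exceed the target.
        if current and current_bytes + size > target_bytes:
            groups.append(current)
            current, current_bytes = [], 0
        current.append(path)
        current_bytes += size
    if current:
        groups.append(current)
    return groups
```

A file larger than the target still gets a group of its own, so nothing is dropped. Each resulting group can then be handed to one worker, which probes its files for metadata.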