Implementation:NVIDIA NeMo Curator VideoReaderStage
| Knowledge Sources | |
|---|---|
| Domains | Data_Curation, Video_Processing |
| Last Updated | 2026-02-14 17:00 GMT |
Overview
Concrete tool, provided by NeMo Curator, for reading video files and extracting their metadata.
Description
The VideoReaderStage reads video files from the local filesystem and extracts comprehensive metadata including dimensions, frame rate, duration, codecs, and other technical properties. It stores results in a VideoTask object containing the video source bytes and metadata.
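As an illustration of the kind of metadata described above, the following minimal sketch builds a metadata record from ffprobe-style JSON (the output shape of `ffprobe -show_streams -show_format -of json`). It is an assumption that a probe of this shape is a reasonable stand-in; the page does not state how VideoReaderStage extracts metadata internally. `SketchVideoMetadata`, `parse_probe_json`, and the sample values are hypothetical names and data, not part of NeMo Curator.

```python
import json
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class SketchVideoMetadata:
    """Hypothetical container; not the NeMo Curator metadata class."""

    width: int
    height: int
    fps: float
    duration_s: float
    video_codec: str


def parse_probe_json(probe_json: str) -> SketchVideoMetadata:
    """Build a metadata record from ffprobe-style JSON text."""
    probe = json.loads(probe_json)
    # Use the first video stream; audio and subtitle streams are skipped.
    stream = next(s for s in probe["streams"] if s.get("codec_type") == "video")
    # ffprobe reports frame rates as rationals such as "30000/1001".
    fps = float(Fraction(stream["avg_frame_rate"]))
    return SketchVideoMetadata(
        width=int(stream["width"]),
        height=int(stream["height"]),
        fps=fps,
        duration_s=float(probe["format"]["duration"]),
        video_codec=stream["codec_name"],
    )


# Made-up probe output used only to exercise the parser.
SAMPLE = json.dumps({
    "streams": [
        {"codec_type": "audio", "codec_name": "aac"},
        {
            "codec_type": "video",
            "codec_name": "h264",
            "width": 1920,
            "height": 1080,
            "avg_frame_rate": "30000/1001",
        },
    ],
    "format": {"duration": "12.5"},
})

meta = parse_probe_json(SAMPLE)
print(meta)
```

Parsing the frame rate as a rational avoids rounding surprises with NTSC-style rates, which are not whole numbers.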
Usage
Import this stage when building a video curation pipeline that needs to read raw video files from storage. Combine with FilePartitioningStage via the VideoReader composite stage for a complete ingestion solution.
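The partition-then-read pattern behind the composite stage can be sketched with the standard library alone. This is a simplified stand-in, not the NeMo Curator implementation: `partition_video_files`, `read_video_group`, the extension list, and the one-file-per-group choice are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Assumed set of extensions; the real stage may accept a different list.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def partition_video_files(root, limit=None):
    """Stand-in for file partitioning: list video files, one file per group."""
    paths = sorted(
        p
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )
    if limit is not None:
        paths = paths[:limit]
    return [[str(p)] for p in paths]


def read_video_group(group):
    """Stand-in for the reader: load the raw bytes of each path in a group."""
    return [
        {"path": path, "source_bytes": Path(path).read_bytes()} for path in group
    ]


def run_demo():
    """Create a few fake files, then partition and read them."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("a.mp4", "b.mkv", "notes.txt"):
            (Path(root) / name).write_bytes(b"fake-" + name.encode())
        groups = partition_video_files(root, limit=100)
        return [video for group in groups for video in read_video_group(group)]


videos = run_demo()
print([(Path(v["path"]).name, len(v["source_bytes"])) for v in videos])
```

Keeping discovery and reading as separate steps means the list of files can be split across workers before any bytes are loaded.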
Code Reference
Source Location
- Repository: NeMo Curator
- File: nemo_curator/stages/video/io/video_reader.py
- Lines: L79-290
Signature
@dataclass
class VideoReaderStage(ProcessingStage[FileGroupTask, VideoTask]):
    input_path: str | None = None
    verbose: bool = False
    name: str = "video_reader"
Import
from nemo_curator.stages.video.io.video_reader import VideoReaderStage
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| task | FileGroupTask | Yes | List of video file paths to read |
Outputs
| Name | Type | Description |
|---|---|---|
| task | VideoTask | Contains video.source_bytes and video.metadata (duration, fps, resolution, codec) |
Usage Examples
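from nemo_curator.pipeline import Pipeline
from nemo_curator.stages.video.io.video_reader import VideoReader

# Composite stage: partitions the input directory, then reads each video.
reader = VideoReader(
    input_video_path="./data/videos",
    video_limit=100,  # read at most 100 videos
    verbose=True,
)

# The pipeline name is illustrative.
pipeline = Pipeline(name="video_ingestion")
pipeline.add_stage(reader)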
Related Pages
Implements Principle
- Principle:NVIDIA_NeMo_Curator_Video_Ingestion
- Environment:NVIDIA_NeMo_Curator_Python_Linux_Base
- Environment:NVIDIA_NeMo_Curator_Video_Codec_Stack
- Environment:NVIDIA_NeMo_Curator_Ray_Cluster
- Heuristic:NVIDIA_NeMo_Curator_GPU_Memory_Resource_Allocation
- Heuristic:NVIDIA_NeMo_Curator_Video_Frame_Sampling_Strategy