Principle:NVIDIA NeMo Curator Video Export
| Knowledge Sources | |
|---|---|
| Domains | Data_Curation, Video_Processing, Data_Engineering |
| Last Updated | 2026-02-14 17:00 GMT |
Overview
Technique for persisting curated video clips, metadata, embeddings, captions, and previews to structured storage for downstream consumption.
Description
Video Export writes every artifact produced by the video curation pipeline to an organized storage layout: MP4 clip files, per-clip metadata JSON, WebP preview images, and Cosmos-Embed1 embeddings as both pickle and Parquet files. This structured output lets downstream consumers, such as training frameworks and deduplication jobs, locate and load artifacts efficiently.
Usage
Use as the final stage in a video curation pipeline after all processing, filtering, captioning, and embedding stages are complete.
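As a rough illustration of this ordering, the sketch below runs a list of stages in sequence with export last, so that export only sees clips that survived filtering and already carry a caption and an embedding. It does not use the NeMo Curator API: every name here (`run_pipeline`, `filter_short`, `caption`, `embed`, `export`, and the clip dictionary fields) is hypothetical and chosen only for the example.

```python
from typing import Callable

# A clip is modelled as a plain dict; real pipelines use richer task objects.
Clip = dict
Stage = Callable[[list], list]


def filter_short(clips: list) -> list:
    """Hypothetical filter stage: drop clips shorter than two seconds."""
    return [c for c in clips if c["duration_s"] >= 2.0]


def caption(clips: list) -> list:
    """Hypothetical captioning stage: attach a placeholder caption."""
    return [{**c, "caption": f"placeholder caption for {c['id']}"} for c in clips]


def embed(clips: list) -> list:
    """Hypothetical embedding stage: attach a dummy embedding vector."""
    return [{**c, "embedding": [0.0, 0.0]} for c in clips]


def export(clips: list) -> list:
    """Hypothetical export stage: refuse clips that are not fully processed."""
    for c in clips:
        missing = {"caption", "embedding"} - c.keys()
        if missing:
            raise ValueError(f"clip {c['id']} is missing {sorted(missing)}")
    # A real export stage would write files here.
    return clips


def run_pipeline(clips: list, stages: list) -> list:
    """Apply each stage in order; the output of one stage feeds the next."""
    for stage in stages:
        clips = stage(clips)
    return clips


clips_in = [{"id": "a", "duration_s": 5.0}, {"id": "b", "duration_s": 1.0}]
clips_out = run_pipeline(clips_in, [filter_short, caption, embed, export])
```

Placing `export` before `caption` or `embed` in the stage list would raise `ValueError` in this sketch, which mirrors why export belongs at the end of the pipeline.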
Theoretical Basis
Video export organizes artifacts into parallel directories, one per artifact type:
- clips/ - Transcoded MP4 video clips
- metas/ - Per-clip metadata JSON files
- previews/ - WebP preview images
- ce1_embd/ - Cosmos-Embed1 embedding pickles
- ce1_embd_parquet/ - Embedding vectors in Parquet format for deduplication
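
The layout above can be sketched with the Python standard library alone. This is a minimal illustration, not NeMo Curator's implementation: the directory names come from the list, while the file extensions, the idea of naming every file after a shared clip ID, and the helpers `artifact_path` and `export_clip` are assumptions made for the example. The `ce1_embd_parquet/` directory is left out because writing Parquet needs a third-party library such as `pyarrow`.

```python
import json
import pickle
from pathlib import Path

# Artifact kind -> (subdirectory, file extension).
# Subdirectory names follow the list above; extensions are assumed.
LAYOUT = {
    "clip": ("clips", ".mp4"),
    "meta": ("metas", ".json"),
    "preview": ("previews", ".webp"),
    "embedding": ("ce1_embd", ".pickle"),
}


def artifact_path(root, kind: str, clip_id: str) -> Path:
    """Return where one artifact of a clip lives under the output root."""
    subdir, ext = LAYOUT[kind]
    return Path(root) / subdir / f"{clip_id}{ext}"


def export_clip(root, clip_id, mp4_bytes, metadata, embedding, webp_bytes=None):
    """Write one clip's artifacts into the parallel directories.

    Returns a dict mapping artifact kind to the path that was written.
    """
    payloads = {
        "clip": mp4_bytes,
        "meta": json.dumps(metadata, indent=2).encode("utf-8"),
        "embedding": pickle.dumps(embedding),
    }
    if webp_bytes is not None:  # previews are treated as optional here
        payloads["preview"] = webp_bytes

    written = {}
    for kind, data in payloads.items():
        path = artifact_path(root, kind, clip_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        written[kind] = path
    return written
```

Because every artifact of a clip shares the same file stem in this sketch, a consumer that knows a clip ID can locate its metadata, preview, or embedding by calling `artifact_path` with a different `kind`, without scanning the directories.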