Implementation:Huggingface Datasets Video

Knowledge Sources	Huggingface Datasets HF Datasets Docs
Domains	Data_Engineering, NLP
Last Updated	2026-02-14 18:00 GMT

Overview

Concrete tool, provided by the HuggingFace Datasets library, for handling video data with decoding and frame extraction support.

Description

Video is a dataclass feature type for video data. It accepts file paths (str or pathlib.Path), dictionaries with "path" and "bytes" keys, or torchcodec.decoders.VideoDecoder objects. Video data is stored in Arrow as a struct with two fields: bytes (binary) and path (string). With decode=True (the default), accessing a video column returns a torchcodec.decoders.VideoDecoder object, which supports frame-level random access through methods such as get_frames_in_range(). The remaining options control dimension ordering, the number of FFmpeg threads, the decoding device, the seek mode, and the stream index.
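To make the storage layout concrete, here is a minimal sketch, using only the standard library, of how path-like and dictionary inputs could be normalized into the two-field bytes/path shape described above. The helper name to_storage_dict is hypothetical and is not part of the datasets API. The library's own encoding logic handles more cases, such as VideoDecoder objects, and may differ in detail.

```python
from pathlib import Path
from typing import Any, Dict, Union


def to_storage_dict(value: Union[str, Path, Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper: map an input to the {"bytes", "path"} layout.

    Illustration only; this is not how the datasets library is implemented.
    """
    # A file path (str or pathlib.Path) is stored by reference, with no bytes.
    if isinstance(value, (str, Path)):
        return {"bytes": None, "path": str(value)}
    # A dictionary must provide at least one of "bytes" or "path".
    if isinstance(value, dict):
        if value.get("bytes") is None and value.get("path") is None:
            raise ValueError("expected a 'bytes' or 'path' entry")
        return {"bytes": value.get("bytes"), "path": value.get("path")}
    raise TypeError(f"unsupported input type: {type(value).__name__}")


from_path = to_storage_dict(Path("video.mov"))
from_bytes = to_storage_dict({"bytes": b"\x00\x01"})
```

In this sketch a path input leaves bytes empty and an in-memory input leaves path empty, so either field of the struct can be null.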

Usage

Use Video as a feature type for any column containing video files. Configure decode parameters to control frame access behavior.

Code Reference

Source Location

Repository: datasets
File: src/datasets/features/video.py
Lines: 29-331

Signature

@dataclass
class Video:
    decode: bool = True
    stream_index: Optional[int] = None
    dimension_order: Literal["NCHW", "NHWC"] = "NCHW"
    num_ffmpeg_threads: int = 1
    device: Optional[Union[str, "torch.device"]] = "cpu"
    seek_mode: Literal["exact", "approximate"] = "exact"
    id: Optional[str] = field(default=None, repr=False)
    # Automatically constructed
    dtype: ClassVar[str] = "torchcodec.decoders.VideoDecoder"
    pa_type: ClassVar[Any] = pa.struct({"bytes": pa.binary(), "path": pa.string()})
    _type: str = field(default="Video", init=False, repr=False)

Import

from datasets import Video

I/O Contract

Inputs

Name	Type	Required	Description
decode	`bool`	No	Whether to decode video on access. Defaults to True.
stream_index	`int`	No	Index of the video stream to decode. None (default) selects the "best" stream.
dimension_order	`str`	No	Frame dimension order: "NCHW" (default) or "NHWC".
num_ffmpeg_threads	`int`	No	Number of FFmpeg decoding threads. Defaults to 1.
device	`str or torch.device`	No	Decoding device. Defaults to "cpu".
seek_mode	`str`	No	Frame seek mode: "exact" (default) or "approximate".
id	`str`	No	Optional feature identifier.

Outputs

Name	Type	Description
instance	`Video`	A Video feature type for use in Features schemas.

Usage Examples

Basic Usage

from datasets import Dataset, Video

ds = Dataset.from_dict(
    {"video": ["path/to/video.mov"]},
).cast_column("video", Video())

# Access returns VideoDecoder objects
video = ds[0]["video"]
# <torchcodec.decoders._video_decoder.VideoDecoder object>

# Get specific frames
frames = video.get_frames_in_range(0, 10)

Related Pages

Implements Principle

Principle:Huggingface_Datasets_Video_Feature_Handling

Requires Environment
