Implementation:Huggingface Datasets Video
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, NLP |
| Last Updated | 2026-02-14 18:00 GMT |
Overview
A concrete feature type, provided by the Hugging Face Datasets library, for handling video data with decoding and frame-extraction support.
Description
Video is a dataclass feature type for video data. It accepts file paths (str or pathlib.Path), dictionaries with "path"/"bytes" keys, or torchcodec.decoders.VideoDecoder objects. Video data is stored in Arrow as a struct with a bytes (binary) field and a path (string) field. When decode=True (the default), accessing a video value returns a torchcodec.decoders.VideoDecoder object that supports frame-level random access through methods such as get_frames_in_range(). When decode=False, access returns the underlying dictionary with "path" and "bytes" keys instead. Configuration options control dimension ordering, the number of FFmpeg threads, the decoding device, the seek mode, and the stream index.
Usage
Use Video as a feature type for any column containing video files. Configure decode parameters to control frame access behavior.
Code Reference
Source Location
- Repository: datasets
- File: src/datasets/features/video.py
- Lines: 29-331
Signature
@dataclass
class Video:
    decode: bool = True
    stream_index: Optional[int] = None
    dimension_order: Literal["NCHW", "NHWC"] = "NCHW"
    num_ffmpeg_threads: int = 1
    device: Optional[Union[str, "torch.device"]] = "cpu"
    seek_mode: Literal["exact", "approximate"] = "exact"
    id: Optional[str] = field(default=None, repr=False)
    # Automatically constructed
    dtype: ClassVar[str] = "torchcodec.decoders.VideoDecoder"
    pa_type: ClassVar[Any] = pa.struct({"bytes": pa.binary(), "path": pa.string()})
    _type: str = field(default="Video", init=False, repr=False)
Import
from datasets import Video
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| decode | bool | No | Whether to decode video on access. Defaults to True. |
| stream_index | int | No | Index of the video stream to use. None defaults to "best". |
| dimension_order | str | No | Frame dimension order: "NCHW" (default) or "NHWC". |
| num_ffmpeg_threads | int | No | Number of FFmpeg decoding threads. Defaults to 1. |
| device | str or torch.device | No | Decoding device. Defaults to "cpu". |
| seek_mode | str | No | Frame seek mode: "exact" (default) or "approximate". |
| id | str | No | Optional feature identifier. |
Outputs
| Name | Type | Description |
|---|---|---|
| instance | Video | A Video feature type for use in Features schemas. |
Usage Examples
Basic Usage
from datasets import Dataset, Video
ds = Dataset.from_dict(
    {"video": ["path/to/video.mov"]},
).cast_column("video", Video())
# Access returns VideoDecoder objects
video = ds[0]["video"]
# <torchcodec.decoders._video_decoder.VideoDecoder object>
# Get frames 0 through 9 (the stop index is exclusive)
frames = video.get_frames_in_range(0, 10)