Implementation:Datajuicer Data juicer VideoTaggingFromFramesMapper
| Knowledge Sources | |
|---|---|
| Domains | Data_Processing, Mapping |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Concrete tool provided by Data-Juicer for generating semantic tags from video frames.
Description
VideoTaggingFromFramesMapper extracts frames from each video using either keyframe or uniform sampling, then processes the frame tensors through a pre-trained Recognize Anything Model (RAM) using a HuggingFace tokenizer. It aggregates the predicted tags across all frames, sorts them by frequency, and stores the resulting tag array in the sample metadata under a configurable field name. If tags are already present in the sample, the operator skips processing. If the sample contains no video, an empty tag array is stored.
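The aggregation, skip, and empty-video behavior described above can be sketched in plain Python. This is an illustration only, not the operator's source: `META_KEY`, `TAG_FIELD`, `aggregate_frame_tags`, and `tag_sample` are hypothetical names, plain lists stand in for the numpy arrays the operator stores, and the most-frequent-first ordering is an assumption about what "sorted by frequency" means.

```python
from collections import Counter

# Hypothetical stand-ins for Fields.meta and MetaKeys.video_frame_tags.
META_KEY = "meta"
TAG_FIELD = "video_frame_tags"


def aggregate_frame_tags(per_frame_tags):
    """Merge per-frame tag lists into one list, most frequent tag first."""
    counts = Counter(tag for tags in per_frame_tags for tag in tags)
    # most_common() orders by descending count; ties keep first-seen order.
    return [tag for tag, _ in counts.most_common()]


def tag_sample(sample, per_video_frame_tags):
    """Store one tag list per video unless tags are already present."""
    meta = sample.setdefault(META_KEY, {})
    if TAG_FIELD in meta:
        # Tags were computed earlier, so leave the sample untouched.
        return sample
    if not per_video_frame_tags:
        # No video in the sample: store an empty tag container.
        meta[TAG_FIELD] = []
        return sample
    meta[TAG_FIELD] = [aggregate_frame_tags(v) for v in per_video_frame_tags]
    return sample
```

Here `per_video_frame_tags` is assumed to hold, for each video, one list of predicted tags per sampled frame; in the real operator those predictions come from the RAM model.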
Usage
Use this operator to automatically generate semantic content tags for videos by analyzing their visual frames. The resulting tags can serve as input to downstream captioning or filtering pipelines.
Code Reference
Source Location
- Repository: Datajuicer_Data_juicer
- File: data_juicer/ops/mapper/video_tagging_from_frames_mapper.py
Signature
@OPERATORS.register_module("video_tagging_from_frames_mapper")
class VideoTaggingFromFramesMapper(Mapper):
    def __init__(self, frame_sampling_method: str = "all_keyframes",
                 frame_num: PositiveInt = 3,
                 tag_field_name: str = MetaKeys.video_frame_tags,
                 *args, **kwargs):
Import
from data_juicer.ops.mapper.video_tagging_from_frames_mapper import VideoTaggingFromFramesMapper
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| frame_sampling_method | str | No | Method for extracting frames: "all_keyframes" or "uniform". Default: "all_keyframes" |
| frame_num | PositiveInt | No | Number of frames to extract uniformly (only used when method is "uniform"). Default: 3 |
| tag_field_name | str | No | Field name to store the generated tags. Default: "video_frame_tags" |
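To make the role of `frame_num` under "uniform" sampling concrete, the sketch below shows one common way to pick evenly spaced frame indices from a clip. It is an assumption for illustration: `uniform_frame_indices` is a hypothetical helper, and the operator's actual sampling routine may choose different positions.

```python
from typing import List


def uniform_frame_indices(total_frames: int, frame_num: int) -> List[int]:
    """Return up to frame_num evenly spaced frame indices in [0, total_frames)."""
    if total_frames <= 0 or frame_num <= 0:
        raise ValueError("total_frames and frame_num must be positive")
    if frame_num >= total_frames:
        # Fewer frames available than requested: take them all.
        return list(range(total_frames))
    if frame_num == 1:
        # A single frame is taken from the middle of the clip.
        return [total_frames // 2]
    # Integer arithmetic spreads indices from the first to the last frame.
    last = total_frames - 1
    return [i * last // (frame_num - 1) for i in range(frame_num)]
```

With `frame_sampling_method` set to "all_keyframes", `frame_num` is not used, as noted in the table above.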
Outputs
| Name | Type | Description |
|---|---|---|
| sample[Fields.meta][tag_field_name] | list of numpy arrays | Per-video list of tag arrays sorted by frequency across frames |
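A downstream step might consume this output as follows. This is a minimal sketch under stated assumptions: `sample_has_tag` is a hypothetical helper, and the string keys "meta" and "video_frame_tags" are placeholders for `Fields.meta` and the configured `tag_field_name`.

```python
def sample_has_tag(sample, tag, meta_key="meta", tag_field="video_frame_tags"):
    """Return True if any video in the sample carries the given tag."""
    # Missing metadata or a missing tag field is treated as "no tags".
    per_video_tags = sample.get(meta_key, {}).get(tag_field, [])
    return any(tag in list(video_tags) for video_tags in per_video_tags)


# Example: keep only samples where at least one video was tagged "dog".
samples = [
    {"meta": {"video_frame_tags": [["dog", "grass"], ["car"]]}},
    {"meta": {"video_frame_tags": [["street", "car"]]}},
]
dog_samples = [s for s in samples if sample_has_tag(s, "dog")]
```

Converting each per-video entry with `list(...)` keeps the membership check working whether the entry is a Python list or a numpy array.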
Usage Examples
process:
  - video_tagging_from_frames_mapper:
      frame_sampling_method: "uniform"
      frame_num: 5