Implementation:Datajuicer Data juicer VideoTaggingFromAudioMapper

Knowledge Sources	Datajuicer_Data_juicer
Domains	Data_Processing, Mapping
Last Updated	2026-02-14 16:00 GMT

Overview

Concrete tool, provided by Data-Juicer, for generating video tags from audio streams.

Description

VideoTaggingFromAudioMapper is a mapper operator that generates semantic tags for videos from their audio streams using an Audio Spectrogram Transformer (AST) model. For each video, it extracts the audio, resamples it to the model's required sampling rate (16 kHz), and feeds the waveform through a HuggingFace AST model (default: MIT/ast-finetuned-audioset-10-10-0.4593). It then selects the tag with the highest logit value and stores it in the sample metadata under a configurable field name. Videos without valid audio receive the tag "EMPTY".
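The tag-selection step described above can be illustrated with a minimal, dependency-free sketch. It is not the operator's actual code: the helper names select_tag and tag_videos, the use of plain Python lists for logits, and the three-entry label map in the usage note are all hypothetical, chosen only to show the "highest logit wins, otherwise EMPTY" rule.

```python
from typing import Dict, List, Optional, Sequence

# Label used for videos without valid audio, as stated in the description.
NO_AUDIO_LABEL = "EMPTY"


def select_tag(logits: Optional[Sequence[float]], id2label: Dict[int, str]) -> str:
    """Return the label whose logit is highest, or "EMPTY" if there are no logits.

    `logits` stands in for the model output of one video's audio track;
    `id2label` stands in for the model's class-index-to-name mapping.
    Both are illustrative assumptions, not the real operator internals.
    """
    if not logits:
        # No valid audio was available, so no logits were produced.
        return NO_AUDIO_LABEL
    # Argmax over class indices.
    best_id = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best_id]


def tag_videos(
    per_video_logits: Sequence[Optional[Sequence[float]]],
    id2label: Dict[int, str],
) -> List[str]:
    """Apply select_tag to every video of a sample, keeping the input order."""
    return [select_tag(logits, id2label) for logits in per_video_logits]
```

For example, with a hypothetical label map {0: "Speech", 1: "Music", 2: "Dog"}, calling tag_videos([[0.1, 2.5, -1.0], None], labels) yields one tag per video, with the second video (no audio) tagged "EMPTY".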

Usage

Use this operator when you need audio-based content classification for video datasets, for example to complement visual tagging approaches or to support multimodal data annotation workflows.

Code Reference

Source Location

Repository: Datajuicer_Data_juicer
File: data_juicer/ops/mapper/video_tagging_from_audio_mapper.py

Signature

@OPERATORS.register_module("video_tagging_from_audio_mapper")
class VideoTaggingFromAudioMapper(Mapper):
    def __init__(self, hf_ast: str = "MIT/ast-finetuned-audioset-10-10-0.4593", trust_remote_code: bool = False, tag_field_name: str = MetaKeys.video_audio_tags, *args, **kwargs):

Import

from data_juicer.ops.mapper.video_tagging_from_audio_mapper import VideoTaggingFromAudioMapper

I/O Contract

Inputs

Name	Type	Required	Description
hf_ast	str	No	Path to the HuggingFace AST model (default: "MIT/ast-finetuned-audioset-10-10-0.4593")
trust_remote_code	bool	No	Whether to trust remote code of HF models (default: False)
tag_field_name	str	No	Field name to store the tags (default: "video_audio_tags")

Outputs

Name	Type	Description
samples	Dict	Transformed samples with audio-derived tags in metadata
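As a rough illustration of consuming the output, the sketch below reads the tags back from a processed sample. The helper read_audio_tags is hypothetical, and the metadata container key "__dj__meta__" is an assumption about how Data-Juicer names its metadata field; it is exposed as a parameter so it can be adjusted to match the installed version. The default tag field name follows the Inputs table above.

```python
from typing import Any, Dict, List


def read_audio_tags(
    sample: Dict[str, Any],
    meta_key: str = "__dj__meta__",  # assumed name of the metadata container
    tag_field_name: str = "video_audio_tags",  # default from the Inputs table
) -> List[str]:
    """Return the audio-derived tags stored in a sample's metadata.

    Returns an empty list when the metadata container or the tag field
    is missing, so callers do not need to special-case untagged samples.
    """
    meta = sample.get(meta_key) or {}
    return list(meta.get(tag_field_name, []))
```

If a custom tag_field_name was passed to the operator, the same value should be passed here.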

Usage Examples

process:
  - video_tagging_from_audio_mapper:
      hf_ast: "MIT/ast-finetuned-audioset-10-10-0.4593"

Related Pages

Environment:Datajuicer_Data_juicer_Python_Runtime_Environment
