Implementation:Datajuicer Data juicer VideoCaptioningFromAudioMapper

From Leeroopedia
Knowledge Sources
Domains Data_Processing, Mapping
Last Updated 2026-02-14 16:00 GMT

Overview

A concrete Data-Juicer tool for generating video captions from audio streams.

Description

VideoCaptioningFromAudioMapper is a mapper operator that generates text captions for videos from their audio streams using the Qwen-Audio model. For each video, it extracts the audio streams, runs them through the Qwen-Audio HuggingFace model with a transcription/captioning prompt, strips special tokens from the output with a regex, and inserts the generated captions into the sample text. It can optionally keep the original sample alongside the captioned version.
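The token-stripping step described above can be sketched as follows. This is a minimal illustration, not the operator's actual code: the pattern `<\|.*?\|>` and the sample model output are assumptions based on the `<|...|>` token style commonly used by Qwen-family tokenizers, and `clean_caption` is a hypothetical helper name.

```python
import re

# Assumption: special tokens have the form <|...|>; the operator's real
# pattern may differ.
SPECIAL_TOKEN_PATTERN = re.compile(r"<\|.*?\|>")


def clean_caption(raw_output: str) -> str:
    """Remove <|...|> style tokens and trim surrounding whitespace."""
    return SPECIAL_TOKEN_PATTERN.sub("", raw_output).strip()


# Hypothetical raw model output, for illustration only.
raw = "<|startoftranscript|><|en|>A man is talking about cooking.<|endoftext|>"
print(clean_caption(raw))
```

The non-greedy `.*?` keeps each match confined to a single token, so text between two tokens is preserved.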

Usage

Use this operator when video understanding depends on information carried in the audio channel. It is particularly valuable where visual content alone is insufficient, such as narrated content or dialogue-heavy scenes.

Code Reference

Source Location

Signature

@OPERATORS.register_module("video_captioning_from_audio_mapper")
class VideoCaptioningFromAudioMapper(Mapper):
    def __init__(self, keep_original_sample: bool = True, *args, **kwargs):

Import

from data_juicer.ops.mapper.video_captioning_from_audio_mapper import VideoCaptioningFromAudioMapper

I/O Contract

Inputs

Name | Type | Required | Description
keep_original_sample | bool | No | Whether to keep the original sample alongside the captioned version (default: True)

Outputs

Name | Type | Description
samples | Dict | Transformed samples with audio-derived captions inserted into text
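As a rough illustration of the contract above, the sketch below shows how `keep_original_sample` could change the number of output samples. It assumes a sample is a plain dict with a `text` field and, for simplicity, overwrites the text with the caption. `expand_sample` is a hypothetical helper, and the real operator's way of merging captions into existing text is not reproduced here.

```python
from typing import Dict, List


def expand_sample(
    sample: Dict[str, str],
    caption: str,
    keep_original_sample: bool = True,
) -> List[Dict[str, str]]:
    """Return the captioned sample, optionally preceded by the original."""
    captioned = dict(sample)     # shallow copy so the original is untouched
    captioned["text"] = caption  # simplification: caption replaces the text
    if keep_original_sample:
        return [sample, captioned]
    return [captioned]


original = {"text": "a cooking video", "video": "clip.mp4"}
both = expand_sample(original, "A man explains a recipe.")
only_captioned = expand_sample(
    original, "A man explains a recipe.", keep_original_sample=False
)
print(len(both), len(only_captioned))
```

With the default setting the output holds both versions, so downstream operators see twice as many samples for each captioned input.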

Usage Examples

process:
  - video_captioning_from_audio_mapper:
      keep_original_sample: true
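To show how a `process` entry like the one above could map onto the constructor, here is a minimal registry sketch. It is not Data-Juicer's loader: `REGISTRY`, `register_module`, `build_ops`, and `StubAudioCaptionMapper` are hypothetical stand-ins, and the `process` list is the YAML written out by hand as Python data.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical stand-in for an operator registry keyed by operator name.
REGISTRY: Dict[str, Callable[..., Any]] = {}


def register_module(name: str) -> Callable[[type], type]:
    """Decorator that records a class under the given name."""
    def decorator(cls: type) -> type:
        REGISTRY[name] = cls
        return cls
    return decorator


@register_module("video_captioning_from_audio_mapper")
class StubAudioCaptionMapper:
    """Stub with the same keyword argument as the documented signature."""

    def __init__(self, keep_original_sample: bool = True):
        self.keep_original_sample = keep_original_sample


# The YAML `process` list, expressed as already-parsed Python data.
process: List[Dict[str, Optional[Dict[str, Any]]]] = [
    {"video_captioning_from_audio_mapper": {"keep_original_sample": True}},
]


def build_ops(entries: List[Dict[str, Optional[Dict[str, Any]]]]) -> List[Any]:
    """Instantiate one operator per entry, passing its mapping as kwargs."""
    ops = []
    for entry in entries:
        for name, kwargs in entry.items():
            ops.append(REGISTRY[name](**(kwargs or {})))
    return ops


ops = build_ops(process)
print(type(ops[0]).__name__, ops[0].keep_original_sample)
```

Each key under `process` names a registered operator, and the nested mapping supplies its keyword arguments, which is why `keep_original_sample` appears both in the signature and in the config.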
