Implementation:Datajuicer Data juicer AudioFFmpegWrappedMapper

From Leeroopedia
Knowledge Sources
Domains Data_Processing, Mapping
Last Updated 2026-02-14 16:00 GMT

Overview

A concrete Data-Juicer tool for applying FFmpeg audio filters to the audio files in a dataset.

Description

AudioFFmpegWrappedMapper is a mapper operator that wraps FFmpeg audio filters for flexible audio processing. It uses the ffmpeg-python library to apply a specified FFmpeg filter with custom keyword arguments and global arguments to each audio file in a sample. Processed audio files are saved to a configurable output directory, and the sample's source file paths are updated accordingly. If no filter name is provided, the audio files remain unmodified. It extends the Mapper base class.
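The behaviour described above can be pictured as building one FFmpeg invocation per audio file. The following is a minimal standard-library sketch of that idea, not the operator's actual code: the operator uses ffmpeg-python, whereas this sketch assembles an equivalent command line by hand. The helper name `build_ffmpeg_command` and the argument ordering are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def build_ffmpeg_command(
    input_path: str,
    output_path: str,
    filter_name: Optional[str] = None,
    filter_kwargs: Optional[Dict] = None,
    global_args: Optional[List[str]] = None,
    overwrite_output: bool = True,
) -> Optional[List[str]]:
    """Hypothetical helper: build an ffmpeg argv list for one audio file.

    Returns None when no filter is given, mirroring the documented
    behaviour that the audio is left unmodified in that case.
    """
    if not filter_name:
        return None

    cmd = ["ffmpeg"]
    cmd += list(global_args or [])   # extra command-line arguments
    if overwrite_output:
        cmd.append("-y")             # overwrite existing output files
    cmd += ["-i", input_path]

    # FFmpeg filter syntax: name=key1=value1:key2=value2
    # (values containing ':' or ',' would need escaping; not handled here)
    options = ":".join(f"{k}={v}" for k, v in (filter_kwargs or {}).items())
    spec = f"{filter_name}={options}" if options else filter_name
    cmd += ["-af", spec]             # -af applies an audio filter

    cmd.append(output_path)
    return cmd
```

The returned list could be handed to `subprocess.run(cmd, capture_output=True)` on a machine where the `ffmpeg` binary is installed.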

Usage

Import this operator when you need to apply arbitrary FFmpeg audio filters to audio files without writing custom operator code.

Code Reference

Source Location

Signature

@OPERATORS.register_module("audio_ffmpeg_wrapped_mapper")
class AudioFFmpegWrappedMapper(Mapper):
    def __init__(self,
                 filter_name: Optional[str] = None,
                 filter_kwargs: Optional[Dict] = None,
                 global_args: Optional[List[str]] = None,
                 capture_stderr: bool = True,
                 overwrite_output: bool = True,
                 save_dir: str = None,
                 *args, **kwargs):

Import

from data_juicer.ops.mapper.audio_ffmpeg_wrapped_mapper import AudioFFmpegWrappedMapper

I/O Contract

Inputs

Name | Type | Required | Description
filter_name | Optional[str] | No | FFmpeg audio filter name to apply. Default: None (no-op).
filter_kwargs | Optional[Dict] | No | Keyword arguments passed to the FFmpeg filter. Default: None.
global_args | Optional[List[str]] | No | List of global arguments passed to the FFmpeg command line. Default: None.
capture_stderr | bool | No | Whether to capture stderr output. Default: True.
overwrite_output | bool | No | Whether to overwrite existing output files. Default: True.
save_dir | str | No | Directory to store generated audio files. If not specified, outputs are saved alongside inputs. Can also be set via the DJ_PRODUCED_DATA_DIR environment variable.
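The `save_dir` fallback described in the table can be sketched as follows. This is an illustration only: the helper name `resolve_output_path`, the `_ffmpeg` filename suffix, and the precedence (explicit argument first, then the environment variable, then the input's own directory) are assumptions and are not confirmed by the source.

```python
import os
from typing import Optional


def resolve_output_path(
    input_path: str,
    save_dir: Optional[str] = None,
    suffix: str = "_ffmpeg",  # hypothetical suffix, chosen for illustration
) -> str:
    """Hypothetical helper: decide where a processed audio file is written."""
    # Assumed precedence: explicit save_dir, then DJ_PRODUCED_DATA_DIR,
    # then the directory that holds the input file.
    target_dir = (
        save_dir
        or os.environ.get("DJ_PRODUCED_DATA_DIR")
        or os.path.dirname(input_path)
    )
    stem, ext = os.path.splitext(os.path.basename(input_path))
    return os.path.join(target_dir, f"{stem}{suffix}{ext}")
```

Under these assumptions, the returned path is what would replace the original entry in the sample's list of audio paths once the file has been processed.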

Outputs

Name | Type | Description
samples | Dict | Transformed samples with processed audio file paths updated

Usage Examples

YAML Configuration

process:
  - audio_ffmpeg_wrapped_mapper:
      filter_name: afade
      filter_kwargs:
        type: in
        duration: 3
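For readers who prefer to construct the operator from Python, the YAML configuration above could map onto constructor arguments roughly as sketched below. The keyword names come from the signature on this page; the `make_mapper` helper is hypothetical, and the import is deferred so that the snippet can be read and loaded without Data-Juicer installed.

```python
# Same settings as the YAML example, expressed as constructor keyword arguments.
op_config = {
    "filter_name": "afade",
    "filter_kwargs": {"type": "in", "duration": 3},  # 3-second fade-in
}


def make_mapper(config: dict):
    """Hypothetical helper: instantiate the operator from a config dict."""
    # Imported lazily; requires Data-Juicer and its audio dependencies.
    from data_juicer.ops.mapper.audio_ffmpeg_wrapped_mapper import (
        AudioFFmpegWrappedMapper,
    )

    return AudioFFmpegWrappedMapper(**config)
```

Calling `make_mapper(op_config)` in an environment with Data-Juicer installed should yield an operator instance configured like the YAML entry; how it is then applied to a dataset depends on the surrounding pipeline and is not shown here.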
