Implementation:Datajuicer Data juicer AudioAddGaussianNoiseMapper

From Leeroopedia
Knowledge Sources
Domains: Data_Processing, Mapping
Last Updated: 2026-02-14 16:00 GMT

Overview

Concrete tool, provided by Data-Juicer, for adding Gaussian noise to audio samples.

Description

AudioAddGaussianNoiseMapper is a mapper operator that adds Gaussian noise to audio data, with a configurable amplitude range and application probability. It extends the Mapper base class and uses the AddGaussianNoise transform from the audiomentations library to apply noise to loaded audio files. For each audio file in a sample, it applies the transform with probability `p` and writes the result with the soundfile library to the specified output directory, or to the input file's directory if none is given. A sample that contains no audio is returned unchanged.
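
The per-file behavior described above can be sketched in plain Python. This is only an illustration: the real operator delegates to audiomentations' AddGaussianNoise, and the helper name `add_gaussian_noise`, the list-of-floats signal representation, and the exact sampling details (one amplitude drawn uniformly per clip, then zero-mean unit-variance noise scaled by it) are assumptions made here rather than code taken from Data-Juicer.

```python
import random
from typing import List, Optional


def add_gaussian_noise(
    samples: List[float],
    min_amplitude: float = 0.001,
    max_amplitude: float = 0.015,
    p: float = 0.5,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Hypothetical helper: return a noisy copy with probability p, else a plain copy."""
    rng = rng or random.Random()
    # Decide whether the transform is applied to this clip at all.
    if rng.random() >= p:
        return list(samples)
    # Assumption: one amplitude is drawn per clip and scales unit-variance noise.
    amplitude = rng.uniform(min_amplitude, max_amplitude)
    return [x + amplitude * rng.gauss(0.0, 1.0) for x in samples]


clean = [0.0] * 1000  # one clip of digital silence
always = add_gaussian_noise(clean, p=1.0, rng=random.Random(0))
never = add_gaussian_noise(clean, p=0.0, rng=random.Random(0))
```

With `p=1.0` the noise is always applied and with `p=0.0` the clip comes back unchanged, which mirrors the role of `p` in the operator. The defaults match the operator's documented defaults.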

Usage

Import this operator when you need to augment audio data with Gaussian noise, for example to improve a model's robustness to noisy inputs.

Code Reference

Source Location

Signature

@OPERATORS.register_module("audio_add_gaussian_noise_mapper")
class AudioAddGaussianNoiseMapper(Mapper):
    def __init__(self,
                 min_amplitude: float = 0.001,
                 max_amplitude: float = 0.015,
                 p: float = 0.5,
                 save_dir: str = None,
                 *args, **kwargs):

Import

from data_juicer.ops.mapper.audio_add_gaussian_noise_mapper import AudioAddGaussianNoiseMapper

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| min_amplitude | float | No | Minimum noise amplification factor (linear amplitude). Default: 0.001 |
| max_amplitude | float | No | Maximum noise amplification factor (linear amplitude). Default: 0.015 |
| p | float | No | Probability of applying the transform, range [0.0, 1.0]. Default: 0.5 |
| save_dir | str | No | Directory to store generated audio files. If not specified, outputs are saved alongside inputs. Can also be set via the DJ_PRODUCED_DATA_DIR environment variable. |
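
The `save_dir` fallback described in the table can be illustrated with a small sketch. The helper `resolve_output_dir` is hypothetical and is not part of Data-Juicer. The precedence it uses (explicit `save_dir`, then `DJ_PRODUCED_DATA_DIR`, then the input file's directory) is an assumption, since the table does not state which setting wins when both are present.

```python
import os
from typing import Mapping, Optional


def resolve_output_dir(
    input_path: str,
    save_dir: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper: choose the directory for a generated audio file."""
    env = os.environ if env is None else env
    # Assumed precedence 1: an explicit save_dir argument.
    if save_dir:
        return save_dir
    # Assumed precedence 2: the DJ_PRODUCED_DATA_DIR environment variable.
    env_dir = env.get("DJ_PRODUCED_DATA_DIR")
    if env_dir:
        return env_dir
    # Fallback: save alongside the input file.
    return os.path.dirname(os.path.abspath(input_path))


explicit = resolve_output_dir("/data/a.wav", save_dir="/out", env={})
from_env = resolve_output_dir("/data/a.wav", env={"DJ_PRODUCED_DATA_DIR": "/produced"})
alongside = resolve_output_dir("/data/a.wav", env={})
```

Passing `env` explicitly keeps the sketch easy to test without touching the real process environment.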

Outputs

| Name | Type | Description |
|---|---|---|
| samples | Dict | Transformed samples whose audio file paths are updated to point to the noise-augmented files |

Usage Examples

YAML Configuration

process:
  - audio_add_gaussian_noise_mapper:
      min_amplitude: 0.001
      max_amplitude: 0.015
      p: 0.5
