Implementation:Datajuicer Data juicer AudioAddGaussianNoiseMapper
| Knowledge Sources | |
|---|---|
| Domains | Data_Processing, Mapping |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Concrete Data-Juicer tool for adding Gaussian noise to audio samples.
Description
AudioAddGaussianNoiseMapper is a mapper operator, extending the Mapper base class, that adds Gaussian noise to audio data with a configurable amplitude range and application probability. It uses the AddGaussianNoise transform from the audiomentations library. For each audio file in a sample, it applies the transform with probability `p` and writes the result with the soundfile library to the specified output directory, or to the input file's directory if none is given. A sample that contains no audio is returned unchanged.
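The effect of the three numeric parameters can be illustrated with a minimal, standard-library-only sketch. This is not the operator's code, which delegates to audiomentations; the function name `add_gaussian_noise` and the `rng` argument are hypothetical, and the sketch assumes the noise scale is drawn uniformly from `[min_amplitude, max_amplitude]` once per call.

```python
import random


def add_gaussian_noise(samples, min_amplitude=0.001, max_amplitude=0.015,
                       p=0.5, rng=None):
    """Return a copy of `samples`, with Gaussian noise added with probability p.

    Hypothetical helper for illustration only; the real operator relies on
    audiomentations' AddGaussianNoise transform.
    """
    rng = rng or random.Random()
    # Skip the transform with probability 1 - p.
    if rng.random() >= p:
        return list(samples)
    # Assumption: one noise scale per call, drawn uniformly from the range.
    amplitude = rng.uniform(min_amplitude, max_amplitude)
    # Add zero-mean, unit-variance Gaussian noise scaled by the amplitude.
    return [s + amplitude * rng.gauss(0.0, 1.0) for s in samples]


if __name__ == "__main__":
    silence = [0.0] * 8
    print(add_gaussian_noise(silence, p=1.0, rng=random.Random(0)))
```

With `p=0.0` the input comes back unchanged, and with `p=1.0` noise is always applied, which can be handy when checking an augmentation pipeline deterministically.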
Usage
Import this operator when you need to augment audio data with Gaussian noise, for example to improve model robustness to noisy inputs.
Code Reference
Source Location
- Repository: Datajuicer_Data_juicer
- File: data_juicer/ops/mapper/audio_add_gaussian_noise_mapper.py
Signature
@OPERATORS.register_module("audio_add_gaussian_noise_mapper")
class AudioAddGaussianNoiseMapper(Mapper):
    def __init__(self,
                 min_amplitude: float = 0.001,
                 max_amplitude: float = 0.015,
                 p: float = 0.5,
                 save_dir: str = None,
                 *args, **kwargs):
Import
from data_juicer.ops.mapper.audio_add_gaussian_noise_mapper import AudioAddGaussianNoiseMapper
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| min_amplitude | float | No | Minimum noise amplification factor (linear amplitude). Default: 0.001 |
| max_amplitude | float | No | Maximum noise amplification factor (linear amplitude). Default: 0.015 |
| p | float | No | Probability of applying the transform, range [0.0, 1.0]. Default: 0.5 |
| save_dir | str | No | Directory in which to store generated audio files. If not specified, outputs are saved alongside the inputs. Can also be set via the `DJ_PRODUCED_DATA_DIR` environment variable. Default: None |
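The `save_dir` fallback described in the table can be sketched as follows. The helper name `resolve_output_dir` is hypothetical, and the precedence shown (explicit `save_dir`, then `DJ_PRODUCED_DATA_DIR`, then the input file's directory) is an assumption rather than something stated above; check the Data-Juicer source before relying on it.

```python
import os
from pathlib import Path


def resolve_output_dir(input_path, save_dir=None):
    """Pick the directory for a generated audio file (illustrative only)."""
    # Assumption: an explicit save_dir argument takes priority.
    if save_dir:
        return Path(save_dir)
    # Assumption: the environment variable is consulted next.
    env_dir = os.environ.get("DJ_PRODUCED_DATA_DIR")
    if env_dir:
        return Path(env_dir)
    # Otherwise the output is written alongside the input file.
    return Path(input_path).parent


if __name__ == "__main__":
    print(resolve_output_dir("/data/audio/clip.wav"))
    print(resolve_output_dir("/data/audio/clip.wav", save_dir="/out"))
```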
Outputs
| Name | Type | Description |
|---|---|---|
| samples | Dict | Transformed samples whose audio file paths point to the noise-augmented files |
Usage Examples
YAML Configuration
process:
  - audio_add_gaussian_noise_mapper:
      min_amplitude: 0.001
      max_amplitude: 0.015
      p: 0.5