Implementation:Speechbrain Speechbrain DNS Noisy Speech Synthesizer

From Leeroopedia


Knowledge Sources
Domains Speech_Enhancement, Data_Synthesis
Last Updated 2026-02-09 00:00 GMT

Overview

A concrete tool, provided by the SpeechBrain library, for synthesizing noisy speech training data from clean speech and noise sources.

Description

This script synthesizes clean-noisy audio pairs for training speech enhancement models, following the Microsoft DNS Challenge protocol. It reads clean speech and noise source files from webdataset shards and builds signals of a specified length by concatenating clips, with silence padding between them. It can optionally convolve the clean speech with a room impulse response (RIR) to add reverberation. Clean and noise signals are then mixed at randomly sampled SNR levels using segmental SNR mixing. The output is written as webdataset shards containing the clean, noise, and noisy signals along with metadata. In single-process mode each clean speech source file is used once, and an activity check verifies that each constructed signal contains enough speech energy. The script can generate both training and validation sets, with configurable shard sizes.
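The SNR mixing step described above can be illustrated with a minimal sketch. It is a simplification, not the script's own mixer: it measures RMS over the whole signal instead of over active segments only, and the function name `mix_at_snr` and the SNR range used in the example are hypothetical.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db, eps=1e-12):
    """Scale `noise` so the clean-to-noise RMS ratio matches `snr_db`, then add.

    Hypothetical helper: uses whole-signal RMS, whereas a segmental mixer
    would measure level on active segments only.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if clean.shape != noise.shape:
        raise ValueError("clean and noise must have the same shape")
    rms_clean = np.sqrt(np.mean(clean ** 2))
    rms_noise = np.sqrt(np.mean(noise ** 2))
    # Gain that brings the noise RMS to rms_clean / 10^(snr_db / 20).
    gain = rms_clean / (10.0 ** (snr_db / 20.0)) / (rms_noise + eps)
    scaled_noise = gain * noise
    return clean + scaled_noise, scaled_noise


rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
noise = rng.standard_normal(16000)
snr_db = rng.uniform(-5.0, 20.0)  # assumed SNR range, for illustration only
noisy, scaled_noise = mix_at_snr(clean, noise, snr_db)

# Verify the SNR that was actually achieved.
achieved_db = 10.0 * np.log10(np.mean(clean ** 2) / np.mean(scaled_noise ** 2))
```

Returning the scaled noise alongside the mixture mirrors the fact that the output shards store the clean, noise, and noisy signals together.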

The script was originally sourced from the Microsoft DNS-Challenge repository and was modified for webdataset integration by Sangeet Sagar (2023).

Usage

Run as a standalone script with a HyperPyYAML configuration file specifying paths to clean/noise webdataset shards, output directories, SNR ranges, audio length, and other synthesis parameters.

Code Reference

Source Location

Signature

def add_pyreverb(clean_speech, rir):
    """Add reverb to clean signal using FFT convolution."""
    ...

def build_audio(is_clean, params, index, audio_samples_length=-1):
    """Construct an audio signal from source files."""
    ...

def gen_audio(is_clean, params, index, audio_samples_length=-1):
    """Calls build_audio() to get an audio signal and verify activity threshold."""
    ...

def main_gen(params):
    """Generate audio signals, verify requirements, and write files to storage."""
    ...
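As an illustration of what "reverb using FFT convolution" means in the `add_pyreverb` docstring, here is a minimal NumPy sketch. The function name `fft_reverb` is hypothetical, and truncating the result to the length of the input is an assumption about the desired output length, not a confirmed detail of the script.

```python
import numpy as np


def fft_reverb(clean_speech, rir):
    """Convolve a 1-D signal with a room impulse response via the FFT.

    Hypothetical helper; the output is truncated to the input length
    (an assumption made for this sketch).
    """
    clean_speech = np.asarray(clean_speech, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    # Zero-pad both signals to the full linear-convolution length so that
    # the circular convolution computed by the FFT equals the linear one.
    full_len = clean_speech.shape[0] + rir.shape[0] - 1
    spectrum = np.fft.rfft(clean_speech, n=full_len) * np.fft.rfft(rir, n=full_len)
    reverberant = np.fft.irfft(spectrum, n=full_len)
    return reverberant[: clean_speech.shape[0]]


speech = np.array([1.0, 0.0, 0.0, 0.0])
rir = np.array([1.0, 0.5])  # direct path plus one echo at half amplitude
wet = fft_reverb(speech, rir)

# Cross-check against direct time-domain convolution on random data.
rng = np.random.default_rng(1)
x = rng.standard_normal(256)
h = rng.standard_normal(32)
matches_direct = np.allclose(fft_reverb(x, h), np.convolve(x, h)[: x.shape[0]])
```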

Import

The script is intended to be run from the command line rather than imported:

python noisyspeech_synthesizer_singleprocess.py hparams/synthesizer.yaml

I/O Contract

Inputs

Name | Type | Required | Description
params | dict | Yes | HyperPyYAML config dict with fs, audio_length, silence_length, SNR ranges, etc.
is_clean | bool | Yes | Whether to build a clean speech signal or a noise signal
index | int | Yes | Current index into the data iterator
clean_speech | np.ndarray | Yes (add_pyreverb) | Clean speech signal array
rir | np.ndarray | Yes (add_pyreverb) | Room impulse response array
clean_data | webdataset | Yes | Webdataset shard iterator for clean speech
noise_data | webdataset | Yes | Webdataset shard iterator for noise
train_shard_destination | str | Yes | Output path for training shards
valid_shard_destination | str | Yes | Output path for validation shards

Outputs

Name | Type | Description
webdataset shards | .tar files | Output shards containing clean, noise, and noisy audio pairs
output_audio | np.ndarray | Constructed audio signal from concatenated clips
files_used | list | List of source file keys used in audio construction
clipped_files | list | List of source files that were skipped due to clipping
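The relationship between `output_audio`, `files_used`, and `clipped_files` can be sketched as follows. This is a hypothetical illustration of concatenating clips with silence padding and skipping clipped sources. The function name `concat_with_silence`, its parameters, and the clipping threshold are assumptions, and it does not reproduce the script's `build_audio()`.

```python
import numpy as np


def concat_with_silence(clips, target_len, silence_len, clip_threshold=0.99):
    """Build a fixed-length signal from (key, samples) pairs.

    Hypothetical helper: clips whose peak reaches `clip_threshold` are
    skipped and reported; the others are joined with silence between them.
    """
    output = np.zeros(0, dtype=np.float64)
    silence = np.zeros(silence_len, dtype=np.float64)
    files_used, clipped_files = [], []
    for key, clip in clips:
        if output.shape[0] >= target_len:
            break
        clip = np.asarray(clip, dtype=np.float64)
        if clip.size == 0:
            continue
        if np.max(np.abs(clip)) >= clip_threshold:
            clipped_files.append(key)  # skip sources that appear clipped
            continue
        # Append as much of the clip as still fits.
        remaining = target_len - output.shape[0]
        output = np.concatenate([output, clip[:remaining]])
        files_used.append(key)
        # Append silence, again limited to the space that is left.
        remaining = target_len - output.shape[0]
        output = np.concatenate([output, silence[:remaining]])
    # Zero-pad if the sources ran out before reaching the target length.
    if output.shape[0] < target_len:
        pad = np.zeros(target_len - output.shape[0], dtype=np.float64)
        output = np.concatenate([output, pad])
    return output, files_used, clipped_files


clips = [
    ("a", 0.1 * np.ones(3)),
    ("b", np.ones(3)),        # peak of 1.0 is treated as clipped
    ("c", 0.2 * np.ones(5)),
]
output_audio, files_used, clipped_files = concat_with_silence(
    clips, target_len=8, silence_len=2
)
```

In this example the result is eight samples long: clip "a", two samples of silence, then the first three samples of clip "c", with "b" reported as clipped.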

Usage Examples

# Run noisy speech synthesis with configuration
python noisyspeech_synthesizer_singleprocess.py hparams/synthesizer.yaml

# With overrides for output paths
python noisyspeech_synthesizer_singleprocess.py hparams/synthesizer.yaml \
    --train_shard_destination /data/dns/train_shards \
    --valid_shard_destination /data/dns/valid_shards
