Implementation:Speechbrain Speechbrain DNS Noisy Speech Synthesizer

Knowledge Sources	SpeechBrain
Domains	Speech_Enhancement, Data_Synthesis
Last Updated	2026-02-09 00:00 GMT

Overview

A concrete tool, provided by the SpeechBrain library, for synthesizing noisy speech training data from clean speech and noise sources.

Description

This script synthesizes clean-noisy audio pairs for training speech enhancement models, following the Microsoft DNS Challenge protocol. It reads clean speech and noise source files from webdataset shards and builds each signal to a specified length by concatenating clips separated by silence padding. It can optionally convolve the clean speech with a room impulse response (RIR) to add reverberation. The clean and noise signals are then mixed at a randomly sampled SNR using segmental SNR mixing. The output is written as webdataset shards containing the clean, noise, and noisy speech signals along with metadata. In single-process mode each clean speech source file is used once, and activity detection checks that the constructed signal contains adequate speech energy. The script generates both training and validation sets, with configurable shard sizes.
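The mixing step described above can be illustrated with a minimal sketch. It is not the script's own mixer: it scales the noise using the RMS of the whole signal, whereas segmental SNR mixing measures level over active segments. The function name `mix_at_snr`, the `eps` guard, and the SNR range used in the example are all assumptions made for illustration.

```python
import numpy as np


def rms(x):
    """Root-mean-square level of a 1-D signal."""
    return np.sqrt(np.mean(np.square(x)))


def mix_at_snr(clean, noise, snr_db, eps=1e-12):
    """Scale `noise` so that clean/noise RMS ratio equals `snr_db`, then mix.

    Hypothetical helper: uses full-signal RMS, not segmental (active) RMS.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if clean.shape != noise.shape:
        raise ValueError("clean and noise must have the same shape")
    # Choose gain so that 20*log10(rms(clean) / rms(gain*noise)) == snr_db
    gain = rms(clean) / (10 ** (snr_db / 20) * (rms(noise) + eps))
    scaled_noise = gain * noise
    return clean, scaled_noise, clean + scaled_noise


rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
noise = rng.standard_normal(16000)
snr_db = rng.uniform(-5, 20)  # assumed range; the real range comes from the config
_, scaled_noise, noisy = mix_at_snr(clean, noise, snr_db)
achieved_db = 20 * np.log10(rms(clean) / rms(scaled_noise))
```

Returning the clean signal, the scaled noise, and their sum mirrors the three signals the description says are stored per example.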

The script was originally sourced from the Microsoft DNS-Challenge repository and was later modified for webdataset integration by Sangeet Sagar (2023).

Usage

Run as a standalone script with a HyperPyYAML configuration file specifying paths to clean/noise webdataset shards, output directories, SNR ranges, audio length, and other synthesis parameters.

Code Reference

Source Location

Repository: SpeechBrain
File: recipes/DNS/noisyspeech_synthesizer/noisyspeech_synthesizer_singleprocess.py

Signature

def add_pyreverb(clean_speech, rir):
    """Add reverb to clean signal using FFT convolution."""
    ...

def build_audio(is_clean, params, index, audio_samples_length=-1):
    """Construct an audio signal from source files."""
    ...

def gen_audio(is_clean, params, index, audio_samples_length=-1):
    """Calls build_audio() to get an audio signal and verify activity threshold."""
    ...

def main_gen(params):
    """Generate audio signals, verify requirements, and write files to storage."""
    ...
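The docstring of add_pyreverb mentions FFT convolution. The sketch below shows one way such a function could work, using only NumPy. The name `fft_reverb` is hypothetical, and truncating the result to the length of the input is an assumption rather than confirmed behavior of the script.

```python
import numpy as np


def fft_reverb(clean_speech, rir):
    """Convolve speech with a room impulse response in the frequency domain.

    Hypothetical stand-in for add_pyreverb; output is cut to the input length.
    """
    clean_speech = np.asarray(clean_speech, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    # Zero-pad both signals to the full linear-convolution length so the
    # FFT product does not wrap around (circular convolution).
    n_full = clean_speech.shape[0] + rir.shape[0] - 1
    spectrum = np.fft.rfft(clean_speech, n_full) * np.fft.rfft(rir, n_full)
    reverb = np.fft.irfft(spectrum, n_full)
    # Assumption: keep only as many samples as the dry signal had.
    return reverb[: clean_speech.shape[0]]


speech = np.array([1.0, 0.0, 0.0, 0.0, 2.0])
rir = np.array([1.0, 0.5])  # direct path plus one attenuated echo
wet = fft_reverb(speech, rir)
```

For long signals the FFT route is much cheaper than direct convolution, which is presumably why the original docstring calls it out.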

Import

The script is run from the command line rather than imported as a module:

python noisyspeech_synthesizer_singleprocess.py hparams/synthesizer.yaml

I/O Contract

Inputs

Name	Type	Required	Description
params	dict	Yes	HyperPyYAML config dict with fs, audio_length, silence_length, SNR ranges, etc.
is_clean	bool	Yes	Whether to build a clean speech signal or noise signal
index	int	Yes	Current index into the data iterator
clean_speech	np.ndarray	Yes (add_pyreverb)	Clean speech signal array
rir	np.ndarray	Yes (add_pyreverb)	Room impulse response array
clean_data	webdataset	Yes	Webdataset shard iterator for clean speech
noise_data	webdataset	Yes	Webdataset shard iterator for noise
train_shard_destination	str	Yes	Output path for training shards
valid_shard_destination	str	Yes	Output path for validation shards

Outputs

Name	Type	Description
webdataset shards	.tar files	Output shards containing the clean, noise, and noisy audio signals along with metadata
output_audio	np.ndarray	Constructed audio signal from concatenated clips
files_used	list	List of source file keys used in audio construction
clipped_files	list	List of source files that were skipped due to clipping
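The three array and list outputs above suggest the general shape of the audio construction step. The following is a minimal sketch of that idea, not the script's build_audio: the function `concat_clips`, its `clips` argument (an iterable of key and array pairs standing in for a webdataset iterator), and the `clip_threshold` default are all assumptions for illustration. The `fs`, `audio_length`, and `silence_length` names follow the params listed in the inputs table.

```python
import numpy as np


def concat_clips(clips, fs, audio_length, silence_length, clip_threshold=0.99):
    """Join clips with silence gaps until `audio_length` seconds are filled.

    Hypothetical helper. Clips whose peak exceeds `clip_threshold` are
    skipped and reported separately.
    """
    target = int(round(fs * audio_length))
    silence = np.zeros(int(round(fs * silence_length)))
    output_audio = np.zeros(0)
    files_used, clipped_files = [], []
    for key, audio in clips:
        if output_audio.shape[0] >= target:
            break
        audio = np.asarray(audio, dtype=np.float64)
        if audio.size == 0:
            continue
        if np.max(np.abs(audio)) > clip_threshold:
            clipped_files.append(key)  # treated as clipped, so not used
            continue
        if files_used:
            # Insert a silence gap between consecutive clips, never past target
            output_audio = np.concatenate([output_audio, silence])[:target]
        remaining = target - output_audio.shape[0]
        if remaining == 0:
            break
        output_audio = np.concatenate([output_audio, audio[:remaining]])
        files_used.append(key)
    return output_audio, files_used, clipped_files


clips = [
    ("a", 0.5 * np.ones(8)),
    ("b", 1.0 * np.ones(8)),   # peak above threshold, so it is skipped
    ("c", 0.25 * np.ones(8)),
    ("d", 0.1 * np.ones(8)),
]
output_audio, files_used, clipped_files = concat_clips(
    clips, fs=10, audio_length=2.0, silence_length=0.2
)
```

If the sources run out before the target length is reached, this sketch returns a shorter signal and leaves it to the caller to decide whether to pad or discard it.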

Usage Examples

# Run noisy speech synthesis with configuration
python noisyspeech_synthesizer_singleprocess.py hparams/synthesizer.yaml

# With overrides for output paths
python noisyspeech_synthesizer_singleprocess.py hparams/synthesizer.yaml \
    --train_shard_destination /data/dns/train_shards \
    --valid_shard_destination /data/dns/valid_shards

Related Pages

Principle:Speechbrain_Speechbrain_Noisy_Speech_Data_Preparation
