Implementation:Speechbrain Speechbrain DNS Noisy Speech Synthesizer
| Knowledge Sources | |
|---|---|
| Domains | Speech_Enhancement, Data_Synthesis |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool, provided by the SpeechBrain library, for synthesizing noisy speech training data from clean speech and noise sources.
Description
This script synthesizes clean-noisy pairs of audio for training speech enhancement models, following the Microsoft DNS Challenge protocol. It reads clean speech and noise source files from webdataset shards, constructs audio signals of specified length by concatenating clips with silence padding, optionally applies room impulse response (RIR) convolution for reverberation, and mixes them at randomly sampled SNR levels using segmental SNR mixing. The output is stored as webdataset shards containing the clean, noise, and noisy speech signals along with metadata. Each clean speech source file is used once (single-process mode), with activity detection ensuring adequate speech energy. The script supports both training and validation set generation with configurable shard sizes.
The script was originally sourced from the Microsoft DNS-Challenge repository and later modified for webdataset integration by Sangeet Sagar (2023).
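The mixing step described above can be illustrated with a minimal sketch. It scales the noise so that the clean-to-noise RMS ratio matches a randomly drawn SNR. This is a simplified whole-signal version, not the segmental SNR mixer the script uses, and the function name `mix_at_snr`, the test signals, and the SNR range of -5 to 20 dB are assumptions made only for illustration.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db, eps=1e-12):
    """Scale `noise` to reach `snr_db` relative to `clean`, then add them.

    Hypothetical helper: a whole-signal RMS mixer, not the script's
    segmental SNR mixer.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if clean.shape != noise.shape:
        raise ValueError("clean and noise must have the same shape")

    rms_clean = np.sqrt(np.mean(clean ** 2))
    rms_noise = np.sqrt(np.mean(noise ** 2))

    # SNR_dB = 20 * log10(rms_clean / rms_noise), solved for the noise RMS.
    target_noise_rms = rms_clean / (10.0 ** (snr_db / 20.0))
    scaled_noise = noise * (target_noise_rms / (rms_noise + eps))
    return clean + scaled_noise, scaled_noise


rng = np.random.default_rng(0)
fs = 16000                                    # assumed sampling rate
clean = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
noise = rng.standard_normal(fs)
snr_db = rng.uniform(-5.0, 20.0)              # assumed SNR range
noisy, scaled_noise = mix_at_snr(clean, noise, snr_db)
```

The sketch returns the scaled noise alongside the mixture because the output shards store the clean, noise, and noisy signals together.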
Usage
Run as a standalone script with a HyperPyYAML configuration file specifying paths to clean/noise webdataset shards, output directories, SNR ranges, audio length, and other synthesis parameters.
Code Reference
Source Location
- Repository: SpeechBrain
- File: recipes/DNS/noisyspeech_synthesizer/noisyspeech_synthesizer_singleprocess.py
Signature
def add_pyreverb(clean_speech, rir):
    """Add reverb to clean signal using FFT convolution."""
    ...

def build_audio(is_clean, params, index, audio_samples_length=-1):
    """Construct an audio signal from source files."""
    ...

def gen_audio(is_clean, params, index, audio_samples_length=-1):
    """Calls build_audio() to get an audio signal and verify activity threshold."""
    ...

def main_gen(params):
    """Generate audio signals, verify requirements, and write files to storage."""
    ...
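As a rough illustration of what "reverb using FFT convolution" means for `add_pyreverb`, the sketch below convolves a signal with an RIR in the frequency domain using NumPy only. The name `fft_reverb_sketch` is hypothetical, and trimming the result to the input length is an assumption here rather than a confirmed detail of the script.

```python
import numpy as np


def fft_reverb_sketch(clean_speech, rir):
    """Convolve `clean_speech` with `rir` via the FFT (illustrative only)."""
    clean_speech = np.asarray(clean_speech, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)

    # Zero-pad both signals to the full linear-convolution length so the
    # circular convolution implied by the FFT equals the linear one.
    n_full = len(clean_speech) + len(rir) - 1
    spectrum = np.fft.rfft(clean_speech, n_full) * np.fft.rfft(rir, n_full)
    reverberant = np.fft.irfft(spectrum, n_full)

    # Assumption: keep the output the same length as the input signal.
    return reverberant[: len(clean_speech)]


speech = np.array([1.0, 0.5, 0.0, 0.0])
rir = np.array([1.0, 0.0, 0.5])               # direct path plus one echo
wet = fft_reverb_sketch(speech, rir)          # approximately [1.0, 0.5, 0.5, 0.25]
```

For long signals, an FFT-based product costs O(n log n) rather than the O(n·m) of direct convolution, which is presumably why the script takes this route.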
Import
python noisyspeech_synthesizer_singleprocess.py hparams/synthesizer.yaml
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| params | dict | Yes | HyperPyYAML config dict with fs, audio_length, silence_length, SNR ranges, etc. |
| is_clean | bool | Yes | Whether to build a clean speech signal or noise signal |
| index | int | Yes | Current index into the data iterator |
| clean_speech | np.ndarray | Yes (add_pyreverb) | Clean speech signal array |
| rir | np.ndarray | Yes (add_pyreverb) | Room impulse response array |
| clean_data | webdataset | Yes | Webdataset shard iterator for clean speech |
| noise_data | webdataset | Yes | Webdataset shard iterator for noise |
| train_shard_destination | str | Yes | Output path for training shards |
| valid_shard_destination | str | Yes | Output path for validation shards |
Outputs
| Name | Type | Description |
|---|---|---|
| webdataset shards | .tar files | Output shards containing clean, noise, and noisy audio pairs |
| output_audio | np.ndarray | Constructed audio signal from concatenated clips |
| files_used | list | List of source file keys used in audio construction |
| clipped_files | list | List of source files that were skipped due to clipping |
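To make the `output_audio`, `files_used`, and `clipped_files` outputs concrete, here is a minimal sketch of building a fixed-length signal by concatenating clips with silence in between. It assumes clips arrive as `(key, samples)` pairs and that a clip counts as clipped when its peak exceeds a threshold. The function name, the `clip_threshold` value, and the zero-padding fallback are all assumptions, and the activity check performed by `gen_audio` is left out.

```python
import numpy as np


def build_audio_sketch(clips, fs, audio_length, silence_length,
                       clip_threshold=0.99):
    """Concatenate clips with silence gaps up to `audio_length` seconds.

    Hypothetical stand-in for the script's build_audio(); `clips` is any
    iterable of (key, samples) pairs.
    """
    target = int(fs * audio_length)
    silence = np.zeros(int(fs * silence_length))
    output_audio = np.zeros(0)
    files_used, clipped_files = [], []

    for key, samples in clips:
        if len(output_audio) >= target:
            break
        samples = np.asarray(samples, dtype=np.float64)
        if samples.size == 0:
            continue
        # Skip sources whose peak suggests clipping (assumed criterion).
        if np.max(np.abs(samples)) > clip_threshold:
            clipped_files.append(key)
            continue

        # Append as much of the clip as still fits, then a silence gap.
        remaining = target - len(output_audio)
        output_audio = np.concatenate([output_audio, samples[:remaining]])
        files_used.append(key)
        remaining = target - len(output_audio)
        if remaining > 0:
            output_audio = np.concatenate([output_audio, silence[:remaining]])

    # Assumption: pad with zeros if the sources run out early.
    if len(output_audio) < target:
        padding = np.zeros(target - len(output_audio))
        output_audio = np.concatenate([output_audio, padding])
    return output_audio, files_used, clipped_files


clips = [
    ("a", 0.5 * np.ones(8)),
    ("b", np.ones(5)),            # peak of 1.0 is treated as clipped
    ("c", 0.25 * np.ones(30)),
]
output_audio, files_used, clipped_files = build_audio_sketch(
    clips, fs=10, audio_length=2.0, silence_length=0.2
)
```

In this toy run the target is 20 samples: clip `a` contributes 8 samples followed by a 2-sample gap, `b` is skipped, and `c` is truncated to fill the remaining 10 samples.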
Usage Examples
# Run noisy speech synthesis with configuration
python noisyspeech_synthesizer_singleprocess.py hparams/synthesizer.yaml
# With overrides for output paths
python noisyspeech_synthesizer_singleprocess.py hparams/synthesizer.yaml \
    --train_shard_destination /data/dns/train_shards \
    --valid_shard_destination /data/dns/valid_shards
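Once synthesis has finished, the output shards can be inspected without any extra dependencies, since webdataset shards are ordinary tar archives. The sketch below groups member files by sample key using only the standard library. The function name is hypothetical, the file names inside the script's shards are not documented here, and the grouping rule (key is the path up to the first dot of the base name) follows the usual webdataset convention rather than anything specific to this script.

```python
import tarfile
from collections import defaultdict


def group_shard_members(shard_path):
    """Map each sample key in a tar shard to the list of its extensions."""
    samples = defaultdict(list)
    with tarfile.open(shard_path, "r") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # Split "dir/key.ext1.ext2" into key "dir/key" and "ext1.ext2".
            directory, _, base = member.name.rpartition("/")
            stem, _, extension = base.partition(".")
            key = f"{directory}/{stem}" if directory else stem
            samples[key].append(extension)
    return dict(samples)


# Example call; the shard file name below is a placeholder, not a real path:
# group_shard_members("/data/dns/train_shards/<shard-name>.tar")
```

Each key should map to several entries if the clean, noise, and noisy signals and the metadata are stored as separate files per sample.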