Implementation:Speechbrain Speechbrain Create Aishell1Mix Metadata
| Knowledge Sources | |
|---|---|
| Domains | Speech Separation, Data Preparation |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool, provided by the SpeechBrain library, for creating the Aishell1Mix metadata files used in speech separation.
Description
This script generates metadata CSV files that describe how to mix AISHELL-1 source utterances with WHAM! noise to create the Aishell1Mix speech separation dataset. It reads per-source metadata from both AISHELL-1 and WHAM! directories, randomly pairs speakers, randomizes loudness levels between -33 and -25 LUFS, and writes out CSV metadata files specifying the mixing parameters for each utterance combination. These metadata files are then consumed by the companion create_aishell1mix_from_metadata.py script to generate the actual audio mixtures.
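The pairing and loudness randomization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script's actual code: the helper names `pick_sources` and `draw_loudness`, the toy `utterances` table, the rule that each source comes from a different speaker, and the uniform distribution are all assumptions. Only the -33 to -25 LUFS range comes from the description.

```python
import random

MIN_LOUDNESS = -33.0  # LUFS, lower bound taken from the description
MAX_LOUDNESS = -25.0  # LUFS, upper bound taken from the description


def pick_sources(utterances, n_src, rng):
    """Pick n_src utterances, each from a different speaker (hypothetical helper)."""
    speakers = rng.sample(sorted(utterances), n_src)
    return [(spk, rng.choice(utterances[spk])) for spk in speakers]


def draw_loudness(n_src, rng):
    """Draw one target loudness per source (uniform sampling is an assumption)."""
    return [rng.uniform(MIN_LOUDNESS, MAX_LOUDNESS) for _ in range(n_src)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Toy speaker-to-utterance table; the entries are illustrative only.
utterances = {
    "S0002": ["BAC009S0002W0122.wav", "BAC009S0002W0123.wav"],
    "S0003": ["BAC009S0003W0121.wav"],
    "S0004": ["BAC009S0004W0121.wav", "BAC009S0004W0122.wav"],
}

sources = pick_sources(utterances, n_src=2, rng=rng)
loudness = draw_loudness(n_src=2, rng=rng)
for (spk, wav), lufs in zip(sources, loudness):
    print(f"{spk} {wav} target={lufs:.1f} LUFS")
```

Passing an explicit `random.Random` instance keeps the draw reproducible, which matters when the same metadata has to be regenerated later.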
Usage
Run this script as the first step of building the Aishell1Mix dataset: it produces the metadata that must exist before the mixed audio files for speech separation training with SpeechBrain can be created.
Code Reference
Source Location
Signature
def create_aishell1mix_metadata(
    aishell1_dir, aishell1_md_dir, wham_dir, wham_md_dir, md_dir, n_src
):
CLI Interface
python create_aishell1mix_metadata.py \
    --aishell1_dir /path/to/aishell1/data_aishell/wav \
    --aishell1_md_dir /path/to/metadata/aishell1 \
    --wham_dir /path/to/wham_noise \
    --wham_md_dir /path/to/metadata/wham_noise \
    --metadata_outdir /path/to/aishell1mix/metadata \
    --n_src 2
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| aishell1_dir | str | Yes | Path to AISHELL-1 root directory containing WAV files |
| aishell1_md_dir | str | Yes | Path to AISHELL-1 metadata directory |
| wham_dir | str | Yes | Path to WHAM! noise root directory |
| wham_md_dir | str | Yes | Path to WHAM! noise metadata directory |
| metadata_outdir | str | Yes | Directory where the Aishell1Mix metadata files will be written (passed as md_dir when calling the function directly) |
| n_src | int | No | Number of sources mixed into each mixture (default: 2) |
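A command-line parser matching the table above might look like the sketch below. It is reconstructed from the table and the CLI example only, so the help strings, the `build_parser` name, and the exact required/default settings are assumptions rather than the script's real parser.

```python
import argparse


def build_parser():
    """Build a parser mirroring the Inputs table (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(description="Create Aishell1Mix metadata (sketch)")
    parser.add_argument("--aishell1_dir", type=str, required=True,
                        help="AISHELL-1 root directory containing WAV files")
    parser.add_argument("--aishell1_md_dir", type=str, required=True,
                        help="AISHELL-1 metadata directory")
    parser.add_argument("--wham_dir", type=str, required=True,
                        help="WHAM! noise root directory")
    parser.add_argument("--wham_md_dir", type=str, required=True,
                        help="WHAM! noise metadata directory")
    parser.add_argument("--metadata_outdir", type=str, required=True,
                        help="Where the Aishell1Mix metadata files are stored")
    parser.add_argument("--n_src", type=int, default=2,
                        help="Number of sources per mixture")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args([
    "--aishell1_dir", "/path/to/aishell1/data_aishell/wav",
    "--aishell1_md_dir", "/path/to/metadata/aishell1",
    "--wham_dir", "/path/to/wham_noise",
    "--wham_md_dir", "/path/to/metadata/wham_noise",
    "--metadata_outdir", "/path/to/aishell1mix/metadata",
])
print(args.metadata_outdir, args.n_src)
```

Note that the CLI flag is `--metadata_outdir`, while the function signature names the same value `md_dir`.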
Outputs
| Name | Type | Description |
|---|---|---|
| Metadata CSV files | CSV Files | Metadata files describing mixing parameters for each utterance combination across train/dev/test splits |
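Once the metadata has been written, a quick sanity check is to list each CSV with its header and row count. The sketch below uses only the standard library and makes no assumption about the real column names; the `summarize_metadata` helper, the toy file name, and the toy columns are hypothetical and exist only to make the example runnable.

```python
import csv
import tempfile
from pathlib import Path


def summarize_metadata(md_dir):
    """Return {file name: (header, row count)} for every CSV in md_dir."""
    summary = {}
    for csv_path in sorted(Path(md_dir).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])  # empty list if the file has no rows
            n_rows = sum(1 for _ in reader)  # remaining lines are data rows
        summary[csv_path.name] = (header, n_rows)
    return summary


# Demo on a toy file; the file name and column names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    toy = Path(tmp) / "toy_mix_2_spk.csv"
    with toy.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["mixture_id", "source_1_path", "source_2_path", "noise_path"])
        writer.writerow(["mix_0", "a.wav", "b.wav", "n.wav"])
    summary = summarize_metadata(tmp)

for name, (header, n_rows) in summary.items():
    print(name, header, n_rows)
```

In practice, `summarize_metadata` would be pointed at the directory given as `--metadata_outdir`.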
Usage Examples
# Typically invoked via command line:
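# python create_aishell1mix_metadata.py --aishell1_dir /data/aishell1/data_aishell/wav --aishell1_md_dir /data/metadata/aishell1 --wham_dir /data/wham_noise --wham_md_dir /data/metadata/wham_noise --metadata_outdir /data/aishell1mix/metadata --n_src 2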
# Or programmatically (the script's directory must be on the Python import path):
from create_aishell1mix_metadata import create_aishell1mix_metadata

create_aishell1mix_metadata(
    aishell1_dir="/data/aishell1/data_aishell/wav",
    aishell1_md_dir="/data/metadata/aishell1",
    wham_dir="/data/wham_noise",
    wham_md_dir="/data/metadata/wham_noise",
    md_dir="/data/aishell1mix/metadata",  # same value as --metadata_outdir on the CLI
    n_src=2,
)
Related Pages
- Implementation:Speechbrain_Speechbrain_Create_Aishell1Mix_From_Metadata -- Companion script for generating audio from metadata
- Principle:Speechbrain_Speechbrain_Dataset_Specific_Data_Preparation