Implementation:Speechbrain Speechbrain Prepare Librimix
| Field | Value |
|---|---|
| Implementation Name | Prepare_Librimix |
| API | prepare_librimix(datapath, savepath, n_spks=2, skip_prep=False, librimix_addnoise=False, fs=8000) |
| Source | recipes/LibriMix/prepare_data.py:L12-51 |
| Import | from prepare_data import prepare_librimix |
| Type | API Doc |
| Related Principle | Principle:Speechbrain_Speechbrain_Mixture_Dataset_Preparation |
Purpose
The prepare_librimix function generates CSV manifest files that map LibriMix mixture audio files to their constituent clean source files. These manifests are required by SpeechBrain's DynamicItemDataset to load training, validation, and test data during the speech separation workflow.
Function Signature
def prepare_librimix(
    datapath,
    savepath,
    n_spks=2,
    skip_prep=False,
    librimix_addnoise=False,
    fs=8000,
):
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| datapath | str | (required) | Path to the LibriMix dataset root directory (e.g., /data/Libri2Mix). Must contain "Libri" in the path. |
| savepath | str | (required) | Directory where the output CSV files are written. |
| n_spks | int | 2 | Number of speakers in the mixture (2 or 3). Must match the dataset: 2 requires "Libri2Mix" in datapath, 3 requires "Libri3Mix". |
| skip_prep | bool | False | If True, skip data preparation entirely (useful when the CSV files already exist). |
| librimix_addnoise | bool | False | If True, use noisy mixtures (mix_both/) instead of clean mixtures (mix_clean/). |
| fs | int | 8000 | Sample rate in Hz. Determines the version subdirectory (wav8k/min/ by default). |
Inputs
The function expects the LibriMix dataset to be organized in the following directory structure:
{datapath}/
  wav8k/min/
    train-360/
      mix_clean/   # Clean mixtures (used when librimix_addnoise=False)
      mix_both/    # Noisy mixtures with WHAM! noise (used when librimix_addnoise=True)
      s1/          # First speaker source files
      s2/          # Second speaker source files
      s3/          # Third speaker source files (Libri3Mix only)
      noise/       # WHAM! noise files
    dev/
      ...
    test/
      ...
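A quick pre-flight check of this layout can save a failed run. The following is a minimal sketch, not part of the recipe: `expected_subdirs` and `missing_subdirs` are hypothetical helper names, and the assumption that `noise/` is needed even for clean mixtures comes only from the `noise_wav` column listed under Outputs.

```python
import os


def expected_subdirs(datapath, n_spks=2, addnoise=False,
                     version="wav8k/min", split="train-360"):
    """Return the directories the layout above implies for one split.

    Hypothetical helper for illustration; it is not defined in prepare_data.py.
    """
    mix_dir = "mix_both" if addnoise else "mix_clean"
    # One source directory per speaker (s1, s2, and s3 for Libri3Mix).
    names = [mix_dir] + [f"s{i}" for i in range(1, n_spks + 1)] + ["noise"]
    return [os.path.join(datapath, version, split, name) for name in names]


def missing_subdirs(datapath, **kwargs):
    """List the expected directories that do not exist on disk."""
    return [d for d in expected_subdirs(datapath, **kwargs) if not os.path.isdir(d)]
```

Calling `missing_subdirs("/data/Libri2Mix", n_spks=2)` before preparation returns an empty list when the train-360 split is complete.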
Outputs
The function produces CSV manifest files in the savepath directory:
For 2-speaker mixtures (n_spks=2):
- libri2mix_train-360.csv
- libri2mix_dev.csv
- libri2mix_test.csv
For 3-speaker mixtures (n_spks=3):
- libri3mix_train-360.csv
- libri3mix_dev.csv
- libri3mix_test.csv
Each CSV file contains the following columns:
| Column | Description |
|---|---|
| ID | Integer index of the example |
| duration | Duration placeholder (set to 1.0) |
| mix_wav | Absolute path to the mixture waveform |
| mix_wav_format | Audio format string ("wav") |
| mix_wav_opts | Additional options (None) |
| s1_wav | Absolute path to first speaker source |
| s1_wav_format | Audio format string ("wav") |
| s1_wav_opts | Additional options (None) |
| s2_wav | Absolute path to second speaker source |
| s2_wav_format | Audio format string ("wav") |
| s2_wav_opts | Additional options (None) |
| s3_wav | (3-speaker only) Absolute path to third speaker source |
| noise_wav | Absolute path to WHAM! noise file |
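To inspect a generated manifest outside SpeechBrain, the standard library's csv module is enough. This is a minimal sketch: `load_manifest` and `source_paths` are hypothetical helpers, and only the column names come from the table above.

```python
import csv


def load_manifest(csv_path):
    """Read a manifest into a list of dicts keyed by column name.

    Hypothetical helper; csv.DictReader returns every value as a string,
    so ID and duration come back as "0" and "1.0" rather than numbers.
    """
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))


def source_paths(row, n_spks=2):
    """Collect the per-speaker source paths (s1_wav, s2_wav, ...) of one row."""
    return [row[f"s{i}_wav"] for i in range(1, n_spks + 1)]
```

For a 3-speaker manifest, `source_paths(row, n_spks=3)` also reads the `s3_wav` column.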
Internal Dispatch
The function dispatches to one of two internal helpers based on the number of speakers:
if n_spks == 2:
    create_libri2mix_csv(datapath, savepath, addnoise=librimix_addnoise)
elif n_spks == 3:
    create_libri3mix_csv(datapath, savepath, addnoise=librimix_addnoise)
Both helpers iterate over the three dataset splits (train-360, dev, test) and generate one CSV per split.
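The per-split loop can be sketched as follows. This is an illustrative approximation rather than the recipe's code: `write_split_manifests` is a hypothetical name, the column set simply mirrors the table under Outputs, and it assumes each source and noise file shares the filename of its mixture.

```python
import csv
import os


def write_split_manifests(datapath, savepath, n_spks=2, addnoise=False,
                          version="wav8k/min",
                          splits=("train-360", "dev", "test")):
    """Write one CSV per split and return the written paths (hypothetical helper)."""
    mix_dir = "mix_both" if addnoise else "mix_clean"
    # (column prefix, subdirectory) pairs that get path, format, and opts columns.
    tracks = [("mix", mix_dir)] + [(f"s{i}", f"s{i}") for i in range(1, n_spks + 1)]
    columns = ["ID", "duration"]
    for prefix, _ in tracks:
        columns += [f"{prefix}_wav", f"{prefix}_wav_format", f"{prefix}_wav_opts"]
    columns.append("noise_wav")

    os.makedirs(savepath, exist_ok=True)
    written = []
    for split in splits:
        split_root = os.path.join(datapath, version, split)
        # Assumption: the mixture directory defines the list of examples.
        names = sorted(os.listdir(os.path.join(split_root, mix_dir)))
        out_path = os.path.join(savepath, f"libri{n_spks}mix_{split}.csv")
        with open(out_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=columns)
            writer.writeheader()
            for idx, name in enumerate(names):
                row = {"ID": idx, "duration": 1.0}  # duration is a placeholder
                for prefix, sub in tracks:
                    row[f"{prefix}_wav"] = os.path.join(split_root, sub, name)
                    row[f"{prefix}_wav_format"] = "wav"
                    row[f"{prefix}_wav_opts"] = None  # written as an empty cell
                row["noise_wav"] = os.path.join(split_root, "noise", name)
                writer.writerow(row)
        written.append(out_path)
    return written
```

Parameterizing on `n_spks` keeps the 2- and 3-speaker cases in one function here, whereas the recipe, as described above, uses two separate helpers.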
Usage Example
from prepare_data import prepare_librimix

# Prepare 2-speaker clean LibriMix CSV manifests
prepare_librimix(
    datapath="/data/Libri2Mix",
    savepath="/output/save",
    n_spks=2,
    skip_prep=False,
    librimix_addnoise=False,
    fs=8000,
)

# Prepare 3-speaker noisy LibriMix CSV manifests
prepare_librimix(
    datapath="/data/Libri3Mix",
    savepath="/output/save",
    n_spks=3,
    skip_prep=False,
    librimix_addnoise=True,
    fs=8000,
)
Integration with the Training Script
In the main training script, prepare_librimix is called via run_on_main to ensure it executes only on the primary process during distributed training:
from speechbrain.utils.distributed import run_on_main
from prepare_data import prepare_librimix

run_on_main(
    prepare_librimix,
    kwargs={
        "datapath": hparams["data_folder"],
        "savepath": hparams["save_folder"],
        "n_spks": hparams["num_spks"],
        "skip_prep": hparams["skip_prep"],
        "librimix_addnoise": hparams["use_wham_noise"],
        "fs": hparams["sample_rate"],
    },
)
Key Implementation Details
- The function validates that datapath contains "Libri" and that the number of speakers is consistent with the dataset variant (Libri2Mix vs. Libri3Mix).
- The skip_prep flag allows bypassing preparation when CSV files already exist from a previous run.
- The default version subdirectory is wav8k/min/, corresponding to 8 kHz minimum-length mixtures.
- Duration is set to a placeholder value of 1.0 for all entries; actual duration is determined at audio loading time.
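The validation described in the first bullet can be sketched as below. `check_librimix_args` is a hypothetical name, and raising ValueError in every case is an assumption made for illustration; the recipe may signal these errors differently.

```python
def check_librimix_args(datapath, n_spks):
    """Validate datapath and n_spks the way the bullets above describe.

    Hypothetical helper; the exception type and messages are assumptions.
    """
    if "Libri" not in datapath:
        raise ValueError("Unsupported dataset: datapath must contain 'Libri'")
    if n_spks not in (2, 3):
        raise ValueError("Unsupported number of speakers: expected 2 or 3")
    expected = f"Libri{n_spks}Mix"
    if expected not in datapath:
        raise ValueError(
            f"Inconsistent number of speakers and datapath: expected '{expected}'"
        )
```

With this sketch, `check_librimix_args("/data/Libri2Mix", 3)` fails early instead of producing manifests that point at the wrong dataset variant.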
Source File
recipes/LibriMix/prepare_data.py