Implementation:Speechbrain Speechbrain Prepare Librimix


| Field | Value |
|---|---|
| Implementation Name | Prepare_Librimix |
| API | prepare_librimix(datapath, savepath, n_spks=2, skip_prep=False, librimix_addnoise=False, fs=8000) |
| Source | recipes/LibriMix/prepare_data.py:L12-51 |
| Import | from prepare_data import prepare_librimix |
| Type | API Doc |
| Related Principle | Principle:Speechbrain_Speechbrain_Mixture_Dataset_Preparation |

Purpose

The prepare_librimix function generates CSV manifest files that map LibriMix mixture audio files to their constituent clean source files. These manifests are required by SpeechBrain's DynamicItemDataset to load training, validation, and test data during the speech separation workflow.

Function Signature

def prepare_librimix(
    datapath,
    savepath,
    n_spks=2,
    skip_prep=False,
    librimix_addnoise=False,
    fs=8000,
):

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| datapath | str | (required) | Path to the LibriMix dataset root directory (e.g., /data/Libri2Mix). The path must contain "Libri". |
| savepath | str | (required) | Directory where the output CSV files are written. |
| n_spks | int | 2 | Number of speakers in the mixture (2 or 3). Must match the dataset: 2 requires "Libri2Mix" in datapath, 3 requires "Libri3Mix". |
| skip_prep | bool | False | If True, skip data preparation entirely (useful when the CSV files already exist). |
| librimix_addnoise | bool | False | If True, use noisy mixtures (mix_both/) instead of clean mixtures (mix_clean/). |
| fs | int | 8000 | Sample rate in Hz. The default of 8000 corresponds to the wav8k/min/ version subdirectory. Note that the dispatch code shown under Internal Dispatch does not forward fs to the helpers. |

Inputs

The function expects the LibriMix dataset to be organized in the following directory structure:

{datapath}/
  wav8k/min/
    train-360/
      mix_clean/   # Clean mixtures (used when librimix_addnoise=False)
      mix_both/    # Noisy mixtures with WHAM! noise (used when librimix_addnoise=True)
      s1/          # First speaker source files
      s2/          # Second speaker source files
      s3/          # Third speaker source (Libri3Mix only)
      noise/       # WHAM! noise files
    dev/
      ...
    test/
      ...
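
As a rough pre-flight check, the layout above can be verified before calling prepare_librimix. The sketch below is not part of the SpeechBrain recipe: expected_split_dirs and missing_dirs are hypothetical helper names, and the wav8k/min default simply mirrors the tree shown here.

```python
import os

# Split names taken from the directory tree above.
SPLITS = ("train-360", "dev", "test")


def expected_split_dirs(datapath, n_spks=2, addnoise=False, version="wav8k/min"):
    """Return {split: {role: directory}} for the layout described above.

    Hypothetical helper for illustration; not a SpeechBrain function.
    """
    mix_dir = "mix_both" if addnoise else "mix_clean"
    layout = {}
    for split in SPLITS:
        base = os.path.join(datapath, version, split)
        dirs = {"mix": os.path.join(base, mix_dir)}
        # One source folder per speaker: s1, s2 (and s3 for Libri3Mix).
        for k in range(1, n_spks + 1):
            dirs[f"s{k}"] = os.path.join(base, f"s{k}")
        dirs["noise"] = os.path.join(base, "noise")
        layout[split] = dirs
    return layout


def missing_dirs(layout):
    """List every expected directory that does not exist on disk."""
    return sorted(
        path
        for dirs in layout.values()
        for path in dirs.values()
        if not os.path.isdir(path)
    )
```

Calling missing_dirs(expected_split_dirs("/data/Libri2Mix")) returns an empty list when every expected folder is present, which makes a missing or misnamed split easy to spot before preparation starts.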

Outputs

The function produces CSV manifest files in the savepath directory:

For 2-speaker mixtures (n_spks=2):

  • libri2mix_train-360.csv
  • libri2mix_dev.csv
  • libri2mix_test.csv

For 3-speaker mixtures (n_spks=3):

  • libri3mix_train-360.csv
  • libri3mix_dev.csv
  • libri3mix_test.csv

Each CSV file contains the following columns:

| Column | Description |
|---|---|
| ID | Integer index of the example |
| duration | Duration placeholder (set to 1.0) |
| mix_wav | Absolute path to the mixture waveform |
| mix_wav_format | Audio format string ("wav") |
| mix_wav_opts | Additional options (None) |
| s1_wav | Absolute path to first speaker source |
| s1_wav_format | Audio format string ("wav") |
| s1_wav_opts | Additional options (None) |
| s2_wav | Absolute path to second speaker source |
| s2_wav_format | Audio format string ("wav") |
| s2_wav_opts | Additional options (None) |
| s3_wav | (3-speaker only) Absolute path to third speaker source |
| noise_wav | Absolute path to WHAM! noise file |
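
To make the row format concrete, here is a minimal sketch that writes and reads one 2-speaker manifest row using only the standard csv module. It assumes exactly the columns listed above (the real file may contain more), and make_row, write_manifest, and read_manifest are hypothetical names, not functions from the recipe.

```python
import csv
import io

# Columns as listed above for the 2-speaker case (assumption: no others).
COLUMNS_2SPK = [
    "ID", "duration",
    "mix_wav", "mix_wav_format", "mix_wav_opts",
    "s1_wav", "s1_wav_format", "s1_wav_opts",
    "s2_wav", "s2_wav_format", "s2_wav_opts",
    "noise_wav",
]


def make_row(idx, mix, s1, s2, noise):
    """Build one manifest row; duration is the 1.0 placeholder."""
    row = {"ID": idx, "duration": 1.0, "noise_wav": noise}
    for name, path in (("mix", mix), ("s1", s1), ("s2", s2)):
        row[f"{name}_wav"] = path
        row[f"{name}_wav_format"] = "wav"
        row[f"{name}_wav_opts"] = None
    return row


def write_manifest(rows, handle):
    """Write a header plus rows to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS_2SPK)
    writer.writeheader()
    writer.writerows(rows)


def read_manifest(handle):
    """Read the manifest back as a list of dicts (all values are strings)."""
    return list(csv.DictReader(handle))


# Round trip through an in-memory buffer instead of a file on disk.
buffer = io.StringIO(newline="")
write_manifest(
    [make_row(0, "/d/mix_clean/a.wav", "/d/s1/a.wav", "/d/s2/a.wav", "/d/noise/a.wav")],
    buffer,
)
buffer.seek(0)
rows = read_manifest(buffer)
```

Note that the csv module writes None as an empty field, so the *_opts columns come back as empty strings, and numeric fields such as ID and duration come back as text.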

Internal Dispatch

The function dispatches to one of two internal helpers based on the number of speakers:

if n_spks == 2:
    create_libri2mix_csv(datapath, savepath, addnoise=librimix_addnoise)
elif n_spks == 3:
    create_libri3mix_csv(datapath, savepath, addnoise=librimix_addnoise)

Both helpers iterate over the three dataset splits (train-360, dev, test) and generate one CSV per split.
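
The per-split naming can be sketched as follows. This is an illustration under stated assumptions rather than the recipe's code: manifest_paths and manifests_exist are hypothetical helpers, and the file-name pattern is inferred from the output names listed under Outputs.

```python
import os

# The three splits the helpers iterate over.
SPLITS = ("train-360", "dev", "test")


def manifest_paths(savepath, n_spks):
    """Map each split to the CSV path expected for that split."""
    return {
        split: os.path.join(savepath, f"libri{n_spks}mix_{split}.csv")
        for split in SPLITS
    }


def manifests_exist(savepath, n_spks):
    """Return True only if all three manifests are already on disk."""
    return all(os.path.isfile(p) for p in manifest_paths(savepath, n_spks).values())
```

A check like manifests_exist could inform whether passing skip_prep=True is safe on a given machine, since skipping preparation only makes sense when the CSV files from a previous run are still present.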

Usage Example

from prepare_data import prepare_librimix

# Prepare 2-speaker clean LibriMix CSV manifests
prepare_librimix(
    datapath="/data/Libri2Mix",
    savepath="/output/save",
    n_spks=2,
    skip_prep=False,
    librimix_addnoise=False,
    fs=8000,
)

# Prepare 3-speaker noisy LibriMix CSV manifests
prepare_librimix(
    datapath="/data/Libri3Mix",
    savepath="/output/save",
    n_spks=3,
    skip_prep=False,
    librimix_addnoise=True,
    fs=8000,
)

Integration with the Training Script

In the main training script, prepare_librimix is called via run_on_main to ensure it executes only on the primary process during distributed training:

from speechbrain.utils.distributed import run_on_main
from prepare_data import prepare_librimix

run_on_main(
    prepare_librimix,
    kwargs={
        "datapath": hparams["data_folder"],
        "savepath": hparams["save_folder"],
        "n_spks": hparams["num_spks"],
        "skip_prep": hparams["skip_prep"],
        "librimix_addnoise": hparams["use_wham_noise"],
        "fs": hparams["sample_rate"],
    },
)

Key Implementation Details

  • The function validates that the datapath contains "Libri" and that the number of speakers is consistent with the dataset variant (Libri2Mix vs Libri3Mix)
  • The skip_prep flag allows bypassing preparation when CSV files already exist from a previous run
  • The default version subdirectory is wav8k/min/, corresponding to 8 kHz minimum-length mixtures
  • Duration is set to a placeholder value of 1.0 for all entries; actual duration is determined at audio loading time
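
The validation described in the first bullet can be sketched roughly as below. The function name check_librimix_args, the use of ValueError, and the error messages are assumptions for illustration; the actual recipe may signal these conditions differently.

```python
def check_librimix_args(datapath, n_spks):
    """Validate datapath against n_spks and return the expected variant name.

    Hypothetical helper mirroring the checks described above.
    """
    if "Libri" not in datapath:
        raise ValueError(f"Unsupported dataset: 'Libri' not found in {datapath!r}")
    if n_spks not in (2, 3):
        raise ValueError(f"Unsupported number of speakers: {n_spks}")
    variant = f"Libri{n_spks}Mix"
    if variant not in datapath:
        raise ValueError(
            f"n_spks={n_spks} expects {variant!r} in datapath, got {datapath!r}"
        )
    return variant
```

For example, check_librimix_args("/data/Libri2Mix", 2) returns "Libri2Mix", while pairing the same path with n_spks=3 raises an error before any file is read.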

Source File

recipes/LibriMix/prepare_data.py
