Implementation:Speechbrain Speechbrain Prepare AudioMNIST

Knowledge Sources	SpeechBrain
Domains	Audio Classification, Data Preparation
Last Updated	2026-02-09 00:00 GMT

Overview

Concrete tool, provided by the SpeechBrain library, for preparing the AudioMNIST dataset for audio digit classification.

Description

This script prepares the AudioMNIST dataset (spoken digit recordings) for training audio classification models. It can automatically download both the AudioMNIST audio data from GitHub and the associated metadata from Dropbox. The script processes the audio files, resamples them from the source sample rate (48 kHz by default) to the target sample rate when the two differ, optionally trims silent regions, and generates JSON manifest files for the train/valid/test splits that SpeechBrain training pipelines can consume.
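
The silence-trimming step can be illustrated with a minimal sketch. This is not the recipe's implementation: the function name trim_silence is hypothetical, and measuring the threshold relative to the peak amplitude is an assumption made for illustration. Only the default of -30.0 dB is taken from the signature documented on this page.

```python
def trim_silence(samples, threshold_db=-30.0):
    """Drop leading and trailing samples quieter than threshold_db.

    Assumption: the threshold is measured relative to the peak amplitude
    of the clip. The actual recipe may use a different reference level.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # An empty or all-zero clip has nothing above any threshold.
        return []
    # Convert the dB threshold into a linear amplitude cutoff.
    cutoff = peak * 10.0 ** (threshold_db / 20.0)
    loud = [i for i, s in enumerate(samples) if abs(s) >= cutoff]
    # Keep everything between the first and last loud sample, inclusive.
    return samples[loud[0]:loud[-1] + 1]
```

With the default threshold, any sample below roughly 3.2% of the peak amplitude counts as silence at the clip boundaries. Quiet samples between two loud ones are kept.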

Usage

Use this when preparing the AudioMNIST dataset for spoken digit classification training with SpeechBrain recipes.

Code Reference

Source Location

Repository: SpeechBrain
File: recipes/AudioMNIST/audiomnist_prepare.py

Signature

def prepare_audiomnist(
    data_folder,
    save_folder,
    train_json,
    valid_json,
    test_json,
    metadata_folder=None,
    splits=DEFAULT_SPLITS,
    download=True,
    audiomnist_repo=None,
    metadata_repo=None,
    src_sample_rate=DEFAULT_SRC_SAMPLE_RATE,
    tgt_sample_rate=DEFAULT_TGT_SAMPLE_RATE,
    trim=True,
    trim_threshold=-30.0,
):

Import

from audiomnist_prepare import prepare_audiomnist
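
Because the script lives in the recipe directory rather than inside the installed package, the import above only resolves when that directory is on the Python path (for example, when running from recipes/AudioMNIST). A minimal sketch of one way to arrange this from elsewhere follows. The helper name add_recipe_to_path and the checkout location passed to it are hypothetical.

```python
import sys
from pathlib import Path


def add_recipe_to_path(speechbrain_root):
    """Put recipes/AudioMNIST of a SpeechBrain checkout on sys.path.

    speechbrain_root is the (hypothetical) location of a local clone.
    """
    recipe_dir = Path(speechbrain_root) / "recipes" / "AudioMNIST"
    recipe_str = str(recipe_dir)
    if recipe_str not in sys.path:
        # Insert at the front so this recipe's module wins name clashes.
        sys.path.insert(0, recipe_str)
    return recipe_dir
```

After calling add_recipe_to_path with the path to a local clone, the import statement shown above should work from any working directory.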

I/O Contract

Inputs

Name	Type	Required	Description
data_folder	str	Yes	Path to the folder where AudioMNIST data is stored (or will be downloaded)
save_folder	str	Yes	Where to write prepared JSON manifest files
train_json	str	Yes	Path for the output train JSON manifest
valid_json	str	Yes	Path for the output validation JSON manifest
test_json	str	Yes	Path for the output test JSON manifest
metadata_folder	str	No	Path to metadata folder (default: None, auto-detected)
splits	list	No	List of splits to prepare (default: ["train", "valid", "test"])
download	bool	No	Whether to automatically download the dataset (default: True)
audiomnist_repo	str	No	URL of the AudioMNIST repository (default: GitHub URL)
metadata_repo	str	No	URL of the metadata archive (default: Dropbox URL)
src_sample_rate	int	No	Source audio sample rate (default: 48000)
tgt_sample_rate	int	No	Target audio sample rate (default: 48000)
trim	bool	No	Whether to trim silent regions (default: True)
trim_threshold	float	No	Silence threshold in dB for trimming (default: -30.0)

Outputs

Name	Type	Description
train_json	JSON File	Train split manifest with utterance IDs, file paths, and labels
valid_json	JSON File	Validation split manifest
test_json	JSON File	Test split manifest
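
To make the manifest layout concrete, here is a minimal sketch of how a manifest keyed by utterance ID could be assembled and written. The field names file_name and digit, the example ID, and both helper functions are assumptions for illustration; the keys the recipe actually writes may differ.

```python
import json


def build_manifest(records):
    """Map each utterance ID to its file path and label.

    records is an iterable of (utt_id, wav_path, digit) tuples.
    The keys "file_name" and "digit" are hypothetical.
    """
    return {
        utt_id: {"file_name": wav_path, "digit": digit}
        for utt_id, wav_path, digit in records
    }


def write_manifest(manifest, path):
    """Serialize the manifest as an indented JSON object."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(manifest, f, indent=2)
```

Keying the top-level object by utterance ID follows the usual layout of SpeechBrain JSON manifests, where each entry is looked up by its ID.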

Usage Examples

from audiomnist_prepare import prepare_audiomnist

prepare_audiomnist(
    data_folder="/path/to/AudioMNIST",
    save_folder="/path/to/output",
    train_json="/path/to/output/train.json",
    valid_json="/path/to/output/valid.json",
    test_json="/path/to/output/test.json",
    download=True,
    trim=True,
)
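
Once preparation has finished, a quick sanity check is to count the entries in each manifest. The sketch below assumes each manifest is a single JSON object keyed by utterance ID; the helper name manifest_sizes is hypothetical.

```python
import json


def manifest_sizes(paths):
    """Return the number of entries in each manifest.

    paths maps a split name (e.g. "train") to a JSON manifest path.
    Assumes the top level of each file is an object keyed by utterance ID.
    """
    sizes = {}
    for split, path in paths.items():
        with open(path, encoding="utf-8") as f:
            sizes[split] = len(json.load(f))
    return sizes
```

For the call above, paths would map "train", "valid", and "test" to the three JSON paths passed to prepare_audiomnist. An unexpectedly empty split usually points to a wrong data_folder or a failed download.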

Related Pages

Principle:Speechbrain_Speechbrain_Dataset_Specific_Data_Preparation
