Implementation:Speechbrain Speechbrain Prepare AudioMNIST
| Knowledge Sources | |
|---|---|
| Domains | Audio Classification, Data Preparation |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool, provided by the SpeechBrain library, for preparing the AudioMNIST dataset for audio digit classification.
Description
This script prepares the AudioMNIST dataset (spoken digit recordings) for training audio classification models. It can automatically download both the AudioMNIST audio data from GitHub and the associated metadata from Dropbox. The script processes the audio files, optionally resamples them from the source sample rate (48 kHz by default) to a target sample rate, optionally trims silent regions, and generates JSON manifest files for the train/valid/test splits, suitable for SpeechBrain training pipelines.
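The silence-trimming step described above can be pictured with a minimal sketch. This is not the SpeechBrain implementation: it assumes the threshold is interpreted in dB relative to the peak amplitude and applied sample by sample, and `trim_silence` is a hypothetical helper introduced only for illustration.

```python
def trim_silence(samples, threshold_db=-30.0):
    """Drop leading/trailing samples quieter than threshold_db relative to the peak.

    Hypothetical helper for illustration; not part of audiomnist_prepare.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return []  # empty or all-silent input: nothing to keep
    # Convert the dB threshold into a linear amplitude relative to the peak.
    cutoff = peak * 10.0 ** (threshold_db / 20.0)
    # Indices of samples at or above the cutoff.
    loud = [i for i, s in enumerate(samples) if abs(s) >= cutoff]
    # Keep everything between the first and last loud sample, inclusive.
    return samples[loud[0]: loud[-1] + 1]


signal = [0.0, 0.001, 0.5, -0.8, 0.3, 0.002, 0.0]
trimmed = trim_silence(signal)
print(trimmed)
```

With a -30 dB threshold, anything below roughly 3% of the peak amplitude at the edges of the signal is treated as silence. A real implementation would more likely measure energy over short frames than over individual samples.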
Usage
Use this when preparing the AudioMNIST dataset for spoken digit classification training with SpeechBrain recipes.
Code Reference
Source Location
- Repository: SpeechBrain
- File: recipes/AudioMNIST/audiomnist_prepare.py
Signature
def prepare_audiomnist(
    data_folder,
    save_folder,
    train_json,
    valid_json,
    test_json,
    metadata_folder=None,
    splits=DEFAULT_SPLITS,
    download=True,
    audiomnist_repo=None,
    metadata_repo=None,
    src_sample_rate=DEFAULT_SRC_SAMPLE_RATE,
    tgt_sample_rate=DEFAULT_TGT_SAMPLE_RATE,
    trim=True,
    trim_threshold=-30.0,
):
Import
from audiomnist_prepare import prepare_audiomnist
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| data_folder | str | Yes | Path to the folder where AudioMNIST data is stored (or will be downloaded) |
| save_folder | str | Yes | Where to write prepared JSON manifest files |
| train_json | str | Yes | Path for the output train JSON manifest |
| valid_json | str | Yes | Path for the output validation JSON manifest |
| test_json | str | Yes | Path for the output test JSON manifest |
| metadata_folder | str | No | Path to metadata folder (default: None, auto-detected) |
| splits | list | No | List of splits to prepare (default: ["train", "valid", "test"]) |
| download | bool | No | Whether to automatically download the dataset (default: True) |
| audiomnist_repo | str | No | URL of the AudioMNIST repository (default: GitHub URL) |
| metadata_repo | str | No | URL of the metadata archive (default: Dropbox URL) |
| src_sample_rate | int | No | Source audio sample rate (default: 48000) |
| tgt_sample_rate | int | No | Target audio sample rate (default: 48000) |
| trim | bool | No | Whether to trim silent regions (default: True) |
| trim_threshold | float | No | Silence threshold in dB for trimming (default: -30.0) |
Outputs
| Name | Type | Description |
|---|---|---|
| train_json | JSON File | Train split manifest with utterance IDs, file paths, and labels |
| valid_json | JSON File | Validation split manifest |
| test_json | JSON File | Test split manifest |
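As a quick sanity check on a generated manifest, the label distribution can be counted with a short sketch like the one below. It assumes the manifest is a JSON object keyed by utterance ID, and the field names `file_name` and `digit` are hypothetical placeholders; the actual keys should be checked against the files the script writes.

```python
import json
import os
import tempfile
from collections import Counter


def count_labels(manifest_path, label_key="digit"):
    """Count how many utterances carry each label in a JSON manifest.

    Assumes a {utterance_id: {...fields...}} layout; label_key is a
    hypothetical field name, not confirmed by the SpeechBrain recipe.
    """
    with open(manifest_path, encoding="utf-8") as f:
        manifest = json.load(f)
    return Counter(entry[label_key] for entry in manifest.values())


# Toy manifest standing in for a real train.json (illustrative values only).
toy_manifest = {
    "utt_0001": {"file_name": "/path/to/a.wav", "digit": 0},
    "utt_0002": {"file_name": "/path/to/b.wav", "digit": 0},
    "utt_0003": {"file_name": "/path/to/c.wav", "digit": 7},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    manifest_path = os.path.join(tmp_dir, "train.json")
    with open(manifest_path, "w", encoding="utf-8") as f:
        json.dump(toy_manifest, f)
    label_counts = count_labels(manifest_path)

print(label_counts)
```

A heavily skewed count for one split would suggest that the split definition or the metadata was not picked up as intended.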
Usage Examples
from audiomnist_prepare import prepare_audiomnist

prepare_audiomnist(
    data_folder="/path/to/AudioMNIST",
    save_folder="/path/to/output",
    train_json="/path/to/output/train.json",
    valid_json="/path/to/output/valid.json",
    test_json="/path/to/output/test.json",
    download=True,
    trim=True,
)
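When the three manifests live in the same output folder, the call's arguments can be assembled in one place. The sketch below is one possible convenience wrapper, not part of the recipe: `build_prepare_kwargs` is a hypothetical helper, and the 16000 Hz target rate is an illustrative value rather than a library default. The keyword names themselves come from the signature documented above.

```python
import os


def build_prepare_kwargs(data_folder, save_folder, tgt_sample_rate=16000):
    """Assemble keyword arguments for prepare_audiomnist.

    Hypothetical helper: derives the three manifest paths from save_folder
    and requests resampling to tgt_sample_rate (illustrative value).
    """
    return {
        "data_folder": data_folder,
        "save_folder": save_folder,
        "train_json": os.path.join(save_folder, "train.json"),
        "valid_json": os.path.join(save_folder, "valid.json"),
        "test_json": os.path.join(save_folder, "test.json"),
        "tgt_sample_rate": tgt_sample_rate,
        "trim": True,
        "trim_threshold": -30.0,
    }


kwargs = build_prepare_kwargs("/path/to/AudioMNIST", "/path/to/output")
print(kwargs["train_json"])

# With the recipe directory on the Python path, the call would then be:
# from audiomnist_prepare import prepare_audiomnist
# prepare_audiomnist(**kwargs)
```

Keeping the paths in a single dictionary makes it harder for the manifest locations to drift apart from `save_folder`.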