
Implementation:Speechbrain Speechbrain Prepare UrbanSound8k

From Leeroopedia


Knowledge Sources
Domains Sound_Classification, Data_Preparation
Last Updated 2026-02-09 00:00 GMT

Overview

A concrete tool, provided by the SpeechBrain library, for preparing the UrbanSound8K dataset for sound classification tasks.

Description

This script creates JSON data manifest files from the UrbanSound8K dataset for use in SpeechBrain sound classification recipes. It respects the dataset's predefined 10-fold cross-validation structure, allowing users to specify which folds to use for training, validation, and testing. The script processes the UrbanSound8K metadata CSV and audio files to generate standardized JSON manifests with audio paths, class labels, and fold information.
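As a rough illustration of the manifest-building step described above, the sketch below filters metadata rows by fold and builds a dictionary that can be dumped as JSON. It is not the SpeechBrain implementation: the helper name `build_manifest`, the manifest keys (`wav`, `class_string`, `class_id`, `fold`), and the sample rows are assumptions made for illustration. Only the CSV column names (`slice_file_name`, `fold`, `classID`, `class`) and the `fold<N>` audio subfolder layout follow the UrbanSound8K distribution.

```python
import csv
import io
import json
import os

# Illustrative rows in the UrbanSound8K metadata column layout (sample values only).
SAMPLE_METADATA = """slice_file_name,fsID,start,end,salience,fold,classID,class
100032-3-0-0.wav,100032,0.0,0.317551,1,5,3,dog_bark
100263-2-0-117.wav,100263,58.5,62.5,1,5,2,children_playing
100648-1-0-0.wav,100648,4.8,5.4,2,10,1,car_horn
"""


def build_manifest(metadata_file, audio_data_folder, fold_nums):
    """Hypothetical helper: keep only rows whose fold is in fold_nums."""
    manifest = {}
    for row in csv.DictReader(metadata_file):
        fold = int(row["fold"])
        if fold not in fold_nums:
            continue
        # Audio clips live in per-fold subfolders: fold1 ... fold10.
        wav_path = os.path.join(audio_data_folder, f"fold{fold}", row["slice_file_name"])
        # Assumption: use the file name without its extension as the entry ID.
        entry_id = os.path.splitext(row["slice_file_name"])[0]
        manifest[entry_id] = {
            "wav": wav_path,
            "class_string": row["class"],
            "class_id": int(row["classID"]),
            "fold": fold,
        }
    return manifest


train_manifest = build_manifest(io.StringIO(SAMPLE_METADATA), "audio", [5])
print(json.dumps(train_manifest, indent=2))
```

Only the two fold-5 rows end up in `train_manifest`; the fold-10 row would go to whichever split lists fold 10.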

Usage

Use this script when preparing data for environmental sound classification experiments on UrbanSound8K. Follow the dataset authors' guidelines: always use the predefined 10-fold cross-validation splits and never reshuffle the data.
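One way to honor the predefined folds is to rotate which fold is held out for testing and reuse the folds as they are, without pooling and reshuffling clips. The sketch below enumerates ten such splits. `fold_rotations` is a hypothetical helper, and validating on the fold just before the test fold is an assumption made for this example, not a rule from the dataset authors.

```python
def fold_rotations(n_folds=10):
    """Yield (train, valid, test) fold-number lists, one per held-out test fold."""
    all_folds = list(range(1, n_folds + 1))
    for test_fold in all_folds:
        # Assumption: validate on the fold just before the test fold (wrapping around).
        valid_fold = n_folds if test_fold == 1 else test_fold - 1
        train = [f for f in all_folds if f not in (test_fold, valid_fold)]
        yield train, [valid_fold], [test_fold]


splits = list(fold_rotations())
for train, valid, test in splits:
    print(f"train={train} valid={valid} test={test}")
```

Each triple can be passed as `train_fold_nums`, `valid_fold_nums`, and `test_fold_nums`; the last triple matches the defaults in the signature.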

Code Reference

Source Location

Signature

def prepare_urban_sound_8k(
    data_folder,
    audio_data_folder,
    save_json_train,
    save_json_valid,
    save_json_test,
    train_fold_nums=[1, 2, 3, 4, 5, 6, 7, 8],
    valid_fold_nums=[9],
    test_fold_nums=[10],
    skip_manifest_creation=False,
):

Import

from urbansound8k_prepare import prepare_urban_sound_8k

I/O Contract

Inputs

Name Type Required Description
data_folder str Yes Path to the folder where UrbanSound8K dataset metadata is stored
audio_data_folder str Yes Path to the folder where UrbanSound8K audio files are stored
save_json_train str Yes Path where the train data specification JSON file will be saved
save_json_valid str Yes Path where the validation data specification JSON file will be saved
save_json_test str Yes Path where the test data specification JSON file will be saved
train_fold_nums list No List of integers in the range 1-10 defining the folds used for training (default: [1, 2, 3, 4, 5, 6, 7, 8])
valid_fold_nums list No List of integers in the range 1-10 defining the folds used for validation (default: [9])
test_fold_nums list No List of integers in the range 1-10 defining the folds used for testing (default: [10])
skip_manifest_creation bool No If True, skips manifest creation (default: False)
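
Because the three fold lists are supplied independently, it can be worth checking them before calling the script. The sketch below uses a hypothetical `check_fold_lists` helper (not part of the documented signature) to confirm that every fold number lies in 1-10 and that no fold is assigned more than once. Whether the script itself performs a similar check is not stated on this page.

```python
def check_fold_lists(train_fold_nums, valid_fold_nums, test_fold_nums):
    """Hypothetical pre-flight check; raises ValueError on bad input."""
    named = {"train": train_fold_nums, "valid": valid_fold_nums, "test": test_fold_nums}
    seen = {}  # fold number -> name of the split that already claimed it
    for split, folds in named.items():
        for fold in folds:
            if not 1 <= fold <= 10:
                raise ValueError(f"{split}: fold {fold} is outside 1-10")
            if fold in seen:
                raise ValueError(f"fold {fold} is listed in {seen[fold]} and again in {split}")
            seen[fold] = split
    return True


check_fold_lists([1, 2, 3, 4, 5, 6, 7, 8], [9], [10])
```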

Outputs

Name Type Description
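train.json JSON file Training manifest with audio paths and sound class labels, written to the path given by save_json_train
valid.json JSON file Validation manifest, written to the path given by save_json_valid
test.json JSON file Test manifest, written to the path given by save_json_test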

Usage Examples

from urbansound8k_prepare import prepare_urban_sound_8k

# One split over the predefined folds: folds 1-8 for training, fold 9 for validation, fold 10 for testing
prepare_urban_sound_8k(
    data_folder="/path/to/UrbanSound8k",
    audio_data_folder="/path/to/UrbanSound8k/audio",
    save_json_train="output/train.json",
    save_json_valid="output/valid.json",
    save_json_test="output/test.json",
    train_fold_nums=[1, 2, 3, 4, 5, 6, 7, 8],
    valid_fold_nums=[9],
    test_fold_nums=[10],
)

Related Pages
