Implementation: SpeechBrain Train UrbanSound8k
| Knowledge Sources | |
|---|---|
| Domains | Sound_Classification, Training |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
A concrete training recipe from the SpeechBrain library for learning sound-class embeddings on the UrbanSound8k dataset.
Description
This recipe defines the UrbanSound8kBrain class (a subclass of sb.core.Brain) for training sound-class embedding models (x-vector or ECAPA-TDNN) on the UrbanSound8k dataset. The pipeline extracts spectral features, optionally applies amplitude-to-dB conversion and mean-variance normalization, and feeds the result through an embedding model and a classifier. It supports waveform augmentation with label replication for the augmented samples. Evaluation includes per-fold accuracy tracking and confusion-matrix generation.
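The forward and objective computation described above can be sketched in plain PyTorch. This is a minimal illustration only: the layer sizes, model structure, and function names below are stand-ins, not the recipe's actual hparams or code.

```python
import torch
import torch.nn.functional as F

# Illustrative stand-ins for the recipe's embedding model and classifier;
# the real recipe builds these from the hparams YAML.
n_mels, emb_dim, n_classes = 40, 64, 10  # UrbanSound8k has 10 classes

embedding_model = torch.nn.Sequential(
    torch.nn.Linear(n_mels, emb_dim),
    torch.nn.ReLU(),
)
classifier = torch.nn.Linear(emb_dim, n_classes)

def compute_forward_sketch(feats):
    """feats: (batch, time, n_mels) spectral features."""
    # Mean-variance normalization over time (optional in the recipe).
    feats = (feats - feats.mean(dim=1, keepdim=True)) / (
        feats.std(dim=1, keepdim=True) + 1e-8
    )
    emb = embedding_model(feats).mean(dim=1)  # pool over time -> (batch, emb_dim)
    return F.log_softmax(classifier(emb), dim=-1)

def compute_objectives_sketch(log_probs, targets):
    # NLL classification loss over the log-probabilities.
    return F.nll_loss(log_probs, targets)

feats = torch.randn(8, 100, n_mels)
targets = torch.randint(0, n_classes, (8,))
loss = compute_objectives_sketch(compute_forward_sketch(feats), targets)
```

In the real recipe, feature extraction, normalization, and the models are all configured through the hparams file rather than hard-coded.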
Usage
Use this recipe to train a sound classifier on the UrbanSound8k dataset with either the x-vector or the ECAPA-TDNN architecture. It requires the UrbanSound8k data folder and is configured with hparams/train_ecapa_tdnn.yaml or hparams/train.yaml.
Code Reference
Source Location
- Repository: SpeechBrain
- File: recipes/UrbanSound8k/SoundClassification/train.py
Signature
    class UrbanSound8kBrain(sb.core.Brain):
        def compute_forward(self, batch, stage):
            ...
        def compute_objectives(self, predictions, batch, stage):
            ...
Import
This recipe is run as a script rather than imported:
    python recipes/UrbanSound8k/SoundClassification/train.py hparams/train_ecapa_tdnn.yaml --data_folder /path/to/UrbanSound8K
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| batch | PaddedBatch | Yes | Batch containing sig (waveforms) and class_string_encoded (class labels) |
| stage | sb.Stage | Yes | TRAIN, VALID, or TEST |
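The batch's sig entry pairs zero-padded waveforms with relative lengths in (0, 1]. A plain-PyTorch sketch of that convention (illustrative only; SpeechBrain's PaddedBatch handles this internally):

```python
import torch

def pad_batch(waveforms):
    """Mimic the (padded tensor, relative lengths) pair that a
    PaddedBatch exposes for variable-length waveforms."""
    max_len = max(w.shape[0] for w in waveforms)
    padded = torch.zeros(len(waveforms), max_len)
    lens = torch.empty(len(waveforms))
    for i, w in enumerate(waveforms):
        padded[i, : w.shape[0]] = w
        lens[i] = w.shape[0] / max_len
    return padded, lens

sig, lens = pad_batch([torch.ones(4), torch.ones(2)])
# sig is (2, 4); lens is tensor([1.0, 0.5])
```

The relative lengths let downstream modules mask out the padded region when pooling or normalizing.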
Outputs
| Name | Type | Description |
|---|---|---|
| predictions | tuple | Classifier output logits and relative lengths (lens) |
| loss | torch.Tensor | NLL classification loss |
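The label replication mentioned in the description can be sketched as follows (a minimal illustration, not the recipe's actual code): when augmentation grows the batch by concatenating augmented copies of the waveforms, the label tensor is repeated to match.

```python
import torch

def replicate_labels(labels, n_augment):
    """Repeat the label tensor so it matches a batch that was grown by
    concatenating n_augment augmented copies onto the original samples."""
    return torch.cat([labels] * (n_augment + 1), dim=0)

labels = torch.tensor([0, 3, 7])
expanded = replicate_labels(labels, n_augment=2)
# expanded has 9 entries: the original 3 labels repeated 3 times
```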
Usage Examples
    python train.py hparams/train_ecapa_tdnn.yaml --data_folder /path/to/UrbanSound8K
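The per-fold accuracy and confusion-matrix tracking mentioned in the description can be sketched with plain NumPy (illustrative only; the recipe keeps its own bookkeeping across the dataset's ten folds):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Accumulate a confusion matrix: rows = true class, cols = predicted."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

y_true = [0, 1, 2, 2, 1]
y_pred = [0, 2, 2, 2, 1]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy = np.trace(cm) / cm.sum()  # 4 of 5 correct -> 0.8
```

Summing one such matrix per held-out fold gives the overall cross-validated performance picture.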