Implementation:Speechbrain Speechbrain Train GSC
| Knowledge Sources | |
|---|---|
| Domains | ASR, Training |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
A concrete tool, provided by the SpeechBrain library, for training a spoken command classifier on the Google Speech Commands v0.02 dataset.
Description
This recipe defines the SpeakerBrain class (a subclass of sb.core.Brain) for keyword/command classification on the Google Speech Commands dataset. The pipeline computes features (either LEAF or traditional spectral features), normalizes them, extracts embeddings with an embedding model (xvector or ECAPA-TDNN), and passes the embeddings through a classifier that outputs command class probabilities. Waveform augmentation is applied during training.
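The data flow described above can be sketched in plain Python. This is only an illustration of the stage ordering, not the recipe's code: `forward_sketch`, the dictionary keys, and the stub stages are hypothetical names introduced here, and the sketch assumes that waveform augmentation runs before feature extraction.

```python
def forward_sketch(wavs, lens, stages, training):
    """Apply the stages in the order given in the description.

    `stages` maps hypothetical stage names to callables; these names are
    illustrative and are not taken from the recipe itself.
    """
    if training:
        # Waveform augmentation is applied during training.
        wavs = stages["augment"](wavs)
    feats = stages["features"](wavs)          # LEAF or spectral features
    feats = stages["normalize"](feats)        # feature normalization
    embeddings = stages["embedding"](feats)   # xvector or ECAPA-TDNN
    outputs = stages["classifier"](embeddings)
    return outputs, lens


def make_recording_stages(log):
    """Build stub stages that only record the order in which they run."""
    def stub(name):
        def run(x):
            log.append(name)
            return x
        return run
    names = ["augment", "features", "normalize", "embedding", "classifier"]
    return {name: stub(name) for name in names}


train_log, valid_log = [], []
forward_sketch([0.0, 0.1], [1.0], make_recording_stages(train_log), training=True)
forward_sketch([0.0, 0.1], [1.0], make_recording_stages(valid_log), training=False)
print(train_log)
print(valid_log)
```

With the stubs replaced by real modules, the same ordering gives one forward pass; outside training the augmentation stage is skipped.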
Usage
Use this recipe to train a spoken command recognition classifier on the Google Speech Commands v0.02 dataset. It requires the corresponding hyperparameter YAML file (e.g., xvect.yaml).
Code Reference
Source Location
- Repository: SpeechBrain
- File: recipes/Google-speech-commands/train.py
Signature
class SpeakerBrain(sb.core.Brain):
    def compute_forward(self, batch, stage):
        """Computation pipeline based on a encoder + command classifier."""
        ...

    def compute_objectives(self, predictions, batch, stage):
        """Computes the loss using command-id as label."""
        ...
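To make "the loss using command-id as label" concrete, here is a minimal sketch of a negative log-likelihood loss in plain Python. It assumes the classifier emits per-class log-probabilities and that the loss is NLL; the actual loss function is selected in the hyperparameter YAML and is not stated here, and `nll_loss_sketch` is a hypothetical helper, not a SpeechBrain function.

```python
import math


def nll_loss_sketch(log_probs, command_ids):
    """Mean negative log-likelihood of the true command ids.

    log_probs: one list of per-class log-probabilities per utterance.
    command_ids: one integer class index per utterance.
    """
    if not command_ids or len(log_probs) != len(command_ids):
        raise ValueError("need a non-empty batch with one label per utterance")
    # Pick the log-probability assigned to the correct class of each utterance.
    total = -sum(row[label] for row, label in zip(log_probs, command_ids))
    return total / len(command_ids)


# Two utterances, three hypothetical command classes.
batch_log_probs = [
    [math.log(0.5), math.log(0.25), math.log(0.25)],
    [math.log(0.25), math.log(0.5), math.log(0.25)],
]
loss = nll_loss_sketch(batch_log_probs, [0, 1])
print(loss)
```

The loss shrinks toward zero as the probability assigned to the correct command approaches one.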
Import
# Run as a recipe script from the repository root
python recipes/Google-speech-commands/train.py recipes/Google-speech-commands/hparams/xvect.yaml --data_folder /path/to/GSC
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| batch.sig | torch.Tensor | Yes | Padded input waveforms, unpacked by the recipe into waveforms and relative lengths |
| batch.command_encoded | torch.Tensor | Yes | Integer-encoded command label for each utterance |
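As an illustration of what an "encoded command label" is, the sketch below maps command strings to integer ids in plain Python. It is a stand-in only: SpeechBrain's own label encoder is not shown, the sorted-order id assignment is an assumption, and `build_label_index` and `encode` are hypothetical helpers.

```python
def build_label_index(commands):
    """Map each distinct command string to an integer id (sorted order)."""
    return {name: idx for idx, name in enumerate(sorted(set(commands)))}


def encode(commands, label_index):
    """Replace each command string with its integer id."""
    return [label_index[name] for name in commands]


labels = ["yes", "no", "up", "no"]
label_index = build_label_index(labels)
encoded = encode(labels, label_index)
print(label_index)
print(encoded)
```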
Outputs
| Name | Type | Description |
|---|---|---|
| outputs | torch.Tensor | Class posterior probabilities over speech commands |
| lens | torch.Tensor | Relative signal lengths |
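The relative lengths in the table above follow SpeechBrain's convention of expressing each signal's length as a fraction of the longest (padded) signal in the batch. A minimal plain-Python sketch of that conversion, with hypothetical helper names:

```python
def relative_lengths(num_samples):
    """Convert absolute lengths (in samples) to fractions of the longest one."""
    longest = max(num_samples)
    return [n / longest for n in num_samples]


def absolute_lengths(rel_lens, padded_len):
    """Recover sample counts from relative lengths and the padded length."""
    return [round(r * padded_len) for r in rel_lens]


lens = relative_lengths([16000, 8000, 12000])
print(lens)
print(absolute_lengths(lens, 16000))
```

Here the longest signal has relative length 1.0, and shorter signals are marked as padded beyond their fraction of the batch length.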
Usage Examples
python train.py hparams/xvect.yaml --data_folder /path/to/Google-speech-commands