Implementation:Speechbrain Speechbrain Train FluentSpeechCommands
| Knowledge Sources | |
|---|---|
| Domains | SLU, Training |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool, provided by the SpeechBrain library, for training a direct (speech-to-semantics) SLU model on the Fluent Speech Commands dataset.
Description
This recipe defines the SLU class (a subclass of sb.Brain) for direct spoken language understanding on the Fluent Speech Commands dataset using ASR-based transfer learning. Input waveforms are encoded by a pre-trained ASR model that stays frozen during training. The resulting features pass through an SLU encoder and a seq2seq decoder, which predicts a sequence of semantic tokens representing the action, object, and location slots. The recipe also supports waveform augmentation with label replication, beam search decoding, and semantic accuracy evaluation.
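To make the evaluation idea concrete, the following is a minimal sketch of an exact-match semantic accuracy, where an utterance counts as correct only if all three slots match. The function name `slot_accuracy`, the dict-based slot representation, and the example values are assumptions made for illustration; the recipe's own metric code may differ.

```python
def slot_accuracy(predicted, reference):
    """Fraction of utterances whose predicted slots all match the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    # An utterance is correct only if every slot (action, object, location) matches.
    correct = sum(1 for p, r in zip(predicted, reference) if p == r)
    return correct / len(reference)


# Hypothetical decoded outputs: one dict of slots per utterance.
reference = [
    {"action": "activate", "object": "lights", "location": "kitchen"},
    {"action": "increase", "object": "volume", "location": "none"},
    {"action": "deactivate", "object": "music", "location": "none"},
]
predicted = [
    {"action": "activate", "object": "lights", "location": "kitchen"},
    {"action": "increase", "object": "heat", "location": "none"},  # wrong object
    {"action": "deactivate", "object": "music", "location": "none"},
]

accuracy = slot_accuracy(predicted, reference)
print(f"semantic accuracy: {accuracy:.3f}")
```

Because a single wrong slot marks the whole utterance as incorrect, this measure is stricter than a per-slot accuracy.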
Usage
Use this recipe to train a direct SLU model on the Fluent Speech Commands dataset. It requires the Fluent Speech Commands data folder and a pre-trained ASR model. Training is configured through hparams/train.yaml.
Code Reference
Source Location
- Repository: SpeechBrain
- File: recipes/fluent-speech-commands/direct/train.py
Signature
class SLU(sb.Brain):
    def compute_forward(self, batch, stage):
        ...

    def compute_objectives(self, predictions, batch, stage):
        ...
Import
python recipes/fluent-speech-commands/direct/train.py hparams/train.yaml --data_folder /path/to/fluent_speech_commands
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| batch | PaddedBatch | Yes | Batch containing sig (waveforms), tokens_bos, and tokens_eos (semantic tokens) |
| stage | sb.Stage | Yes | TRAIN, VALID, or TEST |
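The relationship between tokens_bos and tokens_eos, and the reason labels need replicating under augmentation, can be sketched in plain Python. This is an illustration under stated assumptions, not the recipe's code: the index values `BOS_INDEX` and `EOS_INDEX`, the helper names, and the token ids are hypothetical, and the sketch assumes that augmentation concatenates several versions of the batch along the batch dimension.

```python
BOS_INDEX = 1  # hypothetical value; the real index comes from the recipe's hparams
EOS_INDEX = 2  # hypothetical value; the real index comes from the recipe's hparams


def add_bos_eos(tokens, bos_index=BOS_INDEX, eos_index=EOS_INDEX):
    """Return (tokens_bos, tokens_eos) for one semantic token sequence."""
    tokens_bos = [bos_index] + list(tokens)  # decoder input, shifted right
    tokens_eos = list(tokens) + [eos_index]  # decoder target, ends with EOS
    return tokens_bos, tokens_eos


def replicate_labels(labels, n_copies):
    """Repeat a batch of label sequences n_copies times, in batch order,
    so it lines up with an augmented batch built by concatenating
    n_copies versions of the waveform batch (assumed layout)."""
    return [list(seq) for _ in range(n_copies) for seq in labels]


tokens = [12, 7, 31]  # hypothetical token ids for one utterance
tokens_bos, tokens_eos = add_bos_eos(tokens)

# Two utterances in the batch, two versions of each waveform after augmentation.
batch_bos = replicate_labels([tokens_bos, [BOS_INDEX, 5]], n_copies=2)
print(tokens_bos, tokens_eos, len(batch_bos))
```

Without the replication step, the decoder inputs and targets would have fewer rows than the augmented waveform batch, and the loss could not be computed row by row.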
Outputs
| Name | Type | Description |
|---|---|---|
| predictions | tuple | Log-softmax sequence probabilities, wav_lens, and decoded semantic tokens |
| loss | torch.Tensor | Sequence-level NLL loss on semantic tokens |
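As a rough illustration of the loss, the sketch below computes a negative log-likelihood over log-softmax outputs while ignoring padded target positions. It is a plain-Python sketch under assumptions: the function names are hypothetical, lengths are given as absolute token counts, and the loss is averaged over valid tokens, which is one common normalization and may not match the recipe's configured loss exactly.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_nll(log_probs, targets, lengths):
    """Mean negative log-likelihood over non-padded target positions.

    log_probs: [batch][time][vocab] log-softmax outputs
    targets:   [batch][time] token ids (padded)
    lengths:   [batch] number of valid target tokens per sequence
    """
    total, count = 0.0, 0
    for seq_lp, seq_tgt, n in zip(log_probs, targets, lengths):
        for t in range(n):  # positions at or beyond n are padding and are skipped
            total -= seq_lp[t][seq_tgt[t]]
            count += 1
    return total / count if count else 0.0


# Toy batch: 2 sequences, vocabulary of 3 tokens, second sequence padded by one step.
log_probs = [
    [log_softmax([2.0, 0.0, 0.0]), log_softmax([0.0, 2.0, 0.0])],
    [log_softmax([0.0, 0.0, 2.0]), log_softmax([0.0, 0.0, 0.0])],
]
targets = [[0, 1], [2, 0]]
lengths = [2, 1]

loss = sequence_nll(log_probs, targets, lengths)
print(f"loss: {loss:.4f}")
```

Masking by length keeps padded positions from contributing to the loss, so short and long utterances in the same batch are treated consistently.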
Usage Examples
python train.py hparams/train.yaml --data_folder /path/to/fluent_speech_commands