Implementation:Speechbrain Speechbrain Train SLURP Direct
| Knowledge Sources | |
|---|---|
| Domains | SLU, Training |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool, provided by the SpeechBrain library, for direct (speech-to-semantics) SLU training on the SLURP dataset with ASR-based transfer learning.
Description
This recipe defines the SLU class (a subclass of sb.Brain) for direct spoken language understanding on the SLURP dataset. Input waveforms are encoded into features by an ASR model pre-trained on LibriSpeech; these features then pass through an SLU encoder and a seq2seq decoder that maps them directly to semantic outputs. The ASR encoder is frozen during training. The recipe supports waveform augmentation (with labels replicated for the augmented samples), beam search decoding, and SLURP-specific evaluation metrics (scenario, action, and intent accuracy).
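The label replication mentioned above can be illustrated with a minimal, framework-free sketch. It assumes that augmented copies are concatenated after the original batch, so the label rows have to be tiled in the same order; `replicate_labels` here is a hypothetical helper written for illustration, not the recipe's own code.

```python
def replicate_labels(labels, n_copies):
    """Repeat a batch of label sequences n_copies times, batch-wise.

    Hypothetical helper: assumes the augmented waveforms are appended
    after the original batch, so labels are tiled in that same order.
    """
    if n_copies < 1:
        raise ValueError("n_copies must be at least 1")
    return [list(seq) for _ in range(n_copies) for seq in labels]


# Two utterances, each with its (illustrative) semantic token ids.
tokens = [[5, 9, 2], [7, 4]]

# Original batch plus one augmented copy gives 4 waveforms,
# so the labels must also have 4 rows.
replicated = replicate_labels(tokens, n_copies=2)
```

Without this step the decoder targets would have fewer rows than the augmented waveform batch, and the loss could not be computed row by row.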
Usage
Use this recipe to train a direct SLU model on the SLURP dataset using ASR-based transfer learning. It requires the SLURP data folder and a pre-trained ASR model checkpoint, and it is configured through hparams/train.yaml.
Code Reference
Source Location
- Repository: SpeechBrain
- File: recipes/SLURP/direct/train.py
Signature
class SLU(sb.Brain):
    def compute_forward(self, batch, stage):
        ...
    def compute_objectives(self, predictions, batch, stage):
        ...
Import
python recipes/SLURP/direct/train.py recipes/SLURP/direct/hparams/train.yaml --data_folder /path/to/SLURP
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| batch | PaddedBatch | Yes | Batch containing sig (waveforms), tokens_bos, and tokens_eos (semantic tokens) |
| stage | sb.Stage | Yes | TRAIN, VALID, or TEST |
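As a rough illustration of how tokens_bos and tokens_eos relate to each other, the sketch below builds both from one token sequence: the BOS-prefixed version is what a seq2seq decoder typically consumes, and the EOS-suffixed version is the target it is trained to predict. The function name and the index values are hypothetical and chosen only for this example.

```python
def add_bos_eos(tokens, bos_index, eos_index):
    """Build decoder input and target sequences from one token list.

    Hypothetical helper: tokens_bos is the decoder input (shifted right),
    tokens_eos is the prediction target (terminated by EOS).
    """
    tokens = list(tokens)
    tokens_bos = [bos_index] + tokens
    tokens_eos = tokens + [eos_index]
    return tokens_bos, tokens_eos


# Illustrative token ids; the BOS/EOS indices are arbitrary assumptions.
tokens_bos, tokens_eos = add_bos_eos([12, 7, 31], bos_index=1, eos_index=2)
```

Both sequences have the same length, so position t of tokens_bos lines up with the token the decoder should emit at position t of tokens_eos.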
Outputs
| Name | Type | Description |
|---|---|---|
| predictions | tuple | Log-softmax sequence probabilities, wav_lens, and decoded semantic tokens |
| loss | torch.Tensor | Sequence-level NLL loss on semantic tokens |
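To make the loss row concrete, here is a minimal pure-Python sketch of a sequence NLL computed from log-probabilities while skipping padded positions. Averaging over the valid tokens is an assumption made for this example; the recipe's configured loss may normalize differently. `sequence_nll` is a hypothetical name.

```python
import math


def sequence_nll(log_probs, targets, lengths):
    """Mean negative log-likelihood over non-padded target tokens.

    log_probs[b][t][v]: log-probability of token v at step t of sequence b.
    targets[b][t]:      target token id (padded beyond lengths[b]).
    lengths[b]:         number of valid steps in sequence b.
    Assumption: the loss is averaged over valid tokens only.
    """
    total, count = 0.0, 0
    for seq_lp, seq_tgt, n_valid in zip(log_probs, targets, lengths):
        for t in range(n_valid):
            total -= seq_lp[t][seq_tgt[t]]
            count += 1
    if count == 0:
        raise ValueError("no valid target tokens")
    return total / count


# Toy batch: 2 sequences, 2 steps each, vocabulary of 3 tokens.
uniform = [math.log(1 / 3)] * 3  # placeholder distribution for a padded step
log_probs = [
    [[math.log(0.5), math.log(0.25), math.log(0.25)],
     [math.log(0.1), math.log(0.8), math.log(0.1)]],
    [[math.log(0.2), math.log(0.2), math.log(0.6)],
     uniform],
]
targets = [[0, 1], [2, 0]]  # second sequence is padded after step 0
lengths = [2, 1]

loss = sequence_nll(log_probs, targets, lengths)
```

The padded step of the second sequence does not contribute, so the result depends only on the three valid target tokens.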
Usage Examples
python train.py hparams/train.yaml --data_folder /path/to/SLURP
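The scenario, action, and intent accuracies named in the description could be computed from decoded semantics roughly as follows. This sketch assumes each prediction and target has already been parsed into a dict with "scenario" and "action" keys, and that an intent counts as correct only when both match; `slurp_accuracies` is a hypothetical helper, not the recipe's evaluation code.

```python
def slurp_accuracies(predictions, targets):
    """Scenario, action, and intent accuracy over parsed semantics.

    Assumptions: each item is a dict with 'scenario' and 'action' keys,
    and an intent is correct only if scenario and action both match.
    """
    if not targets or len(predictions) != len(targets):
        raise ValueError("need equally sized, non-empty lists")
    scenario_hits = action_hits = intent_hits = 0
    for pred, tgt in zip(predictions, targets):
        scenario_ok = pred.get("scenario") == tgt["scenario"]
        action_ok = pred.get("action") == tgt["action"]
        scenario_hits += scenario_ok
        action_hits += action_ok
        intent_hits += scenario_ok and action_ok
    n = len(targets)
    return {
        "scenario": scenario_hits / n,
        "action": action_hits / n,
        "intent": intent_hits / n,
    }


# Made-up examples for illustration only.
targets = [
    {"scenario": "alarm", "action": "set"},
    {"scenario": "music", "action": "play"},
    {"scenario": "weather", "action": "query"},
    {"scenario": "email", "action": "send"},
]
predictions = [
    {"scenario": "alarm", "action": "set"},
    {"scenario": "music", "action": "query"},
    {"scenario": "weather", "action": "query"},
    {"scenario": "calendar", "action": "send"},
]

accuracies = slurp_accuracies(predictions, targets)
```

A prediction that fails to parse could be passed as an empty dict, in which case it simply counts as wrong for all three metrics.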