Implementation:Speechbrain Speechbrain Train SLURP NLU
| Knowledge Sources | |
|---|---|
| Domains | SLU, Training |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool, provided by the SpeechBrain library, for training a text-only Natural Language Understanding (NLU) model on the SLURP dataset.
Description
This recipe defines the SLU class (a subclass of sb.Brain) for text-only NLU on the SLURP dataset. It takes ground-truth (gold) transcriptions as input, in place of ASR output, and predicts their semantics with a seq2seq architecture. The pipeline embeds the transcript tokens, encodes them with an SLU encoder, and decodes with an attention-based decoder to produce the semantic output sequence. Inference uses beam search. Evaluation uses SLURP-specific metrics, including scenario accuracy, action accuracy, and intent accuracy, and results are written as jsonlines output.
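The evaluation metrics named above can be illustrated with a minimal sketch. It assumes each prediction and reference is already a dict with `scenario` and `action` keys, and that an intent counts as correct only when both match. The function names and the example labels are hypothetical and are not taken from the recipe or the SLURP evaluation toolkit.

```python
import io
import json


def slurp_accuracies(predictions, references):
    """Return scenario, action and intent accuracy over paired dicts."""
    n = len(references)
    scenario_hits = action_hits = intent_hits = 0
    for pred, ref in zip(predictions, references):
        scenario_ok = pred.get("scenario") == ref["scenario"]
        action_ok = pred.get("action") == ref["action"]
        scenario_hits += scenario_ok
        action_hits += action_ok
        # Assumption: an intent is correct only if scenario and action both match.
        intent_hits += scenario_ok and action_ok
    return {
        "scenario": scenario_hits / n,
        "action": action_hits / n,
        "intent": intent_hits / n,
    }


def to_jsonlines(records):
    """Serialize records in jsonlines form: one JSON object per line."""
    buffer = io.StringIO()
    for record in records:
        buffer.write(json.dumps(record) + "\n")
    return buffer.getvalue()


# Illustrative labels only.
references = [
    {"scenario": "alarm", "action": "set"},
    {"scenario": "music", "action": "play"},
    {"scenario": "weather", "action": "query"},
    {"scenario": "calendar", "action": "remove"},
]
predictions = [
    {"scenario": "alarm", "action": "set"},        # fully correct
    {"scenario": "music", "action": "query"},      # wrong action
    {"scenario": "news", "action": "query"},       # wrong scenario
    {"scenario": "calendar", "action": "remove"},  # fully correct
]

scores = slurp_accuracies(predictions, references)
lines = to_jsonlines(predictions)
```

Writing one prediction per line keeps the output easy to stream and to compare against a reference file record by record.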
Usage
Use this recipe to train a text-based NLU model on the SLURP dataset from ground-truth transcriptions. It requires the SLURP data folder with pre-processed transcripts and is configured through hparams/train.yaml.
Code Reference
Source Location
- Repository: SpeechBrain
- File: recipes/SLURP/NLU/train.py
Signature
class SLU(sb.Brain):
    def compute_forward(self, batch, stage):
        ...
    def compute_objectives(self, predictions, batch, stage):
        ...
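To make the roles of the two methods concrete, the following is a rough sketch in plain PyTorch of a forward pass (embed, encode, attention-based decode, log-softmax) and an NLL objective. It is a toy stand-in under stated assumptions, not the recipe's code: `ToySeq2Seq`, `nll_objective`, the GRU layers, the dot-product attention, and padding index 0 are all hypothetical choices. Beam search and length masking are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToySeq2Seq(nn.Module):
    """Hypothetical stand-in for the recipe's modules, not the SpeechBrain code."""

    def __init__(self, in_vocab, out_vocab, dim=32):
        super().__init__()
        self.input_emb = nn.Embedding(in_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.output_emb = nn.Embedding(out_vocab, dim)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(2 * dim, out_vocab)

    def forward(self, transcript_tokens, semantics_tokens_bos):
        # Embed and encode the transcript.
        enc_out, enc_last = self.encoder(self.input_emb(transcript_tokens))
        # Teacher forcing: the decoder reads the BOS-prefixed target sequence.
        dec_out, _ = self.decoder(self.output_emb(semantics_tokens_bos), enc_last)
        # Dot-product attention of each decoder state over all encoder states.
        attn = torch.softmax(dec_out @ enc_out.transpose(1, 2), dim=-1)
        context = attn @ enc_out
        logits = self.out(torch.cat([dec_out, context], dim=-1))
        return F.log_softmax(logits, dim=-1)


def nll_objective(log_probs, semantics_tokens_eos, pad_index=0):
    # Flatten (batch, time, vocab) and skip padded target positions.
    return F.nll_loss(
        log_probs.reshape(-1, log_probs.size(-1)),
        semantics_tokens_eos.reshape(-1),
        ignore_index=pad_index,
    )


torch.manual_seed(0)
model = ToySeq2Seq(in_vocab=50, out_vocab=20)
transcript_tokens = torch.randint(1, 50, (2, 7))     # (batch, transcript length)
semantics_tokens_bos = torch.randint(1, 20, (2, 5))  # decoder inputs
semantics_tokens_eos = torch.randint(1, 20, (2, 5))  # decoder targets
log_probs = model(transcript_tokens, semantics_tokens_bos)
loss = nll_objective(log_probs, semantics_tokens_eos)
```

The BOS-prefixed sequence feeds the decoder while the EOS-suffixed sequence serves as the target, which is why the batch carries both variants of the semantic tokens.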
Import
The recipe is run as a script rather than imported. From the repository root:
python recipes/SLURP/NLU/train.py recipes/SLURP/NLU/hparams/train.yaml --data_folder /path/to/SLURP
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| batch | PaddedBatch | Yes | Batch containing transcript_tokens, semantics_tokens_bos, and semantics_tokens_eos |
| stage | sb.Stage | Yes | TRAIN, VALID, or TEST |
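Each padded field of a SpeechBrain PaddedBatch unpacks into a pair of padded data and relative lengths, where a length is the fraction of the longest sequence in the batch. The pure-Python sketch below mimics that convention for lists of token ids. The helper name `pad_with_relative_lengths` and padding index 0 are assumptions made for illustration.

```python
def pad_with_relative_lengths(sequences, pad_index=0):
    """Right-pad token lists and return lengths relative to the longest one."""
    max_len = max(len(seq) for seq in sequences)
    padded = [list(seq) + [pad_index] * (max_len - len(seq)) for seq in sequences]
    rel_lens = [len(seq) / max_len for seq in sequences]
    return padded, rel_lens


# Two transcripts of different lengths, given as hypothetical token ids.
transcript_tokens, transcript_lens = pad_with_relative_lengths([[5, 9, 2, 7], [4, 8]])
# transcript_tokens == [[5, 9, 2, 7], [4, 8, 0, 0]]
# transcript_lens == [1.0, 0.5]
```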
Outputs
| Name | Type | Description |
|---|---|---|
| predictions | tuple | Log-probabilities over semantic tokens (log-softmax output), transcript lengths, and decoded semantic tokens |
| loss | torch.Tensor | Sequence-level NLL loss on semantic tokens |
Usage Examples
From the recipes/SLURP/NLU directory:
python train.py hparams/train.yaml --data_folder /path/to/SLURP