Implementation:Speechbrain Speechbrain Train Switchboard Seq2Seq
| Knowledge Sources | |
|---|---|
| Domains | ASR, Training |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool, provided by the SpeechBrain library, for training a sequence-to-sequence ASR model on the Switchboard dataset.
Description
This recipe defines the ASR class (a subclass of sb.Brain) for attention-based sequence-to-sequence speech recognition on Switchboard conversational telephone speech. The encoder is a CRDNN model and the decoder is a GRU with attention; hypotheses are produced with beam search at validation and test time. The model is trained jointly with CTC and NLL losses, and the CTC term is applied only during the early training epochs. The class accepts a custom normalize_fn for text normalization of Switchboard transcripts. BPE subword units are used as recognition tokens.
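The joint objective described above can be sketched as a weighted sum that falls back to the attention (NLL) loss once the CTC epochs are over. This is a minimal illustration and not the recipe's code: the helper name joint_loss and the parameters ctc_weight and number_of_ctc_epochs are assumptions chosen for this example, and the actual values would come from the hyperparameter YAML file.

```python
def joint_loss(loss_seq, loss_ctc, epoch, ctc_weight=0.5, number_of_ctc_epochs=5):
    """Combine attention (NLL) and CTC losses (hypothetical helper)."""
    if epoch <= number_of_ctc_epochs:
        # Early epochs: interpolate between the CTC and the attention losses.
        return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_seq
    # Later epochs: the CTC term is dropped and only the NLL loss remains.
    return loss_seq


if __name__ == "__main__":
    # Epoch 1 is inside the assumed CTC window, epoch 6 is outside it.
    print(joint_loss(loss_seq=2.0, loss_ctc=4.0, epoch=1, ctc_weight=0.25))
    print(joint_loss(loss_seq=2.0, loss_ctc=4.0, epoch=6, ctc_weight=0.25))
```

Gating the CTC term by epoch means the CTC branch only needs to be evaluated while it still contributes to the loss.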
Usage
Use this recipe to train a seq2seq ASR model with CTC+attention on the Switchboard dataset (~300 hours). It requires the corresponding hyperparameter YAML file, a pre-trained tokenizer, and the data preparation script.
Code Reference
Source Location
- Repository: SpeechBrain
- File: recipes/Switchboard/ASR/seq2seq/train.py
Signature
class ASR(sb.Brain):
    def __init__(self, modules=None, opt_class=None, hparams=None,
                 run_opts=None, checkpointer=None, normalize_fn=None):
        ...

    def compute_forward(self, batch, stage):
        """Forward computations from the waveform batches to the output probabilities."""
        ...

    def compute_objectives(self, predictions, batch, stage):
        """Computes the loss (CTC+NLL) given predictions and targets."""
        ...
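The normalize_fn argument in the signature lets the caller plug in transcript normalization. The function below is a hypothetical example of what such a callable might do (uppercasing, dropping bracketed annotations, tidying whitespace). Its name, its single-string signature, and its rules are assumptions for illustration; this page does not show the call signature the recipe actually expects.

```python
import re


def normalize_transcript(text):
    """Hypothetical normalize_fn: drop bracketed tags, keep words, uppercase."""
    # Remove bracketed annotations such as [laughter] or [noise].
    text = re.sub(r"\[[^\]]*\]", " ", text)
    # Replace everything except letters, apostrophes, and spaces with a space.
    text = re.sub(r"[^A-Za-z' ]", " ", text)
    # Uppercase and collapse runs of whitespace into single spaces.
    return " ".join(text.upper().split())


if __name__ == "__main__":
    print(normalize_transcript("uh-huh, [laughter] i don't know"))
```

Applying the same normalization to both references and hypotheses keeps scoring from being dominated by formatting differences rather than recognition errors.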
Import
# Run as recipe script
python recipes/Switchboard/ASR/seq2seq/train.py hparams/train_BPE_2000.yaml --data_folder /path/to/switchboard
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| batch.sig | torch.Tensor | Yes | Input waveform signal |
| batch.tokens_bos | torch.Tensor | Yes | Target token sequence with BOS prefix |
| batch.tokens_eos | torch.Tensor | Yes | Target token sequence with EOS suffix |
| batch.tokens | torch.Tensor | Yes | Target token sequence (for CTC) |
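The three target fields in the table are related in a simple way: the decoder input is the token sequence prefixed with BOS, the decoder target is the same sequence suffixed with EOS, and the CTC target is the bare sequence. The sketch below uses plain Python lists instead of tensors, and the helper name and the index values are assumptions for illustration only.

```python
def make_targets(tokens, bos_index, eos_index):
    """Build the three target variants from one token-id sequence (hypothetical helper)."""
    tokens = list(tokens)
    return {
        "tokens": tokens,                      # CTC target: no special symbols
        "tokens_bos": [bos_index] + tokens,    # decoder input: BOS prefix
        "tokens_eos": tokens + [eos_index],    # decoder target: EOS suffix
    }


if __name__ == "__main__":
    # Token ids and BOS/EOS indices are made-up values.
    print(make_targets([5, 6, 7], bos_index=1, eos_index=2))
```

With this layout, tokens_bos and tokens_eos have equal length and are offset by one position, which is what teacher-forced decoder training needs.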
Outputs
| Name | Type | Description |
|---|---|---|
| p_ctc | torch.Tensor | CTC log-probabilities (returned only during the early training epochs in which the CTC loss is active) |
| p_seq | torch.Tensor | Seq2seq log-probabilities from attention decoder |
| wav_lens | torch.Tensor | Relative waveform lengths |
| p_tokens | list | Beam search hypotheses (at validation/test) |
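The relative lengths in wav_lens express each utterance's duration as a fraction of the longest utterance in the padded batch, so the longest item has length 1.0. A minimal sketch in plain Python follows; the helper name and sample counts are made up for illustration, and the recipe itself works on tensors.

```python
def relative_lengths(num_samples):
    """Convert absolute lengths (in samples) to fractions of the batch maximum."""
    longest = max(num_samples)
    return [n / longest for n in num_samples]


if __name__ == "__main__":
    # Three utterances padded to the length of the first one.
    print(relative_lengths([16000, 8000, 4000]))
```

Relative lengths stay valid after the encoder changes the time resolution, since multiplying by the new maximum length recovers the number of valid frames.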
Usage Examples
python train.py hparams/train_BPE_2000.yaml --data_folder /path/to/Switchboard