
Implementation:Speechbrain Speechbrain Train GigaSpeech Transducer

From Leeroopedia


Knowledge Sources
Domains: ASR, Training
Last Updated: 2026-02-09 00:00 GMT

Overview

Concrete tool, provided by the SpeechBrain library, for training a Transducer ASR model on the GigaSpeech dataset.

Description

This recipe defines the ASR class (a subclass of sb.Brain) for Conformer-Transducer speech recognition on GigaSpeech. The architecture combines a Conformer encoder, an LSTM-based transducer decoder, and a joint network. Dynamic Chunk Training, which adds streaming support, can optionally be enabled. The model is trained with the transducer loss, optionally combined with auxiliary CTC and cross-entropy losses. Beam search decoding, optionally coupled with an RNN language model, is supported. Subword units come from BPE tokenization via SentencePiece.
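To illustrate the idea behind Dynamic Chunk Training, the sketch below builds a chunked attention mask in plain Python: each frame may attend to every frame in its own chunk and in earlier chunks, optionally limited to a fixed number of left-context chunks. The function name `chunk_attention_mask` and its parameters are hypothetical and chosen for illustration; this is not the SpeechBrain implementation, which operates on tensors and typically samples the chunk size during training.

```python
from typing import List, Optional


def chunk_attention_mask(
    num_frames: int,
    chunk_size: int,
    left_context_chunks: Optional[int] = None,
) -> List[List[bool]]:
    """Return mask[i][j] == True when frame i may attend to frame j.

    Hypothetical helper for illustration only (not a SpeechBrain API).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    mask = []
    for i in range(num_frames):
        chunk_i = i // chunk_size  # chunk index of the query frame
        row = []
        for j in range(num_frames):
            chunk_j = j // chunk_size  # chunk index of the key frame
            # No look-ahead beyond the end of the current chunk.
            visible = chunk_j <= chunk_i
            # Optionally restrict how far back the frame can look.
            if left_context_chunks is not None:
                visible = visible and chunk_j >= chunk_i - left_context_chunks
            row.append(visible)
        mask.append(row)
    return mask


if __name__ == "__main__":
    # 4 frames, chunks of 2 frames, unlimited left context.
    for row in chunk_attention_mask(4, 2):
        print(["x" if v else "." for v in row])
```

With a small chunk size the encoder sees little future context, which is what makes chunk-by-chunk streaming inference possible; with a chunk size covering the whole utterance the mask is all `True` and the model behaves like a full-context encoder.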

Usage

Use this recipe to train a Conformer-Transducer ASR model on the GigaSpeech dataset (the XS through XL splits are supported). It requires the corresponding hyperparameter YAML file and the data preparation script.

Code Reference

Source Location

Signature

import speechbrain as sb


class ASR(sb.Brain):
    def compute_forward(self, batch, stage):
        """Forward computations from the waveform batches to the output probabilities."""
        ...

    def compute_objectives(self, predictions, batch, stage):
        """Computes the loss given predictions and targets."""
        ...
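The body of compute_objectives is elided above. As a rough sketch of how a transducer loss might be combined with the optional CTC and cross-entropy losses, the plain-Python function below forms a weighted sum in which the transducer term receives whatever weight the auxiliary terms leave over. The function name `combine_losses` and the `ctc_weight` and `ce_weight` parameters are assumptions made for illustration; the actual weighting scheme and hyperparameter names should be checked in the recipe and its YAML file.

```python
from typing import Optional


def combine_losses(
    loss_transducer: float,
    loss_ctc: Optional[float] = None,
    loss_ce: Optional[float] = None,
    ctc_weight: float = 0.0,
    ce_weight: float = 0.0,
) -> float:
    """Weighted multi-task loss (illustrative, hypothetical helper).

    An auxiliary loss that is not provided contributes nothing,
    and its weight is given back to the transducer term.
    """
    if loss_ctc is None:
        loss_ctc, ctc_weight = 0.0, 0.0
    if loss_ce is None:
        loss_ce, ce_weight = 0.0, 0.0

    aux_weight = ctc_weight + ce_weight
    if ctc_weight < 0 or ce_weight < 0 or aux_weight > 1:
        raise ValueError("auxiliary weights must be >= 0 and sum to at most 1")

    transducer_weight = 1.0 - aux_weight
    return (
        ctc_weight * loss_ctc
        + ce_weight * loss_ce
        + transducer_weight * loss_transducer
    )


if __name__ == "__main__":
    # Transducer loss only.
    print(combine_losses(2.0))
    # Transducer loss plus an auxiliary CTC loss weighted at 0.25.
    print(combine_losses(2.0, loss_ctc=4.0, ctc_weight=0.25))
```

In the recipe itself the losses are tensors rather than floats, but the same arithmetic applies elementwise to scalar loss tensors.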

Import

# Run as a recipe script from the SpeechBrain repository root
python recipes/GigaSpeech/ASR/transducer/train.py recipes/GigaSpeech/ASR/transducer/hparams/conformer_transducer.yaml --data_folder /path/to/GigaSpeech

I/O Contract

Inputs

Name | Type | Required | Description
batch.sig | torch.Tensor | Yes | Input waveform signal
batch.tokens_bos | torch.Tensor | Yes | Target token sequence with BOS prefix
batch.tokens | torch.Tensor | Yes | Target token sequence

Outputs

Name | Type | Description
p_transducer | torch.Tensor | Transducer joint network log-probabilities
p_ctc | torch.Tensor | CTC log-probabilities from encoder (optional)
p_ce | torch.Tensor | Cross-entropy predictions from decoder (optional)
wav_lens | torch.Tensor | Relative waveform lengths
hyps | list | Beam search hypotheses (at validation/test)
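The wav_lens output holds relative rather than absolute lengths. SpeechBrain expresses each length as a fraction of the longest (padded) item in the batch, so the longest utterance has length 1.0. The plain-Python sketch below shows the conversion in both directions; the helper names `to_relative_lengths` and `to_absolute_lengths` are hypothetical and used only for illustration.

```python
from typing import List


def to_relative_lengths(sample_counts: List[int]) -> List[float]:
    """Express each length as a fraction of the longest item in the batch."""
    longest = max(sample_counts)
    return [n / longest for n in sample_counts]


def to_absolute_lengths(relative_lengths: List[float], padded_length: int) -> List[int]:
    """Recover absolute lengths from relative ones and the padded length."""
    return [round(r * padded_length) for r in relative_lengths]


if __name__ == "__main__":
    counts = [16000, 8000, 4000]  # samples per utterance in one batch
    rel = to_relative_lengths(counts)
    print(rel)
    print(to_absolute_lengths(rel, padded_length=max(counts)))
```

Keeping lengths relative means the same values stay valid after the encoder downsamples the time axis, since every item in the batch shrinks by the same factor.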

Usage Examples

python train.py hparams/conformer_transducer.yaml --data_folder /path/to/GigaSpeech
