Implementation:Speechbrain Speechbrain Train PeoplesSpeech

From Leeroopedia


Knowledge Sources
Domains ASR, Training
Last Updated 2026-02-09 00:00 GMT

Overview

Concrete tool, provided by the SpeechBrain library, for training a Conformer ASR model on the People's Speech dataset.

Description

This recipe defines the ASR class (a subclass of sb.core.Brain) for Conformer-based speech recognition on the People's Speech dataset (28,000 hours). The model combines a CNN frontend, a Conformer encoder, and a Transformer decoder, and is trained with a joint CTC/attention objective with label smoothing. Feature augmentation is supported during training. Beam search decoding is applied at the validation and test stages. Subword units come from BPE tokenization via SentencePiece.
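The joint objective described above can be sketched in plain Python. This is a minimal illustration, not the recipe's code: the function names, the default `ctc_weight=0.3`, and the uniform label-smoothing formulation are assumptions chosen for clarity, and the actual weighting and smoothing values come from the hyperparameter YAML file.

```python
import math


def label_smoothed_nll(log_probs, target, smoothing=0.1):
    """Label-smoothed negative log-likelihood for one decoding step.

    log_probs: list of log-probabilities over the vocabulary.
    target:    index of the reference token.
    Uses the common uniform-smoothing form (an assumption here):
    (1 - eps) * NLL(target) + eps * mean NLL over all classes.
    """
    nll = -log_probs[target]
    uniform = -sum(log_probs) / len(log_probs)
    return (1.0 - smoothing) * nll + smoothing * uniform


def joint_loss(loss_ctc, loss_seq, ctc_weight=0.3):
    """Interpolate the CTC loss and the attention (seq2seq) loss."""
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_seq


# Toy example: a 4-token vocabulary with a peaked prediction.
step_log_probs = [math.log(p) for p in (0.7, 0.1, 0.1, 0.1)]
loss_seq = label_smoothed_nll(step_log_probs, target=0)
loss = joint_loss(loss_ctc=2.0, loss_seq=loss_seq)
```

A larger `ctc_weight` pushes the encoder toward monotonic alignments, while a smaller one leaves more of the training signal to the decoder.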

Usage

Use this recipe to train a Conformer ASR model with joint CTC/attention decoding on the People's Speech dataset. It requires the corresponding hyperparameter YAML file and data preparation script.

Code Reference

Source Location

Signature

import speechbrain as sb


class ASR(sb.core.Brain):
    def compute_forward(self, batch, stage):
        """Forward computations from the waveform batches to the output probabilities."""
        ...

    def compute_objectives(self, predictions, batch, stage):
        """Computes the loss (CTC+NLL) given predictions and targets."""
        ...

Import

# Run as recipe script
python recipes/PeoplesSpeech/ASR/transformer/train.py hparams/conformer_large.yaml --data_folder /path/to/PeoplesSpeech

I/O Contract

Inputs

Name | Type | Required | Description
batch.sig | torch.Tensor | Yes | Input waveform signal
batch.tokens_bos | torch.Tensor | Yes | Target token sequence with BOS prefix
batch.tokens_eos | torch.Tensor | Yes | Target token sequence with EOS suffix
batch.tokens | torch.Tensor | Yes | Target token sequence (for CTC)
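The relationship between the three token inputs can be illustrated as follows. This is a sketch using plain lists rather than tensors; the `bos_index` and `eos_index` values and the helper name `make_targets` are hypothetical, and the real indices are set in the hyperparameter file.

```python
def make_targets(tokens, bos_index=1, eos_index=2):
    """Derive the three target variants from one encoded transcript.

    tokens_bos feeds the decoder (teacher forcing), tokens_eos is what the
    decoder should predict, and tokens is the unmodified CTC target.
    The index values are illustrative assumptions.
    """
    return {
        "tokens_bos": [bos_index] + tokens,
        "tokens_eos": tokens + [eos_index],
        "tokens": list(tokens),
    }


targets = make_targets([17, 42, 8])
```

The BOS and EOS variants have the same length and are shifted by one position relative to each other, so the decoder input at step t lines up with the token it should predict at step t.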

Outputs

Name | Type | Description
p_ctc | torch.Tensor | CTC log-probabilities from encoder
p_seq | torch.Tensor | Seq2seq log-probabilities from Transformer decoder
wav_lens | torch.Tensor | Relative waveform lengths
hyps | list | Beam search hypotheses (at validation/test)
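To make "relative waveform lengths" concrete, the sketch below assumes the usual convention in which each length is expressed as a fraction of the longest (padded) item in the batch. The helper names are hypothetical and plain lists stand in for tensors.

```python
def relative_lengths(num_samples):
    """Express each utterance length as a fraction of the longest one."""
    longest = max(num_samples)
    return [n / longest for n in num_samples]


def absolute_lengths(rel_lens, padded_len):
    """Recover sample counts from relative lengths and the padded length."""
    return [round(r * padded_len) for r in rel_lens]


# Three utterances padded to the longest one (16000 samples).
lengths = [16000, 8000, 12000]
wav_lens = relative_lengths(lengths)
restored = absolute_lengths(wav_lens, padded_len=max(lengths))
```

Keeping lengths relative means the same values remain valid after the frontend and encoder downsample the time axis, since only the padded length changes.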

Usage Examples

python train.py hparams/conformer_large.yaml --data_folder /path/to/PeoplesSpeech