Implementation:Speechbrain Speechbrain Train KsponSpeech

Knowledge Sources	SpeechBrain
Domains	ASR, Training
Last Updated	2026-02-09 00:00 GMT

Overview

Concrete tool, provided by the SpeechBrain library, for training a Transformer ASR model on the KsponSpeech dataset.

Description

This recipe defines the ASR class (a subclass of sb.core.Brain) for Transformer- or Conformer-based speech recognition on the KsponSpeech Korean dataset (965.2 hours). The architecture combines a CNN frontend with a Transformer or Conformer encoder-decoder and is trained with a joint CTC/attention objective and label smoothing. Feature augmentation is supported during training. At evaluation, beam search decoding is coupled with a Transformer language model. The final model is obtained by averaging the last 5 checkpoints.
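The checkpoint averaging described above can be pictured as a parameter-wise mean over saved model states. The sketch below is a minimal illustration, not the recipe's own code: `average_checkpoints` is a hypothetical helper, and the toy states hold plain floats (the same arithmetic would apply to tensors of matching shape).

```python
def average_checkpoints(state_dicts):
    """Return the parameter-wise mean of several model states.

    Hypothetical helper for illustration only. Each state maps a
    parameter name to a value that supports + and / (floats here,
    tensors in a real training setup).
    """
    if not state_dicts:
        raise ValueError("need at least one checkpoint")
    n = len(state_dicts)
    return {
        name: sum(state[name] for state in state_dicts) / n
        for name in state_dicts[0]
    }


# Toy example: five "checkpoints" with two scalar parameters each.
checkpoints = [{"w": float(i), "b": 1.0} for i in range(1, 6)]
averaged = average_checkpoints(checkpoints)
```

Averaging tends to smooth out the noise of individual checkpoints, which is presumably why the recipe evaluates the averaged model rather than a single one.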

Usage

Use this recipe to train a Transformer or Conformer ASR model with CTC/attention joint decoding on the KsponSpeech Korean dataset. It requires the corresponding hyperparameter YAML file and the data preparation script.

Code Reference

Source Location

Repository: SpeechBrain
File: recipes/KsponSpeech/ASR/transformer/train.py

Signature

class ASR(sb.core.Brain):
    def compute_forward(self, batch, stage):
        """Forward computations from the waveform batches to the output probabilities."""
        ...
    def compute_objectives(self, predictions, batch, stage):
        """Computes the loss (CTC+NLL) given predictions and targets."""
        ...
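The "CTC+NLL" loss named in `compute_objectives` is commonly formed as a weighted interpolation of the two terms, with label smoothing applied to the NLL part. The following sketch shows that arithmetic in plain Python under stated assumptions: the function names, the `ctc_weight` value of 0.3, and the uniform label-smoothing variant are illustrative choices, not values taken from the recipe's YAML file.

```python
import math


def label_smoothed_nll(log_probs, target, smoothing=0.1):
    """Negative log-likelihood against a smoothed target distribution.

    Assumes the uniform variant: every class receives smoothing / vocab_size,
    and the target class additionally receives (1 - smoothing).
    """
    vocab_size = len(log_probs)
    uniform = smoothing / vocab_size
    loss = 0.0
    for index, log_p in enumerate(log_probs):
        weight = uniform + (1.0 - smoothing if index == target else 0.0)
        loss -= weight * log_p
    return loss


def joint_loss(loss_ctc, loss_seq, ctc_weight=0.3):
    """Interpolate the CTC loss and the attention (seq2seq) loss."""
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_seq


# Toy decoder step over a 4-symbol vocabulary; the correct class is index 2.
probs = [0.1, 0.2, 0.6, 0.1]
log_probs = [math.log(p) for p in probs]
loss_seq = label_smoothed_nll(log_probs, target=2, smoothing=0.1)
loss_ctc = 1.5  # placeholder standing in for a real CTC loss value
loss = joint_loss(loss_ctc, loss_seq, ctc_weight=0.3)
```

With `smoothing=0.0` the first function reduces to the ordinary NLL of the target class; a non-zero value penalizes over-confident predictions.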

Import

# Run as a recipe script from the recipe directory (the hparams path is relative to it)
cd recipes/KsponSpeech/ASR/transformer
python train.py hparams/conformer_medium.yaml --data_folder /path/to/KsponSpeech

I/O Contract

Inputs

Name	Type	Required	Description
batch.sig	torch.Tensor	Yes	Input waveform signal
batch.tokens_bos	torch.Tensor	Yes	Target token sequence with BOS prefix
batch.tokens_eos	torch.Tensor	Yes	Target token sequence with EOS suffix
batch.tokens	torch.Tensor	Yes	Target token sequence (for CTC)
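The three token inputs in the table are variants of the same label sequence. As a minimal sketch, assuming hypothetical index values for BOS and EOS (the real ones come from the hyperparameter file), they can be derived like this:

```python
def make_targets(tokens, bos_index, eos_index):
    """Build the three target variants from one encoded transcript.

    Hypothetical helper for illustration; the recipe itself produces
    tensors, while plain lists are used here for readability.
    """
    tokens = list(tokens)
    return {
        "tokens": tokens,                     # CTC target
        "tokens_bos": [bos_index] + tokens,   # decoder input (teacher forcing)
        "tokens_eos": tokens + [eos_index],   # decoder target
    }


# bos_index=1 and eos_index=2 are assumed values for this example.
targets = make_targets([5, 6, 7], bos_index=1, eos_index=2)
```

`tokens_bos` and `tokens_eos` have the same length and are offset by one position, so the decoder predicts each next token from the tokens before it.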

Outputs

Name	Type	Description
p_ctc	torch.Tensor	CTC log-probabilities from encoder
p_seq	torch.Tensor	Seq2seq log-probabilities from Transformer decoder
wav_lens	torch.Tensor	Relative waveform lengths
hyps	list	Beam search hypotheses (at validation/test)
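The "relative" lengths in `wav_lens` are fractions of the longest item in the padded batch. The sketch below illustrates that convention with hypothetical helper names and plain Python lists instead of tensors:

```python
def relative_lengths(num_samples):
    """Express each length as a fraction of the longest item in the batch."""
    longest = max(num_samples)
    return [n / longest for n in num_samples]


def absolute_lengths(rel_lens, padded_len):
    """Recover sample counts from relative lengths and the padded length."""
    return [round(r * padded_len) for r in rel_lens]


# Three utterances of 1.0 s, 0.5 s and 0.75 s at an assumed 16 kHz sample rate.
wav_lens = relative_lengths([16000, 8000, 12000])
restored = absolute_lengths(wav_lens, padded_len=16000)
```

Storing fractions rather than sample counts keeps the lengths meaningful after the frontend changes the time resolution, since the same fraction applies to the downsampled sequence.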

Usage Examples

python train.py hparams/conformer_medium.yaml --data_folder /path/to/KsponSpeech

Related Pages

Principle:Speechbrain_Speechbrain_Transformer_ASR_Training
