Implementation:Speechbrain Speechbrain Train AISHELL1 Transformer
| Knowledge Sources | |
|---|---|
| Domains | ASR, Training |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool, provided by the SpeechBrain library, for training a Transformer ASR model on the AISHELL-1 dataset.
Description
This recipe defines the ASR class (a subclass of sb.core.Brain) for Transformer-based speech recognition on Mandarin Chinese. The architecture uses a CNN frontend followed by a Transformer encoder-decoder. Training is joint CTC/attention: a CTC loss on the encoder output is combined with a KLdiv loss (with label smoothing) on the decoder output. At the validation and test stages, hypotheses are produced by beam search with joint CTC/attention scoring.
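The joint objective described above can be sketched in plain Python. This is an illustrative sketch, not the recipe's code: the function names, the `ctc_weight` interpolation parameter, and the particular label-smoothing convention (the smoothing mass spread evenly over the non-target classes) are assumptions made for the example.

```python
def joint_ctc_attention_loss(loss_ctc: float, loss_seq: float, ctc_weight: float) -> float:
    """Interpolate a CTC loss and a seq2seq loss (hypothetical helper)."""
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_seq


def smoothed_targets(target_index: int, vocab_size: int, smoothing: float) -> list[float]:
    """Build a label-smoothed target distribution for one token.

    Assumed convention: the target class keeps 1 - smoothing and the
    remaining mass is shared equally by the other vocab_size - 1 classes.
    """
    off_target = smoothing / (vocab_size - 1)
    distribution = [off_target] * vocab_size
    distribution[target_index] = 1.0 - smoothing
    return distribution


if __name__ == "__main__":
    # Scalar losses stand in for the batch-level CTC and KLdiv values.
    print(joint_ctc_attention_loss(loss_ctc=2.0, loss_seq=1.0, ctc_weight=0.25))
    print(smoothed_targets(target_index=2, vocab_size=5, smoothing=0.1))
```

A KLdiv loss then compares the decoder's predicted distribution against such smoothed targets rather than against one-hot labels.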
Usage
Use this recipe to train a Transformer ASR model with CTC/attention joint decoding on the AISHELL-1 Mandarin Chinese dataset. It requires the corresponding hyperparameter YAML file and the data preparation script.
Code Reference
Source Location
- Repository: SpeechBrain
- File: recipes/AISHELL-1/ASR/transformer/train.py
Signature
import speechbrain as sb


class ASR(sb.core.Brain):
    def compute_forward(self, batch, stage):
        """Forward computations from the waveform batches to the output probabilities."""
        ...

    def compute_objectives(self, predictions, batch, stage):
        """Computes the loss (CTC+NLL) given predictions and targets."""
        ...
Import
# Run as recipe script
python recipes/AISHELL-1/ASR/transformer/train.py hparams/train_ASR_transformer.yaml --data_folder /path/to/aishell
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| batch.sig | torch.Tensor | Yes | Input waveform signal |
| batch.tokens_bos | torch.Tensor | Yes | Target token sequence with BOS prefix |
| batch.tokens_eos | torch.Tensor | Yes | Target token sequence with EOS suffix |
| batch.tokens | torch.Tensor | Yes | Target token sequence (for CTC) |
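The relationship between the three target fields in the table above can be illustrated with a small sketch. It uses plain Python lists instead of tensors, and the helper name and the BOS/EOS index values are assumptions chosen for the example, not values taken from the recipe.

```python
def make_targets(tokens: list[int], bos_index: int, eos_index: int) -> dict[str, list[int]]:
    """Derive decoder input/output sequences from one token sequence (hypothetical helper)."""
    return {
        "tokens": list(tokens),                      # CTC target, no special symbols
        "tokens_bos": [bos_index] + list(tokens),    # decoder input, BOS prefix
        "tokens_eos": list(tokens) + [eos_index],    # decoder target, EOS suffix
    }


if __name__ == "__main__":
    # Assumed special-symbol indices, for illustration only.
    targets = make_targets([17, 42, 8], bos_index=1, eos_index=2)
    for name, sequence in targets.items():
        print(name, sequence)
```

The BOS-prefixed and EOS-suffixed sequences have the same length and are offset by one position, which is what lets the decoder predict each next token from the preceding ones.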
Outputs
| Name | Type | Description |
|---|---|---|
| p_ctc | torch.Tensor | CTC log-probabilities from encoder |
| p_seq | torch.Tensor | Seq2seq log-probabilities from Transformer decoder |
| wav_lens | torch.Tensor | Relative waveform lengths |
| hyps | list | Beam search hypotheses (at validation/test) |
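As a rough illustration of what "relative waveform lengths" means for wav_lens, the sketch below expresses each utterance's length as a fraction of the longest utterance in the batch. The helper name and the sample counts are assumptions for the example; the recipe itself works with tensors produced by the data pipeline.

```python
def relative_lengths(num_samples: list[int]) -> list[float]:
    """Convert absolute lengths (in samples) to fractions of the batch maximum."""
    if not num_samples:
        raise ValueError("batch must contain at least one utterance")
    longest = max(num_samples)
    return [n / longest for n in num_samples]


if __name__ == "__main__":
    # Three utterances of 1.0 s, 0.5 s and 0.75 s, assuming 16 kHz audio.
    print(relative_lengths([16000, 8000, 12000]))  # [1.0, 0.5, 0.75]
```

Relative lengths let padded positions be masked out without the model needing to know the absolute padded size of the batch.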
Usage Examples
python train.py hparams/train_ASR_transformer.yaml --data_folder /path/to/AISHELL-1