Implementation:Speechbrain Speechbrain Hparams AISHELL1 Transformer
| Knowledge Sources | |
|---|---|
| Domains | ASR, Configuration |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Hyperparameter configuration for Transformer ASR training on the AISHELL-1 dataset.
Description
A HyperPyYAML configuration file that defines the model architecture, training schedule, and data processing pipeline for end-to-end ASR with a Transformer encoder-decoder on AISHELL-1 Mandarin Chinese data. Training combines a CTC loss with a KLdiv (label smoothing) loss, and transcripts are tokenized with BPE unigram tokenization. The configuration supports waveform and noise augmentation, dynamic batching, and a configurable numeric precision (fp16, bf16, or fp32).
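To make the role of the loss weighting concrete, here is a minimal sketch of how a CTC loss and a label-smoothed KLdiv loss might be combined. It assumes a linear interpolation controlled by `ctc_weight` (0.3 in this file), which is a common convention for joint CTC/attention training; the training script itself is not shown on this page, so the function name `combine_losses` and the exact formula are illustrative assumptions rather than the recipe's confirmed code.

```python
def combine_losses(loss_ctc: float, loss_seq: float, ctc_weight: float = 0.3) -> float:
    """Interpolate between a CTC loss and a sequence (KLdiv) loss.

    Assumption: total = ctc_weight * loss_ctc + (1 - ctc_weight) * loss_seq.
    `combine_losses` is a hypothetical helper, not a SpeechBrain function.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_seq


if __name__ == "__main__":
    # Made-up loss values, used only to show the weighting.
    print(combine_losses(loss_ctc=2.0, loss_seq=1.0))
```

Under this assumption, a `ctc_weight` of 0.3 lets the decoder's KLdiv term dominate the total loss while CTC acts as an auxiliary objective on the encoder output.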
Usage
Pass this YAML file as the first argument to the corresponding training script. Values such as data_folder can be supplied or overridden with command-line flags.
Code Reference
Source Location
Key Parameters
seed: 8886
number_of_epochs: 50
batch_size: 8
ctc_weight: 0.3
grad_accumulation_factor: 4
loss_reduction: 'batchmean'
sorting: random
avg_checkpoints: 10
precision: fp32
# Feature parameters
sample_rate: 16000
n_fft: 400
n_mels: 80
# Stages related parameters
stage_one_epochs: 40
lr_adam: 1.0
lr_sgd: 0.000025
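The parameters above imply a few derived quantities that are easy to misread. The sketch below computes three of them under stated assumptions: the effective batch size (assuming fixed-size batches, that is, with dynamic batching disabled), the FFT window duration in milliseconds, and which optimizer is active in a given epoch (assuming the training script uses Adam with `lr_adam` through `stage_one_epochs` and switches to SGD with `lr_sgd` afterwards). The `Hparams` dataclass and helper names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hparams:
    """Subset of the values listed under Key Parameters (illustrative container)."""
    number_of_epochs: int = 50
    batch_size: int = 8
    grad_accumulation_factor: int = 4
    sample_rate: int = 16000
    n_fft: int = 400
    stage_one_epochs: int = 40


def effective_batch_size(hp: Hparams) -> int:
    # Utterances contributing to one optimizer step, assuming fixed-size batches.
    return hp.batch_size * hp.grad_accumulation_factor


def fft_window_ms(hp: Hparams) -> float:
    # Duration covered by n_fft samples at the given sample rate.
    return 1000.0 * hp.n_fft / hp.sample_rate


def optimizer_for_epoch(hp: Hparams, epoch: int) -> str:
    # Assumption: epochs are 1-indexed and the switch happens after stage one.
    if not 1 <= epoch <= hp.number_of_epochs:
        raise ValueError("epoch out of range")
    return "adam" if epoch <= hp.stage_one_epochs else "sgd"


if __name__ == "__main__":
    hp = Hparams()
    print(effective_batch_size(hp))     # 8 * 4 = 32
    print(fft_window_ms(hp))            # 400 / 16000 s = 25.0 ms
    print(optimizer_for_epoch(hp, 40))  # adam
    print(optimizer_for_epoch(hp, 41))  # sgd
```

If dynamic batching is enabled, `batch_size` may not determine the number of utterances per step, so the first figure should be read as a nominal value only.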
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| --data_folder | str | Yes | Path to the AISHELL-1 dataset |
Outputs
| Name | Type | Description |
|---|---|---|
| Instantiated objects | Python objects | Objects declared in the YAML (model, optimizer, scheduler, etc.), instantiated when the file is loaded |
Usage Examples
python train.py hparams/train_ASR_transformer.yaml --data_folder /path/to/aishell