Implementation:Speechbrain Speechbrain Hparams Switchboard Transformer

Knowledge Sources	SpeechBrain
Domains	ASR, Configuration
Last Updated	2026-02-09 00:00 GMT

Overview

Hyperparameter configuration for Transformer ASR training on the Switchboard dataset.

Description

HyperPyYAML configuration file that defines the model architecture, training schedule, and data processing pipeline for end-to-end ASR with a Transformer encoder-decoder on Switchboard conversational telephone speech. The model is trained with a combination of CTC and KL-divergence (label-smoothed) losses, weighted by ctc_weight, and uses unigram tokenization. A pre-trained Transformer language model and tokenizer must be provided through pretrained_lm_tokenizer_path. Training runs for 100 epochs. The global batch size, computed as batch_size * n_gpus * grad_accumulation_factor, should be at least 128. The final model is obtained by averaging 5 checkpoints (avg_checkpoints: 5).
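The global batch size rule can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names effective_batch_size and min_gpus_required are hypothetical and not part of SpeechBrain, and the values are the batch_size and grad_accumulation_factor listed under Key Parameters.

```python
import math


def effective_batch_size(batch_size: int, n_gpus: int, grad_accumulation_factor: int) -> int:
    """Global batch size as defined in the description above."""
    return batch_size * n_gpus * grad_accumulation_factor


def min_gpus_required(batch_size: int, grad_accumulation_factor: int, target: int = 128) -> int:
    """Smallest number of GPUs for which the global batch size reaches the target."""
    per_gpu = batch_size * grad_accumulation_factor
    return math.ceil(target / per_gpu)


# Values taken from this configuration file.
BATCH_SIZE = 48
GRAD_ACCUMULATION_FACTOR = 2

single_gpu = effective_batch_size(BATCH_SIZE, 1, GRAD_ACCUMULATION_FACTOR)
needed = min_gpus_required(BATCH_SIZE, GRAD_ACCUMULATION_FACTOR)
print(f"1 GPU gives a global batch size of {single_gpu}; at least {needed} GPU(s) reach 128")
```

With the default values a single GPU falls short of 128, so either more GPUs or a larger grad_accumulation_factor would be needed to satisfy the rule.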

Usage

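Pass this YAML file as the first argument to the corresponding training script. Values defined in the file, such as data_folder, can be overridden with command-line flags of the same name.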

Code Reference

Source Location

Repository: SpeechBrain
File: recipes/Switchboard/ASR/transformer/hparams/transformer.yaml

Key Parameters

seed: 1312
number_of_epochs: 100
batch_size: 48
ctc_weight: 0.3
grad_accumulation_factor: 2
max_grad_norm: 5.0
loss_reduction: batchmean
sorting: random
avg_checkpoints: 5
lr_adam: 0.006

# Transcript normalization
normalize_words: True
max_utt: 300
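To illustrate how ctc_weight is typically used, the sketch below interpolates a CTC loss and a sequence (KL-divergence) loss. This is a minimal sketch under the assumption that the training script forms a convex combination of the two terms. The function name combine_losses and the example loss values are hypothetical and are not taken from the recipe.

```python
def combine_losses(loss_ctc: float, loss_seq: float, ctc_weight: float = 0.3) -> float:
    """Convex combination of the CTC loss and the sequence (KL-divergence) loss.

    Assumption: the total loss is ctc_weight * loss_ctc + (1 - ctc_weight) * loss_seq.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * loss_ctc + (1.0 - ctc_weight) * loss_seq


# Made-up loss values, for illustration only.
total = combine_losses(loss_ctc=2.0, loss_seq=1.0)
print(f"combined loss: {total:.3f}")
```

With ctc_weight: 0.3, the sequence loss contributes 70% of the total and the CTC term acts as an auxiliary objective.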

I/O Contract

Inputs

Name	Type	Required	Description
--data_folder	str	Yes	Path to Switchboard dataset
--pretrained_lm_tokenizer_path	str	Yes	Path to pre-trained LM and tokenizer

Outputs

Name	Type	Description
Instantiated objects	Python objects	Model, optimizer, scheduler, etc.

Usage Examples

python train.py hparams/transformer.yaml --data_folder /path/to/Switchboard
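Because the Inputs table lists two required values, it can help to assemble the command programmatically. The following sketch only builds and prints the command line and does not launch training. The helper build_command and the path /path/to/lm_and_tokenizer are hypothetical placeholders. The flag names are the ones listed in the Inputs table.

```python
import shlex


def build_command(hparams_file: str, overrides: dict, script: str = "train.py") -> list:
    """Return the argv list: the script, the YAML file first, then --key value overrides."""
    cmd = ["python", script, hparams_file]
    for key, value in overrides.items():
        cmd += [f"--{key}", str(value)]
    return cmd


cmd = build_command(
    "hparams/transformer.yaml",
    {
        "data_folder": "/path/to/Switchboard",
        # Placeholder path; point this at the actual pre-trained LM and tokenizer.
        "pretrained_lm_tokenizer_path": "/path/to/lm_and_tokenizer",
    },
)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) if launching from Python is preferred over typing the command in a shell.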

Related Pages

Principle:Speechbrain_Speechbrain_HyperPyYAML_Configuration
