Implementation:Speechbrain Speechbrain Hparams CommonVoice Conformer Transducer

Knowledge Sources	SpeechBrain
Domains	ASR, Configuration
Last Updated	2026-02-09 00:00 GMT

Overview

Hyperparameter configuration for Conformer Transducer ASR training on the CommonVoice dataset.

Description

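HyperPyYAML configuration file that defines the model architecture, training schedule, and data processing pipeline for end-to-end ASR on CommonVoice data, using a Conformer encoder and an LSTM transducer decoder. Training combines a Transducer loss with a CTC loss (ctc_weight: 0.4) and an optional cross-entropy (CE) loss, which carries zero weight in this file (ce_weight: 0.0). Text is tokenized with BPE unigram tokenization. The configuration supports Dynamic Chunk Training for streaming, with configurable chunk sizes and left context (see streaming, chunkwise_prob, chunk_size_min, and chunk_size_max under Key Parameters). Data augmentation is subject to a configurable warmup period (augment_warmup).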
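
The loss mix described above can be illustrated with a small sketch. It assumes the three terms form a convex combination in which the Transducer loss receives whatever weight ctc_weight and ce_weight leave over; this page does not state the exact formula, so the function name combine_losses and the weighting rule are hypothetical and the training script should be treated as the authority.

```python
def combine_losses(transducer_loss, ctc_loss, ce_loss,
                   ctc_weight=0.4, ce_weight=0.0):
    """Weighted sum of the three losses (illustrative assumption).

    The Transducer term is assumed to take the remaining weight,
    1 - (ctc_weight + ce_weight). Defaults mirror the values listed
    under Key Parameters.
    """
    aux_weight = ctc_weight + ce_weight
    if not 0.0 <= aux_weight <= 1.0:
        raise ValueError("ctc_weight + ce_weight must lie in [0, 1]")
    transducer_weight = 1.0 - aux_weight
    return (transducer_weight * transducer_loss
            + ctc_weight * ctc_loss
            + ce_weight * ce_loss)


# With the defaults, the CE term drops out entirely:
# 0.6 * 10.0 + 0.4 * 5.0 + 0.0 * 2.0
total = combine_losses(transducer_loss=10.0, ctc_loss=5.0, ce_loss=2.0)
```

Under this assumption, setting ce_weight to 0.0 is what makes the CE loss optional: the term is still computed in the sketch but contributes nothing to the total.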

Usage

Pass this YAML file as the first argument to the corresponding training script, followed by required values such as --data_folder.

Code Reference

Source Location

Repository: SpeechBrain
File: recipes/CommonVoice/ASR/transducer/hparams/conformer_transducer.yaml

Key Parameters

seed: 3407
number_of_epochs: 100
optimizer_step_limit: 90000
warmup_steps: 25000
augment_warmup: 5000
lr: 0.0008
weight_decay: 0.01
ctc_weight: 0.4
ce_weight: 0.0
precision: fp32

# Feature parameters
sample_rate: 16000
n_fft: 512
n_mels: 80

# Streaming & Dynamic Chunk Training
streaming: True
chunkwise_prob: 0.6
chunk_size_min: 8
chunk_size_max: 32
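
To make the streaming values above more concrete, the following sketch shows one way per-batch chunk sampling for Dynamic Chunk Training could work. It assumes that chunkwise_prob is the probability of training a batch with chunked attention, that chunk sizes are drawn uniformly with both bounds inclusive, and that the remaining batches use full context. The function sample_chunk_size is hypothetical, not SpeechBrain's API, and left-context sampling is omitted because its parameters are not listed on this page.

```python
import random


def sample_chunk_size(rng, chunkwise_prob=0.6,
                      chunk_size_min=8, chunk_size_max=32):
    """Pick a chunk size for one batch, or None for full context.

    Assumptions (not confirmed by this page): the chunk size is drawn
    uniformly from [chunk_size_min, chunk_size_max], both inclusive,
    and a batch is chunked with probability chunkwise_prob.
    """
    if rng.random() >= chunkwise_prob:
        return None  # train this batch with full (non-streaming) context
    return rng.randint(chunk_size_min, chunk_size_max)


# Draw a configuration for a handful of batches with a fixed seed.
rng = random.Random(3407)
sampled = [sample_chunk_size(rng) for _ in range(5)]
```

Mixing chunked and full-context batches in this way is what would let a single model serve both streaming and offline decoding; the 0.6 value means roughly six in ten batches would be chunked under these assumptions.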

I/O Contract

Inputs

Name	Type	Required	Description
--data_folder	str	Yes	Path to CommonVoice dataset

Outputs

Name	Type	Description
Instantiated objects	Python objects	Model, optimizer, scheduler, etc.

Usage Examples

python train.py hparams/conformer_transducer.yaml --data_folder /path/to/CommonVoice

Related Pages

Principle:Speechbrain_Speechbrain_HyperPyYAML_Configuration
