Principle:Speechbrain Speechbrain HyperPyYAML Configuration

Field	Value
Principle Name	HyperPyYAML_Configuration
Description	Declarative configuration system that instantiates Python objects directly from YAML using special tags
Domains	Configuration, Experiment_Management
Knowledge Sources	HyperPyYAML docs, SpeechBrain docs
Related Implementation	Implementation:Speechbrain_Speechbrain_Load_Hyperpyyaml

Overview

HyperPyYAML is a configuration system that extends standard YAML with special tags, so that a YAML file can directly instantiate Python objects, reference other YAML values, and call functions. This narrows the gap between configuration and code: the configuration file itself becomes a declarative specification of the entire experiment setup, including model architectures, optimizers, schedulers, loss functions, and data processing components.

Theoretical Foundation

In deep learning experiment management, there is a fundamental tension between flexibility (the ability to easily change any aspect of an experiment) and reproducibility (the ability to exactly recreate an experiment). Traditional approaches fall into two camps:

Code-centric -- all configuration is hard-coded, so every change requires modifying code
Config-centric -- configuration files specify hyperparameters, but object construction still happens in code

HyperPyYAML takes the config-centric approach to its logical extreme: the configuration file itself specifies which classes to instantiate and how to construct them. This means that:

The training script becomes a generic loop with minimal experiment-specific code
The YAML file is the single source of truth for the entire experiment
Changing the model architecture, optimizer, or any other component requires only editing the YAML file
Command-line overrides allow systematic hyperparameter sweeps without code changes

Special Tags

HyperPyYAML extends YAML with several special tags:

`!new:` -- Object Instantiation

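Creates a new instance of a Python class. Constructor arguments are specified as nested YAML keys. The example also uses `!name:` (described later in this section) to pass the layer as a class rather than as an instance.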

enc: !new:speechbrain.nnet.containers.Sequential
    input_shape: [null, null, 1024]
    linear1: !name:speechbrain.nnet.linear.Linear
        n_neurons: 1024
        bias: True
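
Conceptually, `!new:` resolves a dotted path to a class and calls it with the nested keys as keyword arguments. The sketch below mimics that effect with the standard library only. The helper name `instantiate` is hypothetical, and this is an illustration of the idea rather than HyperPyYAML's actual implementation.

```python
import importlib


def instantiate(dotted_path, **kwargs):
    """Import `package.module.ClassName` and call it with keyword arguments.

    Hypothetical helper that mimics the effect of a `!new:` tag.
    """
    module_path, _, class_name = dotted_path.rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(**kwargs)


# Roughly what a `!new:fractions.Fraction` entry with nested keys
# `numerator: 3` and `denominator: 4` would produce.
frac = instantiate("fractions.Fraction", numerator=3, denominator=4)
print(frac)  # 3/4
```

A standard-library class is used here only so the sketch runs without SpeechBrain installed; the same lookup-and-call pattern applies to any importable class.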

`!ref` -- Cross-Reference

References another value defined elsewhere in the YAML file using angle-bracket syntax.

seed: 1234
output_folder: !ref results/experiment/<seed>
save_folder: !ref <output_folder>/save
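
The substitution performed by `!ref` in the example above can be approximated with a small resolver. This is a simplified sketch under stated assumptions: it treats every string value as a reference expression (the real system only resolves values tagged with `!ref`), works on a flat dictionary, and does no cycle detection. The names `resolve_refs` and `REF_PATTERN` are hypothetical.

```python
import re

# Matches placeholders such as <seed> or <output_folder>.
REF_PATTERN = re.compile(r"<([A-Za-z_][A-Za-z0-9_]*)>")


def resolve_refs(config):
    """Expand `<key>` placeholders in the string values of a flat dict."""
    resolved = {}

    def lookup(key):
        # Resolve each key once; referenced keys are resolved recursively.
        if key not in resolved:
            value = config[key]
            if isinstance(value, str):
                value = REF_PATTERN.sub(lambda m: str(lookup(m.group(1))), value)
            resolved[key] = value
        return resolved[key]

    for key in config:
        lookup(key)
    return resolved


config = {
    "seed": 1234,
    "output_folder": "results/experiment/<seed>",
    "save_folder": "<output_folder>/save",
}
print(resolve_refs(config)["save_folder"])  # results/experiment/1234/save
```

Because `save_folder` refers to `output_folder`, which in turn refers to `seed`, changing the seed alone moves every derived path.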

`!name:` -- Class/Function Reference

Provides a reference to a class or function without calling it; any nested keys are bound as default keyword arguments. This is useful for passing constructors (e.g., optimizer classes that still need the model parameters) or loss functions that will be called later.

model_opt_class: !name:torch.optim.Adadelta
    lr: !ref <lr>
    rho: 0.95
    eps: 1.e-8

ctc_cost: !name:speechbrain.nnet.losses.ctc_loss
    blank_index: !ref <blank_index>
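
A value produced by `!name:` behaves much like a `functools.partial` object: the nested keys are pre-bound and the remaining arguments are supplied at call time. The sketch below shows that behavior with a hypothetical `scaled_loss` function standing in for a real loss; nothing here is taken from SpeechBrain.

```python
from functools import partial


def scaled_loss(predictions, targets, scale=1.0, reduction="mean"):
    """Hypothetical loss: scaled absolute error (stand-in for a real loss)."""
    errors = [scale * abs(p - t) for p, t in zip(predictions, targets)]
    total = sum(errors)
    return total / len(errors) if reduction == "mean" else total


# Roughly what a `!name:` tag pointing at `scaled_loss` with a nested
# `scale: 2.0` key would store: a callable with `scale` pre-bound,
# not a computed value.
cost = partial(scaled_loss, scale=2.0)

# The training code supplies the remaining arguments later.
print(cost([1.0, 2.0], [0.0, 4.0]))  # 3.0
```

The same pattern explains the optimizer example: the class and its hyperparameters are fixed in YAML, while the model parameters are passed in by the training script once the model exists.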

`!apply:` -- Immediate Function Call

Calls a function immediately during YAML parsing and stores the result.

__set_seed: !apply:speechbrain.utils.seed_everything [!ref <seed>]
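
The difference from `!name:` is timing: the call happens while the file is being loaded, and only its return value is kept. A minimal sketch of that behavior, using a hypothetical helper `apply_now` and standard-library functions as stand-ins:

```python
import importlib


def apply_now(dotted_path, args):
    """Import `package.module.function` and call it immediately.

    Hypothetical helper that mimics the effect of an `!apply:` tag
    given a list of positional arguments.
    """
    module_path, _, func_name = dotted_path.rpartition(".")
    func = getattr(importlib.import_module(module_path), func_name)
    return func(*args)


# The stored value is the call's result, not the function itself.
joined = apply_now("posixpath.join", ["results", "1234"])
print(joined)  # results/1234

# Functions called only for their side effects (such as seeding)
# typically return None, so the key holds nothing useful afterwards.
seeded = apply_now("random.seed", [1234])
print(seeded)  # None
```

This suggests why the seeding entry above is stored under a name like `__set_seed`: the call matters for its side effect, and the key itself is not meant to be read.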

Override Mechanism

HyperPyYAML supports command-line overrides that allow modifying any value in the YAML file without editing the file itself. Overrides are passed as a string of YAML-formatted key-value pairs:

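import sys
import speechbrain as sb
from hyperpyyaml import load_hyperpyyaml
hparams_file, run_opts, overrides = sb.parse_arguments(sys.argv[1:])
with open(hparams_file) as fin:
    hparams = load_hyperpyyaml(fin, overrides)  # overrides replace values from the file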

For example, a run can change the learning rate and the number of epochs from the command line:

python train.py hparams/train.yaml --lr=0.001 --number_of_epochs=50
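
To make the "string of YAML-formatted key-value pairs" concrete, the sketch below converts command-line flags into such a string. It is a simplified illustration with a hypothetical helper name, `overrides_to_yaml`, and is not SpeechBrain's actual argument-handling code.

```python
def overrides_to_yaml(cli_args):
    """Turn `--key=value` or `--key value` arguments into `key: value` lines.

    Hypothetical helper; flags without a value are silently skipped.
    """
    lines = []
    pending_key = None
    for arg in cli_args:
        if arg.startswith("--"):
            key, sep, value = arg[2:].partition("=")
            if sep:
                lines.append(f"{key}: {value}")
                pending_key = None
            else:
                # Value is expected in the next argument.
                pending_key = key
        elif pending_key is not None:
            lines.append(f"{pending_key}: {arg}")
            pending_key = None
    return "\n".join(lines)


print(overrides_to_yaml(["--lr=0.001", "--number_of_epochs", "50"]))
# lr: 0.001
# number_of_epochs: 50
```

Since the result is itself YAML, the loader can treat each line as a replacement for the key of the same name in the file.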

Design Benefits

Reproducibility -- the YAML file captures the complete experiment specification; saving it, together with any command-line overrides, alongside the results makes a run much easier to recreate
Readability -- all hyperparameters and component definitions are in one human-readable file
Composability -- components can reference each other via !ref, making it easy to ensure consistency (e.g., the same learning rate value is used for both the scheduler and optimizer)
Separation of concerns -- the training script contains only the training logic, while all experiment-specific configuration lives in YAML
Rapid experimentation -- switching between different model architectures, optimizers, or data augmentation strategies requires only changing the YAML file

Typical Usage Pattern in CTC ASR

In the CTC ASR training workflow, the YAML file defines:

Data paths and language settings
Training hyperparameters (learning rate, epochs, batch size)
Model architecture (wav2vec2 encoder, DNN layers, CTC linear output)
Optimizers and learning rate schedulers
Data augmentation pipeline
Checkpointing configuration
Metric computers (WER, CER)

The training script loads this YAML, and the returned hparams dictionary contains instantiated Python objects (from !new:) and partially configured callables (from !name:), ready for use.
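
How a generic script might consume such a dictionary can be sketched with toy stand-ins. Everything below is hypothetical: `ToyModel`, `ToySGD`, and the key names `model` and `opt_class` are illustrative, and the dictionary is built by hand rather than loaded from YAML so the example runs without SpeechBrain.

```python
from functools import partial


class ToyModel:
    """Stand-in for an instantiated model (what `!new:` would provide)."""

    def __init__(self):
        self.weight = 0.0


class ToySGD:
    """Stand-in optimizer: nudges the weight toward a target value."""

    def __init__(self, model, lr):
        self.model, self.lr = model, lr

    def step(self, target):
        self.model.weight += self.lr * (target - self.model.weight)


def train(hparams, target=1.0):
    """Generic loop: everything experiment-specific comes from `hparams`."""
    model = hparams["model"]
    optimizer = hparams["opt_class"](model)  # construction finishes at run time
    for _ in range(hparams["number_of_epochs"]):
        optimizer.step(target)
    return model.weight


# Hand-built stand-in for the dictionary a loader would return.
hparams = {
    "number_of_epochs": 3,
    "model": ToyModel(),
    "opt_class": partial(ToySGD, lr=0.5),
}
print(train(hparams))  # 0.875
```

The `train` function never names a concrete model or optimizer, so swapping either one only requires a different `hparams` dictionary.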

Related Concepts

Implementation:Speechbrain_Speechbrain_Load_Hyperpyyaml -- the concrete function for loading HyperPyYAML files
The !new: tag is central to SpeechBrain's approach of defining model architectures declaratively
Command-line overrides integrate with speechbrain.parse_arguments() for experiment management
