Principle:Speechbrain Speechbrain HyperPyYAML Configuration
| Field | Value |
|---|---|
| Principle Name | HyperPyYAML_Configuration |
| Description | Declarative configuration system that instantiates Python objects directly from YAML using special tags |
| Domains | Configuration, Experiment_Management |
| Knowledge Sources | HyperPyYAML docs, SpeechBrain docs |
| Related Implementation | Implementation:Speechbrain_Speechbrain_Load_Hyperpyyaml |
Overview
HyperPyYAML is a configuration system that extends standard YAML with special tags for instantiating Python objects, referencing other YAML values, and calling functions. This narrows the gap between configuration and code: the configuration file itself becomes a declarative specification of the entire experiment setup, including model architectures, optimizers, schedulers, loss functions, and data processing components.
Theoretical Foundation
In deep learning experiment management, there is a fundamental tension between flexibility (the ability to easily change any aspect of an experiment) and reproducibility (the ability to exactly recreate an experiment). Traditional approaches fall into two camps:
- Code-centric -- all configuration is hard-coded, so every change requires a code modification
- Config-centric -- configuration files specify hyperparameters, but object construction still happens in code
HyperPyYAML takes the config-centric approach to its logical extreme: the configuration file itself specifies which classes to instantiate and how to construct them. This means that:
- The training script becomes a generic loop with minimal experiment-specific code
- The YAML file is the single source of truth for the entire experiment
- Changing the model architecture, optimizer, or any other component requires only editing the YAML file
- Command-line overrides allow systematic hyperparameter sweeps without code changes
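As a rough illustration of the first point, the sketch below uses a hand-built dictionary as a stand-in for a loaded configuration. Every key name and the `sgd_step` helper are hypothetical and not part of SpeechBrain; the point is only that the loop itself knows nothing about the update rule or the epoch count.

```python
from functools import partial


def sgd_step(weight, grad, lr):
    """Plain gradient-descent update (hypothetical helper)."""
    return weight - lr * grad


# Hand-built stand-in for a loaded configuration (hypothetical keys).
hparams = {
    "number_of_epochs": 50,
    "target": 3.0,
    # A callable with a preset argument, supplied by the configuration.
    "update": partial(sgd_step, lr=0.1),
}


def train(hparams):
    """Generic loop: minimise (w - target)^2 with whatever hparams supplies."""
    weight = 0.0
    for _ in range(hparams["number_of_epochs"]):
        grad = 2.0 * (weight - hparams["target"])
        weight = hparams["update"](weight, grad)
    return weight


final_weight = train(hparams)
print(final_weight)
```

Swapping the update rule or the number of epochs only touches the dictionary, which is the role the YAML file plays in a real recipe.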
Special Tags
HyperPyYAML extends YAML with several special tags:
!new: -- Object Instantiation
Creates a new instance of a Python class. Constructor arguments are specified as nested YAML keys.
enc: !new:speechbrain.nnet.containers.Sequential
    input_shape: [null, null, 1024]
    linear1: !name:speechbrain.nnet.linear.Linear
        n_neurons: 1024
        bias: True
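In plain Python terms, the !new: tag roughly amounts to importing the class at the dotted path and calling it with the nested keys as keyword arguments. The following is a minimal standard-library sketch of that idea; `instantiate` is a hypothetical helper written for this illustration, not HyperPyYAML's API, and `datetime.timedelta` stands in for a model class.

```python
import importlib


def instantiate(dotted_path, **kwargs):
    """Import `module.Class` from a dotted path and call it with kwargs."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    cls = getattr(importlib.import_module(module_name), attr_name)
    return cls(**kwargs)


# Comparable in spirit to a YAML entry such as:
#   delta: !new:datetime.timedelta
#       hours: 1
#       minutes: 30
delta = instantiate("datetime.timedelta", hours=1, minutes=30)
print(delta.total_seconds())
```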
!ref -- Cross-Reference
References another value defined elsewhere in the YAML file using angle-bracket syntax.
seed: 1234
output_folder: !ref results/experiment/<seed>
save_folder: !ref <output_folder>/save
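To make the substitution concrete, here is a small standard-library sketch that expands angle-bracket placeholders in a flat dictionary. It is an assumption-laden toy, not HyperPyYAML's resolver: `resolve_refs` is a hypothetical helper that handles only top-level keys and does not detect circular references.

```python
import re


def resolve_refs(config):
    """Expand <key> placeholders in string values using other top-level keys."""
    pattern = re.compile(r"<([^<>]+)>")
    resolved = {}

    def resolve(key):
        # Resolve each key once; nested placeholders are resolved recursively.
        # Note: a circular reference would recurse without end in this toy.
        if key not in resolved:
            value = config[key]
            if isinstance(value, str):
                value = pattern.sub(lambda m: str(resolve(m.group(1))), value)
            resolved[key] = value
        return resolved[key]

    for key in config:
        resolve(key)
    return resolved


config = {
    "seed": 1234,
    "output_folder": "results/experiment/<seed>",
    "save_folder": "<output_folder>/save",
}
paths = resolve_refs(config)
print(paths["save_folder"])  # results/experiment/1234/save
```

Changing `seed` in one place changes both derived paths, which is the consistency that !ref is meant to provide.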
!name: -- Class/Function Reference
Provides a reference to a class or function without instantiating it. This is useful for passing constructors (e.g., optimizer classes) or loss functions that will be called later.
model_opt_class: !name:torch.optim.Adadelta
    lr: !ref <lr>
    rho: 0.95
    eps: 1.e-8
ctc_cost: !name:speechbrain.nnet.losses.ctc_loss
    blank_index: !ref <blank_index>
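HyperPyYAML's documentation describes !name: with keyword arguments as behaving much like `functools.partial`: some arguments are fixed in the configuration and the remaining ones are supplied when the training code finally makes the call. A minimal sketch of that deferred-call pattern, where `weighted_error` is a hypothetical stand-in for a loss function:

```python
from functools import partial


def weighted_error(prediction, target, weight=1.0):
    """Hypothetical loss: weighted absolute error."""
    return weight * abs(prediction - target)


# Comparable in spirit to a !name: entry with `weight: 0.5` nested under it.
cost = partial(weighted_error, weight=0.5)

# Nothing has been computed so far; the caller supplies the rest later.
loss = cost(3.0, 1.0)
print(loss)  # 1.0
```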
!apply: -- Immediate Function Call
Calls a function immediately during YAML parsing and stores the result.
__set_seed: !apply:speechbrain.utils.seed_everything [!ref <seed>]
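The contrast with !name: is that the call happens once, while the configuration is being built, and only its return value is stored. The sketch below mimics this with Python's `random.seed` in place of `speechbrain.utils.seed_everything`; the dictionary is a hand-built stand-in for the loaded configuration, not output from HyperPyYAML.

```python
import random

config = {"seed": 1234}

# Comparable in spirit to: __set_seed: !apply:random.seed [!ref <seed>]
# The function runs immediately; its return value (None here) is what is stored.
config["__set_seed"] = random.seed(config["seed"])

first = random.random()

# Re-seeding with the same value reproduces the same draw.
random.seed(config["seed"])
second = random.random()
print(first == second)  # True
```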
Override Mechanism
HyperPyYAML supports command-line overrides that allow modifying any value in the YAML file without editing the file itself. Overrides are passed as a string of YAML-formatted key-value pairs:
import sys
import speechbrain as sb
from hyperpyyaml import load_hyperpyyaml

hparams_file, run_opts, overrides = sb.parse_arguments(sys.argv[1:])
with open(hparams_file) as fin:
    hparams = load_hyperpyyaml(fin, overrides)
This enables systematic experiment management:
python train.py hparams/train.yaml --lr=0.001 --number_of_epochs=50
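To show how flags like these can become the YAML-formatted override string, here is a simplified sketch. `flags_to_overrides` is a hypothetical helper written for this illustration; SpeechBrain's own argument parsing handles more cases than this (for example, its run options), so treat it as an approximation of the idea only.

```python
def flags_to_overrides(argv):
    """Turn ['--lr=0.001', ...] into newline-separated `key: value` lines."""
    lines = []
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --key=value, got {arg!r}")
        key, value = arg[2:].split("=", 1)
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


overrides = flags_to_overrides(["--lr=0.001", "--number_of_epochs=50"])
print(overrides)
# lr: 0.001
# number_of_epochs: 50
```

A string of this shape is what gets passed as the second argument when loading the YAML file, where it replaces the values of the matching keys.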
Design Benefits
- Reproducibility -- the YAML file captures the complete experiment specification; saving it alongside the results makes the experiment straightforward to recreate
- Readability -- all hyperparameters and component definitions are in one human-readable file
- Composability -- components can reference each other via !ref, making it easy to ensure consistency (e.g., the same learning rate value is used for both the scheduler and optimizer)
- Separation of concerns -- the training script contains only the training logic, while all experiment-specific configuration lives in YAML
- Rapid experimentation -- switching between different model architectures, optimizers, or data augmentation strategies requires only changing the YAML file
Typical Usage Pattern in CTC ASR
In the CTC ASR training workflow, the YAML file defines:
- Data paths and language settings
- Training hyperparameters (learning rate, epochs, batch size)
- Model architecture (wav2vec2 encoder, DNN layers, CTC linear output)
- Optimizers and learning rate schedulers
- Data augmentation pipeline
- Checkpointing configuration
- Metric computers (WER, CER)
The training script loads this YAML, and the returned hparams dictionary contains fully instantiated Python objects ready for use.
Related Concepts
- Implementation:Speechbrain_Speechbrain_Load_Hyperpyyaml -- the concrete function for loading HyperPyYAML files
- The !new: tag is central to SpeechBrain's approach of defining model architectures declaratively
- Command-line overrides integrate with speechbrain.parse_arguments() for experiment management