Implementation:NVIDIA NeMo Aligner Hydra Config Loading
| Implementation Metadata | |
|---|---|
| Name | Hydra_Config_Loading |
| Type | Pattern Doc |
| Implements Principle | Hydra_Training_Configuration |
| Repository | NeMo Aligner |
| Files | examples/nlp/gpt/conf/gpt_sft.yaml, examples/nlp/gpt/conf/gpt_dpo.yaml |
| Lines | Full config files |
| Domains | Configuration_Management, MLOps |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
Concrete pattern for declaratively configuring NeMo Aligner training pipelines using Hydra YAML configuration files.
Description
Every NeMo Aligner training script uses the @hydra_runner decorator to load hierarchical YAML configuration. The configuration files define all training parameters: trainer settings (GPUs, precision, epochs), model architecture (TP/PP sizes, batch sizes), optimizer (learning rate, scheduler), data (paths, formats, sequence lengths), and algorithm-specific parameters. CLI overrides merge with YAML defaults to produce a resolved DictConfig.
Usage
Use this pattern in every training entry point. Define a YAML config file in examples/nlp/gpt/conf/ and decorate the main function with @hydra_runner. Override parameters at launch time via CLI.
Code Reference
Source Location
- Repository: NeMo Aligner
- File: examples/nlp/gpt/conf/gpt_sft.yaml (SFT config example)
- File: examples/nlp/gpt/conf/gpt_dpo.yaml (DPO config example)
- Lines: Full config files
Interface
from nemo.core.config import hydra_runner

@hydra_runner(config_path="conf", config_name="gpt_sft")
def main(cfg) -> None:
    # cfg is a resolved DictConfig with all parameters
    # Access via cfg.trainer.max_steps, cfg.model.optim.lr, etc.
    ...
Key Config Sections
trainer:
  num_nodes: 1
  devices: 8
  precision: bf16
  sft/dpo/ppo:  # section name matches the algorithm
    max_epochs: 1
    max_steps: -1
    val_check_interval: 100
    save_interval: 100
    gradient_clip_val: 1.0

model:
  micro_batch_size: 1
  global_batch_size: 64
  restore_from_path: ???  # Required: path to pretrained .nemo
  data:
    data_prefix: /path/to/data.jsonl
    seq_length: 4096
  optim:
    name: fused_adam
    lr: 1e-5
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| config_path | str | Yes | Relative path to the directory containing YAML config files (e.g., "conf") |
| config_name | str | Yes | Name of the YAML config file without extension (e.g., "gpt_sft") |
| CLI overrides | str | No | Command-line key=value overrides that merge with YAML defaults |
Outputs
| Name | Type | Description |
|---|---|---|
| cfg | DictConfig | Fully resolved configuration object with all parameters accessible via dot notation |
Usage Examples
Launching SFT Training with CLI Overrides
python examples/nlp/gpt/train_gpt_sft.py \
model.restore_from_path=/models/gpt.nemo \
model.data.train_ds.file_path=/data/train.jsonl \
model.optim.lr=5e-6 \
trainer.sft.max_steps=1000
Related Pages
- Principle:NVIDIA_NeMo_Aligner_Hydra_Training_Configuration
- Environment:NVIDIA_NeMo_Aligner_NeMo_Framework_GPU_Environment