Implementation:NVIDIA NeMo Aligner Hydra Config Loading



Implementation Metadata
Name Hydra_Config_Loading
Type Pattern Doc
Implements Principle Hydra_Training_Configuration
Repository NeMo Aligner
Files examples/nlp/gpt/conf/gpt_sft.yaml, examples/nlp/gpt/conf/gpt_dpo.yaml
Lines Full config files
Domains Configuration_Management, MLOps
Last Updated 2026-02-07 00:00 GMT

Overview

A concrete pattern for declaratively configuring NeMo Aligner training pipelines using Hydra YAML configuration files.

Description

Every NeMo Aligner training script uses the @hydra_runner decorator to load its hierarchical YAML configuration. The configuration files define all training parameters: trainer settings (GPUs, precision, epochs), model architecture (TP/PP sizes, batch sizes), optimizer (learning rate, scheduler), data (paths, formats, sequence lengths), and algorithm-specific parameters. CLI overrides merge with the YAML defaults to produce a single resolved DictConfig.
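
The merge semantics can be sketched with OmegaConf, the library Hydra builds on. This is an illustrative subset, not the actual config loading path; the keys and values are taken from the defaults shown under Key Config Sections below.

from omegaconf import OmegaConf

# Illustrative subset of the YAML defaults (cf. Key Config Sections below)
yaml_cfg = OmegaConf.create({
    "trainer": {"devices": 8, "precision": "bf16"},
    "model": {"optim": {"lr": 1e-5}},
})

# CLI overrides arrive as dotted key=value strings
cli_cfg = OmegaConf.from_dotlist(["trainer.devices=4", "model.optim.lr=0.000005"])

# Later arguments win, so overrides replace YAML defaults;
# untouched keys keep their default values
cfg = OmegaConf.merge(yaml_cfg, cli_cfg)
print(cfg.trainer.devices)    # 4
print(cfg.model.optim.lr)     # 5e-06
print(cfg.trainer.precision)  # bf16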

Usage

Use this pattern in every training entry point. Define a YAML config file in examples/nlp/gpt/conf/ and decorate the main function with @hydra_runner. Override parameters at launch time via CLI.

Code Reference

Source Location

  • Repository: NeMo Aligner
  • File: examples/nlp/gpt/conf/gpt_sft.yaml (SFT config example)
  • File: examples/nlp/gpt/conf/gpt_dpo.yaml (DPO config example)
  • Lines: Full config files

Interface

from nemo.core.config import hydra_runner
from omegaconf import DictConfig

@hydra_runner(config_path="conf", config_name="gpt_sft")
def main(cfg: DictConfig) -> None:
    # cfg is a resolved DictConfig with all parameters
    # Access via cfg.trainer.max_steps, cfg.model.optim.lr, etc.
    ...

Key Config Sections

trainer:
  num_nodes: 1
  devices: 8
  precision: bf16
  sft:  # algorithm-specific block; named dpo or ppo in those configs
    max_epochs: 1
    max_steps: -1
    val_check_interval: 100
    save_interval: 100
    gradient_clip_val: 1.0

model:
  micro_batch_size: 1
  global_batch_size: 64
  restore_from_path: ???  # Required: path to pretrained .nemo
  data:
    data_prefix: /path/to/data.jsonl  # DPO-style key; the SFT config nests paths under train_ds instead (see Usage Examples)
    seq_length: 4096
  optim:
    name: fused_adam
    lr: 1e-5
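
The ??? marker above is OmegaConf's mandatory-value sentinel: the key exists in the schema but must be supplied in the YAML file or via a CLI override before it is read. A minimal sketch of the failure mode, using OmegaConf directly:

from omegaconf import OmegaConf
from omegaconf.errors import MissingMandatoryValue

cfg = OmegaConf.create({"model": {"restore_from_path": "???"}})

try:
    _ = cfg.model.restore_from_path  # read before the value is provided
except MissingMandatoryValue:
    print("model.restore_from_path must be set in YAML or via a CLI override")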

I/O Contract

Inputs

  • config_path (str, required): Relative path to the directory containing YAML config files (e.g., "conf")
  • config_name (str, required): Name of the YAML config file without extension (e.g., "gpt_sft")
  • CLI overrides (str, optional): Command-line key=value overrides that merge with YAML defaults

Outputs

  • cfg (DictConfig): Fully resolved configuration object with all parameters accessible via dot notation

Usage Examples

Launching SFT Training with CLI Overrides

python examples/nlp/gpt/train_gpt_sft.py \
    model.restore_from_path=/models/gpt.nemo \
    model.data.train_ds.file_path=/data/train.jsonl \
    model.optim.lr=5e-6 \
    trainer.sft.max_steps=1000
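
Inside the decorated main function, it is common practice in NeMo example scripts to log the fully resolved configuration at startup, so each run records the exact merge of YAML defaults and CLI overrides. A minimal sketch, reusing the entry point from the Interface section:

from nemo.core.config import hydra_runner
from omegaconf import OmegaConf

@hydra_runner(config_path="conf", config_name="gpt_sft")
def main(cfg) -> None:
    # Dump the resolved config (YAML defaults merged with CLI overrides)
    # so the hyperparameters of this run are captured in the logs
    print(OmegaConf.to_yaml(cfg))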

Related Pages

Knowledge Sources

Configuration_Management | MLOps

