Implementation:NVIDIA NeMo Aligner Reward Model Class Registry
| Implementation Details | |
|---|---|
| Name | Reward_Model_Class_Registry |
| Type | API Doc |
| Implements | Reward_Model_Architecture_Selection |
| Repository | NeMo Aligner |
| Primary File | nemo_aligner/models/nlp/gpt/reward_model_classes.py |
| Domains | NLP, Model_Architecture |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
Concrete tool for selecting reward model architecture types via an enumeration registry provided by the NeMo Aligner models module.
Description
The RewardModelType enum and REWARD_MODEL_CLASS_DICT dictionary provide a simple registry pattern for selecting between binary ranking and regression reward model architectures. The enum defines the valid type strings, and the dict maps each type to its corresponding model class (MegatronGPTRewardModel or MegatronGPTRegressionRewardModel). The training script uses this registry to instantiate the correct model class based on configuration.
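The registry pattern described above can be sketched in isolation. The stand-in classes below are placeholders for illustration; in NeMo Aligner the dictionary maps to MegatronGPTRewardModel and MegatronGPTRegressionRewardModel.

```python
from enum import Enum

# Placeholder classes standing in for the real Megatron reward model classes.
class BinaryRankingRM: ...
class RegressionRM: ...

class RewardModelType(Enum):
    BINARY_RANKING = "binary_ranking"
    REGRESSION = "regression"

# Registry: each enum member maps to its model class.
REWARD_MODEL_CLASS_DICT = {
    RewardModelType.BINARY_RANKING: BinaryRankingRM,
    RewardModelType.REGRESSION: RegressionRM,
}

# Resolve a configuration string to a class, as the training script does.
model_cls = REWARD_MODEL_CLASS_DICT[RewardModelType("regression")]
print(model_cls.__name__)  # RegressionRM
```

Because the enum value is the lookup key, any string accepted by RewardModelType(...) is guaranteed to have a registered class.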
Usage
Imported by reward model training scripts to resolve the cfg.model.reward_model_type configuration string to the actual model class. The training script also selects the matching dataset builder based on the resolved model type.
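The same enum can key a second dictionary for dataset builders, which is one way the training script's "matching dataset builder" selection could look. The builder names below are hypothetical placeholders, not NeMo Aligner's actual builder functions.

```python
from enum import Enum

class RewardModelType(Enum):
    BINARY_RANKING = "binary_ranking"
    REGRESSION = "regression"

# Hypothetical builders for illustration only; the real dataset-builder
# functions live in the NeMo Aligner training script.
def build_ranking_dataset(path):
    return ("ranking", path)

def build_regression_dataset(path):
    return ("regression", path)

# A parallel registry keyed by the same enum keeps model and dataset
# selection consistent with a single config string.
DATASET_BUILDER_DICT = {
    RewardModelType.BINARY_RANKING: build_ranking_dataset,
    RewardModelType.REGRESSION: build_regression_dataset,
}

builder = DATASET_BUILDER_DICT[RewardModelType("binary_ranking")]
print(builder("train.jsonl"))  # ('ranking', 'train.jsonl')
```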
Code Reference
Source Location
- Repository: NeMo Aligner
- File: nemo_aligner/models/nlp/gpt/reward_model_classes.py
- Lines: 1-31 (full file)
Signature
class RewardModelType(enum.Enum):
    BINARY_RANKING = "binary_ranking"
    REGRESSION = "regression"

REWARD_MODEL_CLASS_DICT = {
    RewardModelType.BINARY_RANKING: MegatronGPTRewardModel,
    RewardModelType.REGRESSION: MegatronGPTRegressionRewardModel,
}
Import
from nemo_aligner.models.nlp.gpt.reward_model_classes import (
    RewardModelType,
    REWARD_MODEL_CLASS_DICT,
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| reward_model_type | str | Yes | Type string: "binary_ranking" or "regression" |
Outputs
| Name | Type | Description |
|---|---|---|
| model_class | Type[Model] | MegatronGPTRewardModel or MegatronGPTRegressionRewardModel |
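A note on the input contract: because the type string is resolved through the enum constructor, any value outside the two registered strings fails fast with a ValueError rather than silently falling through. A minimal standalone sketch:

```python
from enum import Enum

class RewardModelType(Enum):
    BINARY_RANKING = "binary_ranking"
    REGRESSION = "regression"

try:
    # "pairwise" is not a registered reward model type.
    RewardModelType("pairwise")
except ValueError as err:
    print(err)  # 'pairwise' is not a valid RewardModelType
```

This makes the enum itself the validation layer for cfg.model.reward_model_type.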
Usage Examples
from nemo_aligner.models.nlp.gpt.reward_model_classes import (
    RewardModelType,
    REWARD_MODEL_CLASS_DICT,
)
# Resolve model type from config
reward_model_type = RewardModelType(cfg.model.reward_model_type)
model_cls = REWARD_MODEL_CLASS_DICT[reward_model_type]
# Instantiate the selected reward model
model = load_from_nemo(model_cls, model_cfg, trainer, restore_path=restore_path)
Related Pages
- Principle:NVIDIA_NeMo_Aligner_Reward_Model_Architecture_Selection
- Environment:NVIDIA_NeMo_Aligner_NeMo_Framework_GPU_Environment