Implementation:NVIDIA NeMo Aligner Reward Model Class Registry
| Implementation Details | |
|---|---|
| Name | Reward_Model_Class_Registry |
| Type | API Doc |
| Implements | Reward_Model_Architecture_Selection |
| Repository | NeMo Aligner |
| Primary File | nemo_aligner/models/nlp/gpt/reward_model_classes.py |
| Domains | NLP, Model_Architecture |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
Concrete tool for selecting reward model architecture types via an enumeration registry provided by the NeMo Aligner models module.
Description
The RewardModelType enum and REWARD_MODEL_CLASS_DICT dictionary provide a simple registry pattern for selecting between binary ranking and regression reward model architectures. The enum defines the valid type strings, and the dict maps each type to its corresponding model class (MegatronGPTRewardModel or MegatronGPTRegressionRewardModel). The training script uses this registry to instantiate the correct model class based on configuration.
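The registry pattern described above can be sketched in isolation. The stand-in classes below are placeholders for illustration; in NeMo Aligner the dictionary maps to MegatronGPTRewardModel and MegatronGPTRegressionRewardModel.

```python
from enum import Enum

# Placeholder classes standing in for the real Megatron reward model classes.
class BinaryRankingRM: ...
class RegressionRM: ...

class RewardModelType(Enum):
    BINARY_RANKING = "binary_ranking"
    REGRESSION = "regression"

# Registry: each enum member maps to its model class.
REWARD_MODEL_CLASS_DICT = {
    RewardModelType.BINARY_RANKING: BinaryRankingRM,
    RewardModelType.REGRESSION: RegressionRM,
}

# Resolve a configuration string to a class, as the training script does.
model_cls = REWARD_MODEL_CLASS_DICT[RewardModelType("regression")]
print(model_cls.__name__)  # RegressionRM
```

Because the enum value is the lookup key, any string accepted by RewardModelType(...) is guaranteed to have a registered class.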
Usage
Imported by reward model training scripts to resolve the cfg.model.reward_model_type configuration string to the actual model class. The training script also selects the matching dataset builder based on the resolved model type.
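The same enum can key a second dictionary for dataset builders, which is one way the training script's "matching dataset builder" selection could look. The builder names below are hypothetical placeholders, not NeMo Aligner's actual builder functions.

```python
from enum import Enum

class RewardModelType(Enum):
    BINARY_RANKING = "binary_ranking"
    REGRESSION = "regression"

# Hypothetical builders for illustration only; the real dataset-builder
# functions live in the NeMo Aligner training script.
def build_ranking_dataset(path):
    return ("ranking", path)

def build_regression_dataset(path):
    return ("regression", path)

# A parallel registry keyed by the same enum keeps model and dataset
# selection consistent with a single config string.
DATASET_BUILDER_DICT = {
    RewardModelType.BINARY_RANKING: build_ranking_dataset,
    RewardModelType.REGRESSION: build_regression_dataset,
}

builder = DATASET_BUILDER_DICT[RewardModelType("binary_ranking")]
print(builder("train.jsonl"))  # ('ranking', 'train.jsonl')
```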
Code Reference
Source Location
- Repository: NeMo Aligner
- File: nemo_aligner/models/nlp/gpt/reward_model_classes.py
- Lines: 1-31 (full file)
Signature
class RewardModelType(enum.Enum):
    BINARY_RANKING = "binary_ranking"
    REGRESSION = "regression"

REWARD_MODEL_CLASS_DICT = {
    RewardModelType.BINARY_RANKING: MegatronGPTRewardModel,
    RewardModelType.REGRESSION: MegatronGPTRegressionRewardModel,
}
Import
from nemo_aligner.models.nlp.gpt.reward_model_classes import (
    RewardModelType,
    REWARD_MODEL_CLASS_DICT,
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| reward_model_type | str | Yes | Type string: "binary_ranking" or "regression" |
Outputs
| Name | Type | Description |
|---|---|---|
| model_class | Type[Model] | MegatronGPTRewardModel or MegatronGPTRegressionRewardModel |
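A note on the input contract: because the type string is resolved through the enum constructor, any value outside the two registered strings fails fast with a ValueError rather than silently falling through. A minimal standalone sketch:

```python
from enum import Enum

class RewardModelType(Enum):
    BINARY_RANKING = "binary_ranking"
    REGRESSION = "regression"

try:
    # "pairwise" is not a registered reward model type.
    RewardModelType("pairwise")
except ValueError as err:
    print(err)  # 'pairwise' is not a valid RewardModelType
```

This makes the enum itself the validation layer for cfg.model.reward_model_type.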
Usage Examples
from nemo_aligner.models.nlp.gpt.reward_model_classes import (
    RewardModelType,
    REWARD_MODEL_CLASS_DICT,
)
# Resolve model type from config
reward_model_type = RewardModelType(cfg.model.reward_model_type)
model_cls = REWARD_MODEL_CLASS_DICT[reward_model_type]
# Instantiate the selected reward model
model = load_from_nemo(model_cls, model_cfg, trainer, restore_path=restore_path)
Related Pages
- Principle:NVIDIA_NeMo_Aligner_Reward_Model_Architecture_Selection
- Environment:NVIDIA_NeMo_Aligner_NeMo_Framework_GPU_Environment