Implementation:NVIDIA NeMo Aligner Build RM Datasets
| Implementation Details | |
|---|---|
| Name | Build_RM_Datasets |
| Type | API Doc |
| Implements | Reward_Model_Data_Preparation |
| Repository | NeMo Aligner |
| Primary File | nemo_aligner/data/nlp/builders.py |
| Domains | NLP, Data_Engineering |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
A concrete tool, provided by the NeMo Aligner data builders module, for constructing reward model comparison datasets from preference-formatted JSONL files.
Description
The build_train_valid_test_rm_datasets function is a partial application of the generic build_train_valid_test_datasets factory, specialized to create RewardModelDataset instances. The RewardModelDataset class (datasets.py:L126-298) handles tokenization of chosen/rejected response pairs, padding to equal lengths, and construction of comparison tensors. It supports both plain text and conversation-format inputs.
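For illustration, a preference record in such a JSONL file pairs one prompt with a preferred and a rejected response. The exact field names accepted by RewardModelDataset depend on the configured input format; the keys below (`prompt`, `chosen`, `rejected`) are illustrative assumptions, not the authoritative NeMo Aligner schema.

```python
import json

# Hypothetical preference record; field names are illustrative only and
# may differ from the schema RewardModelDataset actually expects.
record = {
    "prompt": "Explain gradient descent in one sentence.",
    "chosen": "Gradient descent iteratively moves parameters against the gradient to reduce loss.",
    "rejected": "Gradient descent is a kind of database index.",
}

line = json.dumps(record)       # one record per line in the JSONL file
parsed = json.loads(line)
print(parsed["chosen"] != parsed["rejected"])  # True: pairs must differ
```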
Usage
Import when setting up reward model training. Called after model loading to create train/validation/test datasets for the Bradley-Terry ranking objective.
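The Bradley-Terry ranking objective mentioned above trains the reward model so that the chosen response scores higher than the rejected one; for a single comparison the loss is the negative log-sigmoid of the reward margin. A minimal sketch in plain Python (not the NeMo Aligner implementation):

```python
import math

def bradley_terry_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-sigmoid of the reward margin for one comparison pair."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A larger margin in favor of the chosen response yields a smaller loss.
print(bradley_terry_loss(2.0, 0.0) < bradley_terry_loss(0.5, 0.0))  # True
```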
Code Reference
Source Location
- Repository: NeMo Aligner
- File: nemo_aligner/data/nlp/builders.py (L393, partial), nemo_aligner/data/nlp/datasets.py (L126-298, RewardModelDataset)
Signature
# Partial application in builders.py
build_train_valid_test_rm_datasets = partial(build_train_valid_test_datasets, RewardModelDataset)
# Underlying function signature:
def build_train_valid_test_datasets(
    cls,  # RewardModelDataset
    cfg: DictConfig,
    data_prefix,
    data_impl: str,
    splits_string: str,
    train_valid_test_num_samples: list,
    seq_length: int,
    seed: int,
    tokenizer,
) -> Tuple[Dataset, Dataset, Dataset]:
    ...
# RewardModelDataset class:
class RewardModelDataset(Dataset):
    def __init__(self, cfg, tokenizer, name, data_prefix, documents, data, seq_length, seed):
        ...
    def __getitem__(self, idx) -> dict:
        # Returns: {"chosen": Tensor, "rejected": Tensor, "chosen_length": int, "rejected_length": int}
        ...
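As the Description notes, chosen/rejected pairs are padded to equal length before the comparison tensors are built. A rough sketch of that padding step, assuming right-padding with a pad token id (the actual logic lives in datasets.py and may differ):

```python
def pad_pair(chosen_ids, rejected_ids, pad_id=0):
    """Right-pad two token-id lists to a common length (illustrative only)."""
    target = max(len(chosen_ids), len(rejected_ids))
    pad = lambda ids: ids + [pad_id] * (target - len(ids))
    # Original lengths are kept so the model can ignore padding positions.
    return pad(chosen_ids), pad(rejected_ids), len(chosen_ids), len(rejected_ids)

c, r, c_len, r_len = pad_pair([5, 6, 7], [5, 8])
print(c, r, c_len, r_len)  # [5, 6, 7] [5, 8, 0] 3 2
```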
Import
from nemo_aligner.data.nlp.builders import build_train_valid_test_rm_datasets
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cfg | DictConfig | Yes | Data configuration with file paths and formats |
| data_prefix | str | Yes | Path to JSONL data files |
| data_impl | str | Yes | Data format (jsonl) |
| splits_string | str | Yes | Train/validation/test split ratios |
| train_valid_test_num_samples | list | Yes | Number of samples per split |
| seq_length | int | Yes | Maximum sequence length |
| seed | int | Yes | Random seed |
| tokenizer | TokenizerSpec | Yes | Model tokenizer |
Outputs
| Name | Type | Description |
|---|---|---|
| train_ds | RewardModelDataset | Training dataset |
| val_ds | RewardModelDataset | Validation dataset |
| test_ds | RewardModelDataset | Test dataset |
Usage Examples
from nemo_aligner.data.nlp.builders import build_train_valid_test_rm_datasets
train_ds, val_ds, test_ds = build_train_valid_test_rm_datasets(
    cfg=cfg.model.data,
    data_prefix=cfg.model.data.data_prefix,
    data_impl="jsonl",
    splits_string="980,10,10",
    train_valid_test_num_samples=[10000, 500, 500],
    seq_length=cfg.model.data.seq_length,
    seed=cfg.model.seed,
    tokenizer=model.tokenizer,
)
Related Pages
- Principle:NVIDIA_NeMo_Aligner_Reward_Model_Data_Preparation
- Environment:NVIDIA_NeMo_Aligner_NeMo_Framework_GPU_Environment