
Implementation:NVIDIA NeMo Aligner Build RM Datasets

From Leeroopedia


Implementation Details
Name Build_RM_Datasets
Type API Doc
Implements Reward_Model_Data_Preparation
Repository NeMo Aligner
Primary File nemo_aligner/data/nlp/builders.py
Domains NLP, Data_Engineering
Last Updated 2026-02-07 00:00 GMT

Overview

A concrete tool from the NeMo Aligner data builders module for constructing reward model comparison datasets from preference-pair JSONL files.

Description

The build_train_valid_test_rm_datasets function is a partial application of the generic build_train_valid_test_datasets factory, specialized to create RewardModelDataset instances. The RewardModelDataset class (datasets.py:L126-298) handles tokenization of chosen/rejected response pairs, padding to equal lengths, and construction of comparison tensors. It supports both plain text and conversation-format inputs.
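The pairing-and-padding step described above can be sketched as follows. This is a minimal illustration, not the NeMo Aligner implementation: the JSONL field names (`prompt`, `chosen`, `rejected`) and the toy tokenizer are assumptions standing in for the configured data schema and model tokenizer.

```python
import json

# Hypothetical preference record; the actual field names depend on the
# configured input format (plain text or conversation format).
record = json.loads(
    '{"prompt": "Explain GC.", '
    '"chosen": "Garbage collection frees unused memory.", '
    '"rejected": "idk"}'
)

def toy_tokenize(text):
    # Stand-in for the model tokenizer: one arbitrary id per whitespace token.
    return [hash(tok) % 1000 for tok in text.split()]

pad_id = 0
chosen = toy_tokenize(record["prompt"] + " " + record["chosen"])
rejected = toy_tokenize(record["prompt"] + " " + record["rejected"])

# Pad both sequences to equal length so the pair can be stacked into
# one comparison tensor, mirroring the documented __getitem__ output.
max_len = max(len(chosen), len(rejected))
sample = {
    "chosen": chosen + [pad_id] * (max_len - len(chosen)),
    "rejected": rejected + [pad_id] * (max_len - len(rejected)),
    "chosen_length": len(chosen),
    "rejected_length": len(rejected),
}
```

The true lengths are carried alongside the padded ids so the reward head can read its score at the last real token rather than at a pad position.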

Usage

Import this function when setting up reward model training. It is called after model loading to create the train/validation/test datasets consumed by the Bradley-Terry ranking objective.
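For context, the Bradley-Terry objective trains the reward model to score the chosen response above the rejected one: the per-pair loss is the negative log-sigmoid of the reward margin. The sketch below uses plain Python floats for illustration; it is not the NeMo Aligner loss code.

```python
import math

def bradley_terry_loss(reward_chosen: float, reward_rejected: float) -> float:
    # -log sigmoid(r_chosen - r_rejected): near zero when the chosen
    # response already receives a clearly higher reward, large when
    # the model prefers the rejected response.
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

low = bradley_terry_loss(2.0, -1.0)   # correct ordering, wide margin
high = bradley_terry_loss(-1.0, 2.0)  # inverted ordering, penalized
```

At a zero margin the loss is exactly log 2, so any batch of tied pairs contributes a constant gradient pressure toward separating the two responses.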

Code Reference

Source Location

  • Repository: NeMo Aligner
  • File: nemo_aligner/data/nlp/builders.py (L393 partial), nemo_aligner/data/nlp/datasets.py (L126-298 RewardModelDataset)

Signature

# Partial application in builders.py
build_train_valid_test_rm_datasets = partial(build_train_valid_test_datasets, RewardModelDataset)

# Underlying function signature:
def build_train_valid_test_datasets(
    cls,               # RewardModelDataset
    cfg: DictConfig,
    data_prefix,
    data_impl: str,
    splits_string: str,
    train_valid_test_num_samples: list,
    seq_length: int,
    seed: int,
    tokenizer,
) -> Tuple[Dataset, Dataset, Dataset]:

# RewardModelDataset class:
class RewardModelDataset(Dataset):
    def __init__(self, cfg, tokenizer, name, data_prefix, documents, data, seq_length, seed):
        ...
    def __getitem__(self, idx) -> dict:
        # Returns: {"chosen": Tensor, "rejected": Tensor, "chosen_length": int, "rejected_length": int}

Import

from nemo_aligner.data.nlp.builders import build_train_valid_test_rm_datasets

I/O Contract

Inputs

Name | Type | Required | Description
cfg | DictConfig | Yes | Data configuration with file paths and formats
data_prefix | str | Yes | Path to the JSONL data files
data_impl | str | Yes | Data format (jsonl)
splits_string | str | Yes | Train/val/test split ratios
train_valid_test_num_samples | list | Yes | Number of samples per split
seq_length | int | Yes | Maximum sequence length
seed | int | Yes | Random seed
tokenizer | TokenizerSpec | Yes | Model tokenizer
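The splits_string encodes relative weights rather than absolute counts. A hedged sketch of how a string like "980,10,10" could be turned into document index boundaries (the exact Megatron-style parsing in the builders may differ in rounding details):

```python
def parse_splits(splits_string: str, total_docs: int) -> list:
    # Normalize the comma-separated weights and convert them into
    # cumulative document index boundaries over the corpus.
    weights = [float(w) for w in splits_string.split(",")]
    norm = sum(weights)
    bounds, acc = [0], 0.0
    for w in weights:
        acc += w / norm
        bounds.append(int(round(acc * total_docs)))
    return bounds

# "980,10,10" over 1000 documents: 98% train, 1% valid, 1% test.
bounds = parse_splits("980,10,10", 1000)
```

Each adjacent pair of boundaries then defines the document range handed to one of the three RewardModelDataset instances.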

Outputs

Name | Type | Description
train_ds | RewardModelDataset | Training dataset
val_ds | RewardModelDataset | Validation dataset
test_ds | RewardModelDataset | Test dataset

Usage Examples

from nemo_aligner.data.nlp.builders import build_train_valid_test_rm_datasets

train_ds, val_ds, test_ds = build_train_valid_test_rm_datasets(
    cfg=cfg.model.data,
    data_prefix=cfg.model.data.data_prefix,
    data_impl="jsonl",
    splits_string="980,10,10",
    train_valid_test_num_samples=[10000, 500, 500],
    seq_length=cfg.model.data.seq_length,
    seed=cfg.model.seed,
    tokenizer=model.tokenizer,
)
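Downstream, the three datasets are typically wrapped in data loaders. The collate function below is a plain-Python sketch of how the per-sample dicts returned by RewardModelDataset.__getitem__ could be stacked into a batch; the real training loop uses PyTorch/Megatron batching, and the toy samples are illustrative, not real tokenizer output.

```python
def collate_rm_batch(samples: list) -> dict:
    # Stack per-sample dicts into one batch dict of lists; a real
    # pipeline would build padded tensors instead of Python lists.
    return {
        key: [sample[key] for sample in samples]
        for key in ("chosen", "rejected", "chosen_length", "rejected_length")
    }

# Two toy samples mimicking the documented __getitem__ output.
samples = [
    {"chosen": [5, 6, 0], "rejected": [5, 7, 0],
     "chosen_length": 2, "rejected_length": 2},
    {"chosen": [8, 9, 1], "rejected": [8, 2, 0],
     "chosen_length": 3, "rejected_length": 2},
]
batch = collate_rm_batch(samples)
```

Keeping chosen and rejected sequences in parallel lists lets the reward model score both halves of every comparison in a single forward pass.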

Related Pages

Knowledge Sources

NLP | Data_Engineering

