Implementation:NVIDIA NeMo Aligner Preprocess AnthropicHH Data

Knowledge Sources	NVIDIA_NeMo_Aligner
Domains	KTO, Data Preprocessing, Preference Learning
Last Updated	2026-02-08 00:00 GMT

Overview

A script that downloads the Anthropic Helpful-Harmless dataset and converts it from paired preference format into the binary feedback format required by KTO (Kahneman-Tversky Optimization) training.

Description

preprocess_anthropichh_data.py processes the Anthropic HH-RLHF dataset for KTO training. The script performs four steps:

Dataset loading: Downloads the full Anthropic/hh-rlhf dataset from Hugging Face. If the validation split is requested, the script falls back to the test split, because Anthropic HH has no validation set.
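Conversation parsing: Each conversation string is split on the \n\nHuman: and \n\nAssistant: delimiters to extract the prompt body and response. The parsed text is formatted with a simple template, Human:\n{body}\nAssistant:\n{response}, and the final response is stored in its own field rather than inside the prompt (see the output example under I/O Contract).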
Preference unpacking: Each paired comparison (chosen and rejected) is unpacked into two separate dictionaries:
- {"prompt": "...", "response": "...", "preference": "chosen"}
- {"prompt": "...", "response": "...", "preference": "rejected"}
Pairs where the chosen and rejected prompts do not match are discarded.
Output saving: Saves train.jsonl and test.jsonl to the specified output directory, with one JSON object per line.
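
The parsing and unpacking steps above can be sketched as follows. This is a minimal illustration, not the script's actual code: the helper names resolve_split, split_conversation, and unpack_pair are hypothetical, and the way earlier turns of a multi-turn conversation are folded into the prompt is an assumption. Only the delimiters, the single-turn template, the record keys, and the discard rule come from the description above.

```python
# Hypothetical helpers illustrating the described steps; names are not from the script.
HUMAN = "\n\nHuman: "
ASSISTANT = "\n\nAssistant: "


def resolve_split(split):
    """Map 'validation' to 'test', since hh-rlhf ships only train and test splits."""
    return "test" if split == "validation" else split


def split_conversation(text):
    """Return (prompt, final_response), or None if any turn is malformed."""
    turns = [t for t in text.split(HUMAN) if t]
    if not turns:
        return None
    prompt = ""
    response = None
    for index, turn in enumerate(turns):
        parts = turn.split(ASSISTANT)
        if len(parts) != 2:
            return None  # a turn without exactly one assistant reply
        body, response = parts
        prompt += f"Human:\n{body}\nAssistant:\n"
        # Assumption: earlier assistant replies are appended to the prompt,
        # while the final reply is returned separately as the response.
        if index < len(turns) - 1:
            prompt += f"{response}\n"
    return prompt, response


def unpack_pair(example):
    """Turn one chosen/rejected pair into two binary-feedback records."""
    chosen = split_conversation(example["chosen"])
    rejected = split_conversation(example["rejected"])
    if chosen is None or rejected is None:
        return []
    if chosen[0] != rejected[0]:
        return []  # prompts differ, so the pair is discarded
    prompt = chosen[0]
    return [
        {"prompt": prompt, "response": chosen[1], "preference": "chosen"},
        {"prompt": prompt, "response": rejected[1], "preference": "rejected"},
    ]


if __name__ == "__main__":
    pair = {
        "chosen": "\n\nHuman: Hi\n\nAssistant: Hello!",
        "rejected": "\n\nHuman: Hi\n\nAssistant: Go away.",
    }
    for record in unpack_pair(pair):
        print(record)
```

Returning an empty list for unusable pairs lets a caller extend a flat list of records without special-casing discarded examples.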

Note that this script uses a simpler prompt format (Human:\n{body}\nAssistant:\n{response}) than the chat-template-based format used in the CAI preprocessing scripts.

Usage

Use this script when:

You need to prepare training data for KTO alignment
You want to convert Anthropic HH paired preferences into binary feedback format
You are setting up the KTO training pipeline in NeMo Aligner

Code Reference

Source Location

Repository: NVIDIA_NeMo_Aligner
File: examples/nlp/data/kto/preprocess_anthropichh_data.py
Lines: 1-123

Signature

process_hh:

def process_hh(split):

save_dataset_for_kto:

def save_dataset_for_kto(list_of_dicts, split, save_dir):

prepare_args:

def prepare_args():

convert_list_of_dict_to_jsonl:

def convert_list_of_dict_to_jsonl(list_of_dict):

Import

from preprocess_anthropichh_data import process_hh, save_dataset_for_kto

I/O Contract

Inputs

Name	Type	Required	Description
--output-dir	`str`	No	Output directory for the generated JSONL files (default: "./")
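
A command-line interface with this single optional flag could be declared with argparse roughly as below. This is a sketch under assumptions: the function name parse_cli_args, its argv parameter, and the help text are hypothetical, and only the flag name, type, and default come from the table above.

```python
import argparse


def parse_cli_args(argv=None):
    """Parse the one optional flag described in the Inputs table."""
    parser = argparse.ArgumentParser(
        description="Convert Anthropic HH pairs to KTO binary feedback (sketch)."
    )
    parser.add_argument(
        "--output-dir",
        type=str,
        default="./",
        help="Directory that receives train.jsonl and test.jsonl.",
    )
    # argv=None makes argparse read sys.argv[1:], as a script normally would.
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_cli_args()
    print(args.output_dir)
```

argparse exposes the flag as args.output_dir, with the hyphen converted to an underscore.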

Outputs

Name	Type	Description
train.jsonl	JSONL file	Training split with unpacked binary preference labels
test.jsonl	JSONL file	Test split with unpacked binary preference labels

Each output line is a JSON object with the following structure:

{
  "prompt": "Human:\nWhat is the meaning of life?\nAssistant:\n",
  "response": "The meaning of life is a philosophical question...",
  "preference": "chosen"
}
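
To make the one-object-per-line layout concrete, the sketch below writes records in that shape and reads them back with a basic schema check. The helper names write_jsonl, read_kto_jsonl, and preference_counts are hypothetical and are not part of the script; the required keys and the two preference labels are taken from the structure shown above.

```python
import json
from collections import Counter

REQUIRED_KEYS = {"prompt", "response", "preference"}
VALID_LABELS = {"chosen", "rejected"}


def write_jsonl(records, path):
    """Write one JSON object per line; json.dumps escapes embedded newlines."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


def read_kto_jsonl(path):
    """Load records and fail loudly on a missing key or unknown label."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line_no, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            missing = REQUIRED_KEYS - record.keys()
            if missing:
                raise ValueError(f"line {line_no}: missing keys {sorted(missing)}")
            if record["preference"] not in VALID_LABELS:
                raise ValueError(f"line {line_no}: bad label {record['preference']!r}")
            records.append(record)
    return records


def preference_counts(records):
    """Count how many records carry each preference label."""
    return Counter(record["preference"] for record in records)
```

Because each pair that survives filtering contributes one chosen and one rejected record, the two counts returned by preference_counts should be equal for a file produced by the unpacking step described earlier.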

Usage Examples

# Command-line usage:
python preprocess_anthropichh_data.py --output-dir /data/kto_processed

# This produces:
#   /data/kto_processed/train.jsonl
#   /data/kto_processed/test.jsonl
