Principle: ContextualAI HALOs Data Preparation
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, NLP |
| Last Updated | 2026-02-08 03:00 GMT |
Overview
A unified data abstraction that normalizes heterogeneous preference, binary feedback, and SFT datasets into a common schema for alignment training.
Description
LLM alignment requires datasets in several feedback formats: paired preferences (response A is better than response B), binary feedback (this response is desirable/undesirable), and SFT targets (generate this specific response). Different public datasets provide feedback in different formats and structures.
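To make this heterogeneity concrete, here is a sketch of what raw records in the three feedback formats might look like before normalization. The field names (`chosen`, `rejected`, `label`, `target`) are illustrative, not tied to any specific public dataset:

```python
# 1. Paired preference: two responses to one prompt, one marked better.
preference_record = {
    "prompt": "Explain entropy.",
    "chosen": "Entropy measures disorder in a system ...",
    "rejected": "Entropy is a city in ...",
}

# 2. Binary feedback: a single response with a desirable/undesirable label.
binary_record = {
    "prompt": "Explain entropy.",
    "response": "Entropy measures disorder in a system ...",
    "label": True,  # desirable
}

# 3. SFT target: a single gold response the model should imitate.
sft_record = {
    "prompt": "Explain entropy.",
    "target": "Entropy measures disorder in a system ...",
}
```

A loader's job is to map each of these shapes onto the same Example schema described below.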
The data preparation principle addresses this heterogeneity by defining a universal Example dataclass that captures all feedback modalities in a single schema. Each Example contains a multi-turn prompt, a list of candidate generations, paired preference indices, binary desirability labels, scalar scores, and an SFT target index. Dataset-specific loader functions (get_{name}) parse raw data from HuggingFace or local JSON files into this common format.
This normalization enables downstream components (DataLoaders, Trainers) to operate on a single interface regardless of the original data source.
Usage
Use this principle whenever loading training or evaluation data for any HALOs workflow. It is the mandatory first data processing step for SFT training, preference alignment (DPO, KTO, GRPO, etc.), reward model training, and online iterative alignment.
Theoretical Basis
The core abstraction is the Example dataclass:
```
# Abstract schema (not implementation)
Example:
    prompt: List[Turn]             # Multi-turn conversation context
    generations: List[List[Turn]]  # Candidate responses
    pairs: List[Tuple[int, int]]   # (preferred_idx, dispreferred_idx) for paired feedback
    desirable: List[bool]          # Binary feedback per generation
    scores: List[float]            # Scalar scores per generation
    sft_index: int                 # Index of the SFT target in generations
    dataset_name: str              # Source dataset identifier
```
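A minimal runnable sketch of this schema as a Python dataclass. It is simplified relative to the real implementation: `Turn` is reduced to a role/content dict, and the default values are assumptions for illustration:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Turn = Dict[str, str]  # simplified stand-in for one conversation turn


@dataclass
class Example:
    prompt: List[Turn] = field(default_factory=list)
    generations: List[List[Turn]] = field(default_factory=list)
    pairs: List[Tuple[int, int]] = field(default_factory=list)
    desirable: List[bool] = field(default_factory=list)
    scores: List[float] = field(default_factory=list)
    sft_index: int = -1
    dataset_name: str = ""


# One prompt with two candidate generations and paired preference feedback;
# the preferred generation doubles as the SFT target.
ex = Example(
    prompt=[{"role": "user", "content": "Explain entropy."}],
    generations=[
        [{"role": "assistant", "content": "Entropy measures disorder ..."}],
        [{"role": "assistant", "content": "Entropy is a city ..."}],
    ],
    pairs=[(0, 1)],  # generation 0 preferred over generation 1
    sft_index=0,
    dataset_name="example_prefs",
)
```

Note that the unused fields (`desirable`, `scores`) simply stay empty: the single schema carries whichever feedback signals the source dataset actually provides.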
A Dataset is a collection of Examples indexed by prompt hash, ensuring each unique prompt maps to exactly one Example (aggregating all its generations and feedback signals).
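One way to realize prompt-hash indexing is sketched below. The hashing scheme and record layout here are assumptions; the point is that repeated feedback for the same prompt collapses into a single aggregated entry:

```python
import hashlib
import json


def prompt_key(prompt):
    """Deterministic key for a multi-turn prompt (illustrative hashing)."""
    return hashlib.sha256(json.dumps(prompt, sort_keys=True).encode()).hexdigest()


dataset = {}  # prompt hash -> aggregated example record


def add_feedback(prompt, generation, **signals):
    """Merge a generation and its feedback into the unique entry for this
    prompt, creating the entry on first sight."""
    key = prompt_key(prompt)
    entry = dataset.setdefault(key, {"prompt": prompt, "generations": [], "signals": []})
    entry["generations"].append(generation)
    entry["signals"].append(signals)
    return key


# Two feedback records for the same prompt collapse into one entry:
p = [{"role": "user", "content": "Explain entropy."}]
k1 = add_feedback(p, "Entropy measures disorder ...", desirable=True)
k2 = add_feedback(p, "Entropy is a city ...", desirable=False)
```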
The loader dispatch pattern uses Python's naming convention: for any dataset name X, the system calls get_X(split) to obtain a Dataset object.
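The dispatch pattern can be sketched with a name lookup over the module's own functions. The loader `get_shp` below is a hypothetical stand-in; the convention means adding a new dataset only requires defining one `get_{name}` function:

```python
def get_shp(split):
    """Hypothetical loader for a dataset named 'shp'."""
    return {"name": "shp", "split": split, "examples": []}


def load_dataset_by_name(name, split):
    """Resolve get_{name} in the current namespace and call it."""
    loader = globals().get(f"get_{name}")
    if loader is None:
        raise ValueError(f"no loader get_{name} defined")
    return loader(split)


data = load_dataset_by_name("shp", "train")
```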