
Implementation:ContextualAI HALOs Dataset Loaders

From Leeroopedia


Knowledge Sources
Domains: Data_Engineering, NLP
Last Updated: 2026-02-08 03:00 GMT

Overview

Concrete tool for loading and normalizing preference, binary, and SFT datasets provided by the HALOs data module.

Description

The train/data.py module defines the Example dataclass and Dataset collection class, along with 12+ get_{name} loader functions that parse datasets from HuggingFace Hub or local JSON files into the common Example schema. Supported datasets include SHP, Anthropic HH, UltraFeedback, OASST, UltraBin, AlpacaEval, SafeRLHF, WildBench, and s1K. Additional loaders handle sampled data (get_sampled_data) and labeled feedback (get_feedback) produced by the online alignment loop.

Usage

Import and call get_{name}(split) to load any supported dataset. The DataLoader classes in train/dataloader.py call these functions internally when initialized with dataset names.
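
The dynamic dispatch convention can be sketched as follows. This is a minimal, self-contained illustration, not HALOs code: `data_module` here is a stand-in namespace with two dummy loaders, where the real module is `train.data` and each `get_{name}` function returns a `Dataset`.

```python
from types import SimpleNamespace

# Stand-in for train.data: each get_{name} attribute plays the role of a
# loader function. The real loaders return Dataset objects.
data_module = SimpleNamespace(
    get_shp=lambda split: f"shp/{split}",
    get_hh=lambda split: f"hh/{split}",
)

def load_dataset(name: str, split: str):
    """Resolve a loader by name, mirroring the get_{name} convention."""
    loader = getattr(data_module, f"get_{name}", None)
    if loader is None:
        raise ValueError(f"no loader get_{name} found for dataset '{name}'")
    return loader(split)

print(load_dataset("shp", "train"))  # shp/train
```

Unknown dataset names fail fast with a `ValueError` rather than an opaque `AttributeError`, which is useful when dataset names come from a config file.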

Code Reference

Source Location

Signature

from dataclasses import dataclass, field
from collections import defaultdict
from typing import List, Tuple

@dataclass
class Example:
    prompt: List = field(default_factory=list)
    prompt_id: int = -1
    generations: List = field(default_factory=list)
    sft_index: int = -1
    scores: List[float] = field(default_factory=list)
    pairs: List[Tuple[int, int]] = field(default_factory=list)
    desirable: List[bool] = field(default_factory=list)
    dataset_name: str = ''
    original_prompt: str = ''

class Dataset:
    def __init__(self, name: str):
        self.name = name
        self.data = defaultdict(Example)

# Representative loader function signature:
def get_shp(split: str, human_prefix: str = 'user', human_suffix: str = '',
            assistant_prefix: str = 'assistant', assistant_suffix: str = '') -> Dataset:
    """Load the Stanford Human Preferences dataset."""
    ...

def get_sampled_data(split: str, ...) -> Dataset:
    """Load data that was sampled using train.sample."""
    ...

def get_feedback(split: str, ...) -> Dataset:
    """Load labeled feedback data (pairwise or binary) from a JSON file."""
    ...

Import

from train.data import Example, Dataset
from train import data as data_module
# Dynamic dispatch: getattr(data_module, f'get_{name}')(split)

I/O Contract

Inputs

Name | Type | Required | Description
split | str | Yes | One of 'train' or 'test'
dataset name | str | Yes | Resolved via get_{name} dispatch (e.g., 'shp', 'hh', 'ultrabin', 'alpacaeval')
JSON file path | str | No | For local data: the path is used as the dataset name and parsed by get_feedback or get_sampled_data

Outputs

Name | Type | Description
dataset | Dataset | Collection of Example objects indexed by prompt hash
Example.prompt | List[Dict] | Multi-turn conversation with 'role' and 'content' keys
Example.generations | List[List[Dict]] | Candidate responses as lists of turns
Example.pairs | List[Tuple[int, int]] | Preference pairs (preferred_idx, dispreferred_idx)
Example.desirable | List[bool] | Binary labels per generation
Example.scores | List[float] | Scalar scores per generation
Example.sft_index | int | Index of the SFT target generation
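
To make the preference fields concrete, the sketch below derives `pairs` and `desirable` values from a hypothetical list of per-generation scores. The scores and the 0.5 threshold are invented for illustration; HALOs loaders obtain these fields from dataset annotations, not from this rule.

```python
# Hypothetical per-generation scores for one prompt.
scores = [0.9, 0.2, 0.6]

# A pair (i, j) means generation i is preferred over generation j,
# here derived by comparing scores.
pairs = [(i, j)
         for i in range(len(scores))
         for j in range(len(scores))
         if i != j and scores[i] > scores[j]]

# Binary labels: a generation counts as "desirable" above a threshold.
desirable = [s >= 0.5 for s in scores]

print(pairs)      # [(0, 1), (0, 2), (2, 1)]
print(desirable)  # [True, False, True]
```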

Usage Examples

Loading a HuggingFace Dataset

from train.data import Dataset
from train import data as data_module

# Load Stanford Human Preferences for training
dataset = data_module.get_shp('train')

# Access an example by its prompt hash
for prompt_id, example in dataset.data.items():
    print(f"Prompt: {example.prompt[0]['content'][:100]}...")
    print(f"Num generations: {example.num_generations()}")
    print(f"Preference pairs: {example.pairs}")
    break

Loading Online Feedback Data

from train import data as data_module

# Load pairwise feedback from the online alignment loop
dataset = data_module.get_feedback('train', path='feedback_round1.json')

for prompt_id, example in dataset.data.items():
    print(f"Pairs: {example.pairs}")
    print(f"Desirable: {example.desirable}")
    break

Related Pages

Implements Principle

Requires Environment
