Principle:Hpcaitech ColossalAI Preference Dataloader Setup

From Leeroopedia


Knowledge Sources
Domains NLP, Data_Engineering
Last Updated 2026-02-09 00:00 GMT

Overview

A data loading pattern that creates batches of preference pairs with stateful distributed sampling for Direct Preference Optimization (DPO) training across multiple GPUs.

Description

Preference Dataloader Setup creates PyTorch DataLoaders that yield batches containing both chosen and rejected sequences. It uses a DataCollatorForPreferenceDataset to pad and collate preference pairs into uniform-length batches, and a StatefulDistributedSampler that preserves its position across checkpointed runs so training can resume from the same point in the data.
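The core collation step can be sketched as follows. This is an illustrative simplification, not ColossalAI's actual DataCollatorForPreferenceDataset: the pad token id, function names, and use of plain Python lists in place of tensors are all assumptions made for clarity.

```python
# Illustrative sketch of a preference-pair collator: pad every chosen
# and rejected sequence in the batch to one shared maximum length.
PAD_TOKEN_ID = 0  # assumed pad id, for illustration only

def pad_to_length(ids, length, pad_value):
    """Right-pad a token-id list to the target length."""
    return ids + [pad_value] * (length - len(ids))

def collate_preference_batch(examples):
    """Pad chosen and rejected sequences to a uniform batch length and
    build matching attention masks (1 = real token, 0 = padding)."""
    max_len = max(
        len(seq)
        for ex in examples
        for seq in (ex["chosen_input_ids"], ex["rejected_input_ids"])
    )
    batch = {
        "chosen_input_ids": [], "chosen_attention_mask": [],
        "rejected_input_ids": [], "rejected_attention_mask": [],
    }
    for ex in examples:
        for side in ("chosen", "rejected"):
            ids = ex[f"{side}_input_ids"]
            batch[f"{side}_input_ids"].append(
                pad_to_length(ids, max_len, PAD_TOKEN_ID))
            batch[f"{side}_attention_mask"].append(
                pad_to_length([1] * len(ids), max_len, 0))
    return batch
```

Note that chosen and rejected sequences share one padding length, which keeps the paired tensors shape-compatible inside a single batch.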

Usage

Use this after preference data preparation and before the DPO training loop. The stateful sampler enables training resumption from the exact data position.
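The resumption behavior can be sketched with a minimal sampler. ColossalAI provides a StatefulDistributedSampler class; the interface below (strided shard-by-rank plus a resumable start index) is a simplified assumption written without torch, and the class and method names are hypothetical.

```python
# Minimal sketch of a stateful distributed sampler: each rank takes a
# strided shard of the dataset, and a saved start index lets a resumed
# run skip the indices it already consumed.
class TinyStatefulSampler:
    def __init__(self, num_samples, num_replicas, rank):
        # Strided sharding: rank r sees indices r, r+N, r+2N, ...
        self.indices = list(range(rank, num_samples, num_replicas))
        self.start_index = 0  # position to resume from

    def set_start_index(self, start_index):
        """Restore the sampling position saved in a checkpoint."""
        self.start_index = start_index

    def __iter__(self):
        # Skip already-consumed indices so resumed training does not
        # revisit data from before the interruption.
        return iter(self.indices[self.start_index:])
```

In practice the consumed-sample count would be stored in the training checkpoint and fed back to the sampler before rebuilding the DataLoader.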

Theoretical Basis

The data loading pipeline must handle two parallel token sequences (chosen and rejected) per preference example:

  1. Load Arrow datasets containing both chosen and rejected tokenized sequences
  2. Apply DataCollatorForPreferenceDataset to pad sequences to uniform batch length
  3. Use StatefulDistributedSampler to shard data across GPUs with resumption capability
  4. Each batch yields: chosen_input_ids, chosen_attention_mask, chosen_loss_mask, rejected_input_ids, rejected_attention_mask, rejected_loss_mask
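The batch layout in step 4 can be sketched as below. The loss-mask semantics assumed here (1 on response tokens that contribute to the DPO loss, 0 on prompt and padding tokens) and the helper names are illustrative assumptions; only the six batch keys come from the list above.

```python
# Sketch of the per-batch key layout: six parallel fields, three per
# side (chosen / rejected), with loss masks derived from prompt length.
def build_example(prompt_len, seq, pad_to):
    """Return padded ids, attention mask, and loss mask for one side."""
    pad = pad_to - len(seq)
    input_ids = seq + [0] * pad
    attention_mask = [1] * len(seq) + [0] * pad
    # Only response tokens (after the shared prompt) are scored.
    loss_mask = [0] * prompt_len + [1] * (len(seq) - prompt_len) + [0] * pad
    return input_ids, attention_mask, loss_mask

def make_batch(prompt_len, chosen, rejected):
    """Assemble the six-key batch dict for one preference pair."""
    pad_to = max(len(chosen), len(rejected))
    batch = {}
    for side, seq in (("chosen", chosen), ("rejected", rejected)):
        ids, attn, loss = build_example(prompt_len, seq, pad_to)
        batch[f"{side}_input_ids"] = ids
        batch[f"{side}_attention_mask"] = attn
        batch[f"{side}_loss_mask"] = loss
    return batch
```

Keeping the two sides in one dict lets the DPO step compute log-probabilities for chosen and rejected sequences in a single forward pass over concatenated inputs.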

Related Pages

Implemented By
