Principle: hpcaitech ColossalAI Preference Dataloader Setup
| Knowledge Sources | |
|---|---|
| Domains | NLP, Data_Engineering |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
A data-loading pattern that batches preference pairs with stateful distributed sampling for Direct Preference Optimization (DPO) training across multiple GPUs.
Description
Preference Dataloader Setup creates PyTorch DataLoaders that yield batches containing both chosen and rejected sequences. It uses a DataCollatorForPreferenceDataset to pad and collate preference pairs into uniform-length batches, and a StatefulDistributedSampler that records its position so iteration can resume mid-epoch after a restart.
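The collation step can be illustrated with a minimal, dependency-free sketch. This is not the ColossalAI implementation: the pad token id, the helper names, and the per-sample field names are assumptions, and plain Python lists stand in for tensors. It shows the core idea of padding chosen and rejected sequences independently to each side's maximum length and deriving the attention and loss masks.

```python
# Illustrative sketch, NOT the ColossalAI DataCollatorForPreferenceDataset:
# pad chosen and rejected token sequences to a uniform per-side length and
# build the matching attention/loss masks. Lists stand in for tensors.
PAD_ID = 0  # hypothetical pad token id


def pad_to(seq, length, pad_value):
    """Right-pad a list to `length` with `pad_value`."""
    return seq + [pad_value] * (length - len(seq))


def collate_preference_batch(samples):
    """Collate preference pairs into a uniform-length batch.

    Each sample is a dict with 'chosen_input_ids', 'chosen_loss_mask',
    'rejected_input_ids', 'rejected_loss_mask' (hypothetical field names).
    """
    batch = {}
    for side in ("chosen", "rejected"):
        ids = [s[f"{side}_input_ids"] for s in samples]
        masks = [s[f"{side}_loss_mask"] for s in samples]
        max_len = max(len(x) for x in ids)
        batch[f"{side}_input_ids"] = [pad_to(x, max_len, PAD_ID) for x in ids]
        # attention mask: 1 over real tokens, 0 over padding
        batch[f"{side}_attention_mask"] = [
            pad_to([1] * len(x), max_len, 0) for x in ids
        ]
        # loss mask: padding positions never contribute to the loss
        batch[f"{side}_loss_mask"] = [pad_to(m, max_len, 0) for m in masks]
    return batch
```

Note that the chosen and rejected sides are padded to separate lengths; a real collator may instead pad both to one shared maximum or to a fixed `max_length`.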
Usage
Use this after preference data preparation and before the DPO training loop. The stateful sampler lets training resume from the exact data position after an interruption.
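The resumption behavior can be sketched with a toy sampler. `TinyStatefulSampler` is a hypothetical stand-in for ColossalAI's StatefulDistributedSampler, written in pure Python: it shards a deterministically shuffled index list across ranks and tracks how many samples were already consumed, so a restarted run skips straight to the next unseen sample.

```python
# Toy sketch of the "stateful sampler" idea; hypothetical class, NOT the
# ColossalAI StatefulDistributedSampler. Shards indices across ranks and
# remembers how many were consumed so iteration resumes mid-epoch.
import random


class TinyStatefulSampler:
    def __init__(self, num_samples, num_replicas, rank, seed=0):
        self.num_samples = num_samples
        self.num_replicas = num_replicas
        self.rank = rank
        self.seed = seed
        self.start_index = 0  # resume position within this rank's shard

    def shard(self):
        # deterministic shuffle shared by all ranks, then round-robin sharding
        indices = list(range(self.num_samples))
        random.Random(self.seed).shuffle(indices)
        return indices[self.rank::self.num_replicas]

    def __iter__(self):
        # skip samples already consumed before the restart
        yield from self.shard()[self.start_index:]

    def state_dict(self, consumed):
        return {"start_index": consumed}

    def load_state_dict(self, state):
        self.start_index = state["start_index"]
```

On resume, the trainer restores the sampler's state dict from the checkpoint, and the first batch after restart picks up exactly where the interrupted epoch left off.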
Theoretical Basis
Data loading must handle two parallel sequences per sample:
- Load Arrow datasets containing both chosen and rejected tokenized sequences
- Apply DataCollatorForPreferenceDataset to pad sequences to uniform batch length
- Use StatefulDistributedSampler to shard data across GPUs with resumption capability
- Each batch contains: chosen_input_ids, chosen_attention_mask, chosen_loss_mask, rejected_input_ids, rejected_attention_mask, rejected_loss_mask
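The steps above can be tied together in a compact sketch. The helper below is hypothetical (the real pipeline composes a DataLoader, a collate_fn, and a distributed sampler); it shards a dataset across ranks and groups sharded samples into fixed-size batches carrying the six keys listed above.

```python
# Hypothetical end-to-end sketch of the loading loop: shard per rank, then
# group into fixed-size batches whose keys are the six tensors a DPO step
# consumes. The real pipeline uses DataLoader + collate_fn + sampler.
BATCH_KEYS = (
    "chosen_input_ids", "chosen_attention_mask", "chosen_loss_mask",
    "rejected_input_ids", "rejected_attention_mask", "rejected_loss_mask",
)


def iter_preference_batches(dataset, rank, world_size, batch_size):
    shard = dataset[rank::world_size]  # round-robin sharding across GPUs
    # drop_last semantics: only full batches are yielded
    for start in range(0, len(shard) - batch_size + 1, batch_size):
        samples = shard[start:start + batch_size]
        yield {key: [s[key] for s in samples] for key in BATCH_KEYS}
```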