
Principle:Alibaba ROLL SFT Dataset Preparation

From Leeroopedia


Knowledge Sources
Domains: Data_Processing, Supervised_Learning
Last Updated: 2026-02-07 20:00 GMT

Overview

A data preprocessing principle for converting instruction-response datasets into label-masked, shifted sequences for causal language model fine-tuning.

Description

SFT Dataset Preparation tokenizes instruction-response pairs using chat templates and masks prompt tokens with IGNORE_INDEX (-100) so they do not contribute to the loss. At batch time, the DataCollatorForSFT pads the sequences and shifts the labels left by one position for next-token prediction.
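The per-example step above can be sketched as follows. This is a minimal illustration, not ROLL's actual code: the whitespace `encode` helper stands in for a real tokenizer plus chat template, and `build_sft_example` is a hypothetical name.

```python
IGNORE_INDEX = -100  # positions with this label are excluded from the loss

def encode(text, vocab):
    # Toy stand-in for tokenizer.encode: one ID per whitespace-separated token.
    return [vocab.setdefault(tok, len(vocab)) for tok in text.split()]

def build_sft_example(prompt, response, vocab, eos_id=0):
    prompt_ids = encode(prompt, vocab)
    response_ids = encode(response, vocab) + [eos_id]
    input_ids = prompt_ids + response_ids
    # Mask every prompt position so only response tokens contribute to the loss.
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return {"input_ids": input_ids, "labels": labels}

vocab = {"<eos>": 0}
ex = build_sft_example("User: 2+2= Assistant:", "4", vocab)
print(ex["labels"])  # → [-100, -100, -100, 4, 0]
```

Note that the labels here are still aligned with the inputs; the left shift for next-token prediction happens later, in the collator.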

Usage

Use when preparing data for supervised fine-tuning of causal language models.
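At batch time, a collator in the spirit of DataCollatorForSFT pads inputs with the pad ID, pads labels with -100, and shifts labels left by one so position t is supervised by token t+1. The sketch below is a hedged approximation; `collate` and its signature are illustrative, not ROLL's actual API.

```python
IGNORE_INDEX = -100

def collate(batch, pad_token_id=0):
    # Right-pad all examples in the batch to a common length, then left-shift labels.
    max_len = max(len(ex["input_ids"]) for ex in batch)
    input_ids, labels = [], []
    for ex in batch:
        pad = max_len - len(ex["input_ids"])
        input_ids.append(ex["input_ids"] + [pad_token_id] * pad)
        # Shift left: the label at position t becomes the token at position t+1;
        # the final position has nothing to predict, so it is masked.
        shifted = ex["labels"][1:] + [IGNORE_INDEX]
        labels.append(shifted + [IGNORE_INDEX] * pad)
    return {"input_ids": input_ids, "labels": labels}

batch = [
    {"input_ids": [1, 2, 3, 4], "labels": [-100, -100, 3, 4]},
    {"input_ids": [1, 2],       "labels": [-100, 2]},
]
out = collate(batch)
print(out["labels"][0])  # → [-100, 3, 4, -100]
```

After the shift, the last prompt position carries the first response token as its label, which is exactly the standard causal LM objective restricted to the response.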

Theoretical Basis

Label masking ensures only response tokens contribute to the loss:

  • Prompt tokens: label = -100 (ignored)
  • Response tokens: label = next token ID (standard causal LM objective)
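A small worked example makes the masking concrete. This is an illustration of the loss arithmetic, not ROLL code: `token_probs[t]` stands for the model's predicted probability of the correct next token at position t, and positions labeled -100 are simply skipped when averaging.

```python
import math

IGNORE_INDEX = -100

def masked_nll(token_probs, labels):
    # Negative log-likelihood averaged over supervised (non-masked) positions only.
    losses = [-math.log(p) for p, y in zip(token_probs, labels) if y != IGNORE_INDEX]
    return sum(losses) / len(losses)

probs  = [0.1, 0.2, 0.5, 0.5]   # per-position probability of the true token
labels = [-100, -100, 7, 8]     # prompt masked, response supervised
loss = masked_nll(probs, labels)
print(round(loss, 4))           # → 0.6931, i.e. ln 2 over the two response tokens
```

The low-probability prompt positions (0.1, 0.2) have no effect on the loss; only the two response positions are averaged.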

Related Pages

Implemented By

Related Heuristics

No specific heuristics inform this principle.
