Principle: SFT Training Execution
| Knowledge Sources | LLMBook-zh (llmbook-zh.github.io) |
|---|---|
| Domains | NLP, Training |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
The managed training loop for supervised fine-tuning that trains a language model on instruction-response pairs with selective loss masking.
Description
SFT Training Execution combines all SFT components — formatted dataset, loss-masked labels, data collator, and base model — into a HuggingFace Trainer for managed training. The key distinctions from pre-training are the DataCollator, which pads variable-length examples, and loss masking, which focuses learning on response generation.
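As a rough sketch of the wiring, the pieces can be handed to HuggingFace's Trainer as below. The names `model`, `sft_dataset`, and `collator` are assumed to come from the earlier preparation steps, and the hyperparameter values (`output_dir`, batch size, epochs, learning rate) are illustrative, not prescribed by the source:

```python
from transformers import Trainer, TrainingArguments

# Assumed to exist from the earlier steps described above:
#   model       - the loaded base language model
#   sft_dataset - SFTDataset of formatted pairs with -100-masked prompt labels
#   collator    - DataCollator that pads variable-length examples
args = TrainingArguments(
    output_dir="sft-out",            # checkpoint directory (illustrative)
    per_device_train_batch_size=8,   # illustrative hyperparameters
    num_train_epochs=3,
    learning_rate=2e-5,
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=sft_dataset,
    data_collator=collator,
)
trainer.train()  # runs the managed training loop
```

Because the labels already carry the -100 prompt mask, no custom loss function is needed; the Trainer's default causal-LM loss ignores the masked positions.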
Usage
Use this after preparing the SFTDataset, DataCollator, and loading the model. This is the final step that actually trains the model.
Theoretical Basis
The SFT training loop is identical to the pre-training loop in structure, but differs in:
- Data: Instruction-response pairs instead of raw text blocks.
- Loss: Only computed on response tokens (prompt tokens masked with -100).
- Collation: Variable-length padding instead of fixed-length blocking.
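The last two differences can be sketched in plain Python, independent of any framework. The function names, the pad id of 0, and the example token ids below are illustrative assumptions; the -100 ignore index is the value HuggingFace losses skip:

```python
IGNORE_INDEX = -100  # positions labeled -100 are skipped by the loss
PAD_ID = 0           # assumed pad token id (illustrative)

def mask_labels(input_ids, prompt_len):
    """Copy input_ids into labels, masking the prompt span so the loss
    is computed only on response tokens."""
    return [IGNORE_INDEX] * prompt_len + input_ids[prompt_len:]

def collate(batch):
    """Pad a batch of variable-length examples to the longest sequence,
    instead of blocking into fixed-length chunks as in pre-training."""
    max_len = max(len(ex["input_ids"]) for ex in batch)
    input_ids, labels, attention_mask = [], [], []
    for ex in batch:
        pad = max_len - len(ex["input_ids"])
        input_ids.append(ex["input_ids"] + [PAD_ID] * pad)
        # pad labels with -100 so padding never contributes to the loss
        labels.append(ex["labels"] + [IGNORE_INDEX] * pad)
        attention_mask.append([1] * len(ex["input_ids"]) + [0] * pad)
    return {"input_ids": input_ids,
            "labels": labels,
            "attention_mask": attention_mask}

# Example: a 3-token prompt followed by a 2-token response,
# batched with a shorter prompt-only example.
ids = [11, 12, 13, 21, 22]
batch = collate([
    {"input_ids": ids, "labels": mask_labels(ids, prompt_len=3)},
    {"input_ids": [11, 12, 13], "labels": mask_labels([11, 12, 13], 3)},
])
```

In the resulting batch, only the two response tokens of the first example carry real labels; everything else (prompt and padding) is -100 and is ignored by the loss.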