Principle: SFT Training Execution
| Knowledge Sources | LLMBook-zh (llmbook-zh.github.io) |
|---|---|
| Domains | NLP, Training |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
The managed training loop for supervised fine-tuning that trains a language model on instruction-response pairs with selective loss masking.
Description
SFT Training Execution combines all SFT components — formatted dataset, loss-masked labels, data collator, and base model — into a HuggingFace Trainer for managed training. The key distinctions from pre-training are the DataCollator, which pads variable-length examples, and loss masking, which focuses learning on response generation.
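As a rough sketch of the wiring, the pieces can be handed to HuggingFace's Trainer as below. The names `model`, `sft_dataset`, and `collator` are assumed to come from the earlier preparation steps, and the hyperparameter values (`output_dir`, batch size, epochs, learning rate) are illustrative, not prescribed by the source:

```python
from transformers import Trainer, TrainingArguments

# Assumed to exist from the earlier steps described above:
#   model       - the loaded base language model
#   sft_dataset - SFTDataset of formatted pairs with -100-masked prompt labels
#   collator    - DataCollator that pads variable-length examples
args = TrainingArguments(
    output_dir="sft-out",            # checkpoint directory (illustrative)
    per_device_train_batch_size=8,   # illustrative hyperparameters
    num_train_epochs=3,
    learning_rate=2e-5,
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=sft_dataset,
    data_collator=collator,
)
trainer.train()  # runs the managed training loop
```

Because the labels already carry the -100 prompt mask, no custom loss function is needed; the Trainer's default causal-LM loss ignores the masked positions.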
Usage
Use this after preparing the SFTDataset, DataCollator, and loading the model. This is the final step that actually trains the model.
Theoretical Basis
The SFT training loop is identical to the pre-training loop in structure, but differs in:
- Data: Instruction-response pairs instead of raw text blocks.
- Loss: Only computed on response tokens (prompt tokens masked with -100).
- Collation: Variable-length padding instead of fixed-length blocking.
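The last two differences can be sketched in plain Python, independent of any framework. The function names, the pad id of 0, and the example token ids below are illustrative assumptions; the -100 ignore index is the value HuggingFace losses skip:

```python
IGNORE_INDEX = -100  # positions labeled -100 are skipped by the loss
PAD_ID = 0           # assumed pad token id (illustrative)

def mask_labels(input_ids, prompt_len):
    """Copy input_ids into labels, masking the prompt span so the loss
    is computed only on response tokens."""
    return [IGNORE_INDEX] * prompt_len + input_ids[prompt_len:]

def collate(batch):
    """Pad a batch of variable-length examples to the longest sequence,
    instead of blocking into fixed-length chunks as in pre-training."""
    max_len = max(len(ex["input_ids"]) for ex in batch)
    input_ids, labels, attention_mask = [], [], []
    for ex in batch:
        pad = max_len - len(ex["input_ids"])
        input_ids.append(ex["input_ids"] + [PAD_ID] * pad)
        # pad labels with -100 so padding never contributes to the loss
        labels.append(ex["labels"] + [IGNORE_INDEX] * pad)
        attention_mask.append([1] * len(ex["input_ids"]) + [0] * pad)
    return {"input_ids": input_ids,
            "labels": labels,
            "attention_mask": attention_mask}

# Example: a 3-token prompt followed by a 2-token response,
# batched with a shorter prompt-only example.
ids = [11, 12, 13, 21, 22]
batch = collate([
    {"input_ids": ids, "labels": mask_labels(ids, prompt_len=3)},
    {"input_ids": [11, 12, 13], "labels": mask_labels([11, 12, 13], 3)},
])
```

In the resulting batch, only the two response tokens of the first example carry real labels; everything else (prompt and padding) is -100 and is ignored by the loss.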