Principle:Alibaba ROLL SFT Worker Initialization
| Knowledge Sources | |
|---|---|
| Domains | Distributed_Systems, Supervised_Learning |
| Last Updated | 2026-02-07 20:00 GMT |
Overview
A distributed-initialization principle: supervised fine-tuning (SFT) is set up with a single training cluster.
Description
SFT Worker Initialization creates a single cluster (sft_train) made up of SFTWorker instances. Unlike RL pipelines, which require multiple clusters (actor, critic, reference), SFT needs only one trainable cluster. The cluster is initialized with the configured training strategy (Megatron, DeepSpeed, or FSDP2).
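The initialization flow described above can be sketched as follows. This is a minimal illustration, not ROLL's actual API: `SFTWorkerSketch`, `ClusterSketch`, `build_sft_cluster`, and the `world_size` parameter are hypothetical stand-ins. Only the cluster name `sft_train` and the three strategy names come from this page.

```python
from dataclasses import dataclass, field
from typing import List

# Strategy names taken from this page; the lowercase spelling is an assumption.
SUPPORTED_STRATEGIES = {"megatron", "deepspeed", "fsdp2"}


@dataclass
class SFTWorkerSketch:
    """Hypothetical stand-in for an SFT worker (not ROLL's SFTWorker)."""
    rank: int
    strategy: str
    initialized: bool = False

    def initialize(self) -> None:
        # A real worker would load the model and set up the strategy backend here.
        self.initialized = True


@dataclass
class ClusterSketch:
    """Hypothetical stand-in for a cluster that owns a group of workers."""
    name: str
    strategy: str
    world_size: int
    workers: List[SFTWorkerSketch] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.strategy not in SUPPORTED_STRATEGIES:
            raise ValueError(f"unsupported strategy: {self.strategy!r}")
        # One worker per rank, all sharing the configured strategy.
        self.workers = [
            SFTWorkerSketch(rank=r, strategy=self.strategy)
            for r in range(self.world_size)
        ]

    def initialize(self) -> None:
        for worker in self.workers:
            worker.initialize()


def build_sft_cluster(strategy: str, world_size: int) -> ClusterSketch:
    """Create and initialize the single trainable cluster used by SFT."""
    cluster = ClusterSketch(name="sft_train", strategy=strategy, world_size=world_size)
    cluster.initialize()
    return cluster
```

For example, `build_sft_cluster("deepspeed", 4)` returns one cluster named `sft_train` with four initialized workers. No second cluster is created at any point.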
Usage
Apply this principle when initializing the SFT pipeline. Because only one cluster has to be created, SFT has the simplest cluster setup of any pipeline.
Theoretical Basis
SFT requires only a single model instance for supervised training, unlike RL, which needs policy (actor), reference, and optionally critic models.
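The contrast in cluster requirements can be written down as a small lookup. This is an illustrative sketch only: the function `required_clusters` and the `use_critic` flag are hypothetical, and the cluster names are the ones used on this page.

```python
from typing import List


def required_clusters(pipeline: str, use_critic: bool = False) -> List[str]:
    """Return the cluster names a pipeline needs (hypothetical helper)."""
    if pipeline == "sft":
        # Supervised training needs one trainable model, hence one cluster.
        return ["sft_train"]
    if pipeline == "rl":
        # RL needs a policy (actor) and a reference model; the critic is optional.
        clusters = ["actor", "reference"]
        if use_critic:
            clusters.append("critic")
        return clusters
    raise ValueError(f"unknown pipeline: {pipeline!r}")
```

Under these assumptions, `required_clusters("sft")` always yields exactly one name, while the RL case yields two or three depending on whether a critic is used.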
Related Pages
Implemented By
Related Heuristics
The following heuristics inform this principle: