Principle:Alibaba ROLL SFT Worker Initialization


Knowledge Sources
Domains Distributed_Systems, Supervised_Learning
Last Updated 2026-02-07 20:00 GMT

Overview

A distributed initialization principle for setting up the single training cluster used in supervised fine-tuning (SFT).

Description

SFT Worker Initialization creates a single cluster, sft_train, made up of SFTWorker instances. RL pipelines require multiple clusters (actor, critic, reference), whereas SFT needs only one trainable cluster. That cluster is initialized with the configured training strategy (Megatron, DeepSpeed, or FSDP2).
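The flow described above can be illustrated with a minimal, self-contained sketch. It is not the ROLL implementation: the SFTWorker and Cluster classes here are simplified stand-ins, and build_sft_cluster, SUPPORTED_STRATEGIES, world_size, and the lowercase strategy names are hypothetical names introduced only for illustration. The sketch shows the shape of the step: validate the configured strategy, create one cluster named sft_train, and initialize every worker in it.

```python
from dataclasses import dataclass, field
from typing import List

# Strategy names come from the description above; the lowercase spelling
# and this constant are assumptions made for the sketch.
SUPPORTED_STRATEGIES = ("megatron", "deepspeed", "fsdp2")


@dataclass
class SFTWorker:
    """Stand-in for a training worker (hypothetical, not the ROLL class)."""
    rank: int
    strategy: str
    initialized: bool = False

    def initialize(self) -> None:
        # A real worker would load the model and set up the strategy here.
        self.initialized = True


@dataclass
class Cluster:
    """Stand-in for a named group of workers (hypothetical)."""
    name: str
    workers: List[SFTWorker] = field(default_factory=list)

    def initialize(self) -> None:
        for worker in self.workers:
            worker.initialize()


def build_sft_cluster(strategy: str, world_size: int) -> Cluster:
    """Create and initialize the single trainable cluster used by SFT."""
    if strategy not in SUPPORTED_STRATEGIES:
        raise ValueError(f"unsupported strategy: {strategy!r}")
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    workers = [SFTWorker(rank=r, strategy=strategy) for r in range(world_size)]
    cluster = Cluster(name="sft_train", workers=workers)
    cluster.initialize()
    return cluster


if __name__ == "__main__":
    cluster = build_sft_cluster("deepspeed", world_size=4)
    print(cluster.name, len(cluster.workers))
```

Rejecting an unknown strategy before any worker is created is a design choice of this sketch: it keeps a configuration mistake from surfacing only after resources have been allocated.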

Usage

Apply this principle when initializing an SFT pipeline. In terms of cluster setup, SFT is the simplest pipeline, because only one cluster has to be created and initialized.

Theoretical Basis

In SFT the training signal comes directly from the labeled targets, so supervised training requires only a single model instance. RL, by contrast, needs policy, reference, and optionally critic models.
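The difference in model count can be made concrete with a small sketch. The function required_roles, its use_critic flag, and the pipeline labels "sft" and "rl" are hypothetical names chosen for illustration, and the role names follow the cluster names mentioned on this page (with "actor" standing for the policy model); none of this is taken from the ROLL codebase.

```python
from typing import List


def required_roles(pipeline: str, use_critic: bool = False) -> List[str]:
    """Return the model roles a pipeline has to set up (illustrative only)."""
    if pipeline == "sft":
        # Supervised fine-tuning trains one model against labeled targets.
        return ["sft_train"]
    if pipeline == "rl":
        # RL needs a policy (actor) and a reference; the critic is optional.
        roles = ["actor", "reference"]
        if use_critic:
            roles.append("critic")
        return roles
    raise ValueError(f"unknown pipeline: {pipeline!r}")


if __name__ == "__main__":
    for name, critic in (("sft", False), ("rl", False), ("rl", True)):
        print(name, critic, required_roles(name, use_critic=critic))
```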

Related Pages

Implemented By

Related Heuristics

The following heuristics inform this principle:
