Implementation:Alibaba ROLL SFTWorker Train Step
| Knowledge Sources | |
|---|---|
| Domains | Supervised_Learning, Distributed_Training |
| Last Updated | 2026-02-07 20:00 GMT |
Overview
Concrete supervised fine-tuning (SFT) training step and loss function provided by the Alibaba ROLL library.
Description
The SFTWorker.train_step method runs one training step through the configured strategy backend (Megatron, DeepSpeed, or FSDP2). The loss_func method computes cross-entropy loss over response tokens only: prompt positions in the labels tensor are set to -100, so they do not contribute to the loss.
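To make the masking concrete, here is a minimal, framework-free sketch of cross-entropy restricted to response tokens. It is not ROLL's loss_func: the function name masked_cross_entropy and the toy inputs are hypothetical, and details such as label shifting or how the loss is reduced across tokens and ranks are not stated on this page, so the sketch simply returns the summed loss and the token count.

```python
import math
from typing import List, Tuple

IGNORE_INDEX = -100  # label value that marks prompt positions, as described above


def masked_cross_entropy(
    logits: List[List[float]], labels: List[int]
) -> Tuple[float, int]:
    """Return (summed loss, number of counted tokens) for one sequence.

    Hypothetical helper for illustration; not part of the ROLL API.
    """
    total = 0.0
    counted = 0
    for row, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # prompt position: excluded from the loss
        # Numerically stable log-sum-exp over the vocabulary dimension.
        row_max = max(row)
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_z - row[label]  # negative log-probability of the target
        counted += 1
    return total, counted


# Toy sequence: three positions, vocabulary of size three.
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
labels = [IGNORE_INDEX, 1, 1]  # first position belongs to the prompt

loss_sum, n_tokens = masked_cross_entropy(logits, labels)
mean_loss = loss_sum / max(n_tokens, 1)  # guard against fully masked input
```

In PyTorch, the same per-token behavior is available through torch.nn.functional.cross_entropy with ignore_index=-100, which is that function's default.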
Usage
Called by the SFT pipeline for each training batch.
Code Reference
Source Location
- Repository: Alibaba ROLL
- File: roll/pipeline/sft/sft_worker.py
- Lines: L31-73
Signature
class SFTWorker(Worker):
    @register(Dispatch.DP_MP_DISPATCH_FIRST, clear_cache=False)
    def train_step(self, data: DataProto) -> DataProto:
        """
        Single SFT training step.

        Args:
            data: DataProto with input_ids, attention_mask, position_ids, labels
        Returns:
            DataProto with metrics (sft_train/loss@sum, learning_rate)
        """

    def loss_func(
        self,
        data: DataProto,
        output_tensor: torch.Tensor
    ) -> Tuple[torch.Tensor, Dict]:
        """
        Compute SFT cross-entropy loss on response tokens.

        Args:
            data: DataProto with labels
            output_tensor: Model logits
        Returns:
            (loss, metrics_dict)
        """
Import
from roll.pipeline.sft.sft_worker import SFTWorker
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| data | DataProto | Yes | Batch with input_ids, attention_mask, position_ids, and labels (prompt positions masked as -100) |
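The sketch below illustrates how the fields in such a batch relate to each other for a single example. It is an assumption-laden illustration, not ROLL's data pipeline: the helper build_sft_example, the pad_id default, right-padding, and the simple 0..N-1 position_ids are all hypothetical choices made for the example.

```python
from typing import Dict, List

IGNORE_INDEX = -100  # label value for positions excluded from the loss


def build_sft_example(
    prompt_ids: List[int],
    response_ids: List[int],
    max_len: int,
    pad_id: int = 0,  # hypothetical padding token id
) -> Dict[str, List[int]]:
    """Assemble one right-padded SFT example (hypothetical helper)."""
    # Concatenate prompt and response, then truncate to max_len.
    input_ids = (prompt_ids + response_ids)[:max_len]
    # Prompt positions are masked; response positions keep their token ids.
    labels = ([IGNORE_INDEX] * len(prompt_ids) + response_ids)[:max_len]
    attention_mask = [1] * len(input_ids)

    # Right-pad every field; padding is masked in both labels and attention.
    pad = max_len - len(input_ids)
    input_ids += [pad_id] * pad
    labels += [IGNORE_INDEX] * pad
    attention_mask += [0] * pad

    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "position_ids": list(range(max_len)),  # assumed: plain 0..max_len-1
        "labels": labels,
    }


example = build_sft_example(prompt_ids=[5, 6, 7], response_ids=[8, 9], max_len=6)
```

Masking padding with the same -100 value as the prompt means a single ignore rule in the loss covers both cases.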
Outputs
| Name | Type | Description |
|---|---|---|
| metrics | Dict | Metrics sft_train/loss@sum and learning_rate, returned wrapped in a DataProto |
Usage Examples
# Called via cluster dispatch in the SFT pipeline:
results = sft_train.execute_all_sync("train_step", batch)
Related Pages
Implements Principle
Requires Environment
Environment Dependencies
This implementation requires the following environment constraints:
Heuristics Applied
This implementation uses the following heuristics: