Implementation:Alibaba ROLL SFTConfig
| Knowledge Sources | |
|---|---|
| Domains | Supervised_Learning, Configuration |
| Last Updated | 2026-02-07 20:00 GMT |
Overview
Concrete SFT configuration dataclass provided by the Alibaba ROLL library.
Description
The SFTConfig class extends BaseConfig with SFT-specific parameters including dataset field mappings, training hyperparameters, and a single worker configuration for the sft_train cluster.
Usage
An SFTConfig instance is built from a Hydra-managed YAML file when launching an SFT pipeline.
Code Reference
Source Location
- Repository: Alibaba ROLL
- File: roll/pipeline/sft/sft_config.py
- Lines: L9-64
Signature
@dataclass
class SFTConfig(BaseConfig):
    """
    Configuration for SFT training.

    Attributes:
        pretrain: str - path to pretrained model
        prompt_key: str = "instruction" - dataset prompt key
        response_key: str = "output" - dataset response key
        system_key: Optional[str] - system prompt key
        query_key: Optional[str] - query key
        sft_train: WorkerConfig - SFT worker configuration
        max_grad_norm: float = 1.0 - gradient clipping
    """
Import
from roll.pipeline.sft.sft_config import SFTConfig
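The key attributes in the signature (prompt_key, response_key, system_key, query_key) name the dataset columns to read. The sketch below illustrates that idea with a hypothetical stand-in class, `FieldKeys`, and a hypothetical helper, `select_fields`; neither is part of ROLL, and the `None` defaults for the optional keys are an assumption, since the docstring does not state them.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class FieldKeys:
    """Hypothetical stand-in for the dataset key attributes of SFTConfig."""
    prompt_key: str = "instruction"
    response_key: str = "output"
    system_key: Optional[str] = None  # assumed default, not confirmed by the source
    query_key: Optional[str] = None   # assumed default, not confirmed by the source


def select_fields(record: Dict[str, Any], keys: FieldKeys) -> Dict[str, Any]:
    """Pick the prompt/response (and optional system/query) values from one record."""
    example = {
        "prompt": record[keys.prompt_key],
        "response": record[keys.response_key],
    }
    # Optional columns are read only when a key has been configured.
    if keys.system_key is not None:
        example["system"] = record.get(keys.system_key)
    if keys.query_key is not None:
        example["query"] = record.get(keys.query_key)
    return example


record = {"instruction": "Translate 'hello' to French.", "output": "bonjour"}
example = select_fields(record, FieldKeys())
```

With the default keys, `example` holds only the prompt and response entries. A dataset that uses other column names would be handled by overriding the keys, for instance `FieldKeys(prompt_key="question", response_key="answer")`.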
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| YAML config file | str | Yes | Hydra-managed YAML configuration |
Outputs
| Name | Type | Description |
|---|---|---|
| SFTConfig instance | SFTConfig | Validated config with sft_train WorkerConfig |
Usage Examples
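import dacite
from hydra import compose, initialize
from omegaconf import OmegaConf

from roll.pipeline.sft.sft_config import SFTConfig

# config_path is resolved relative to the calling file
initialize(config_path="examples/qwen2.5-7B-sft_megatron")
cfg = compose(config_name="sft_config")
# Resolve interpolations, then build the dataclass from the plain dict
config = dacite.from_dict(data_class=SFTConfig, data=OmegaConf.to_container(cfg, resolve=True))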
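The last step of the example above, turning a plain dictionary into a nested dataclass, can be tried without Hydra, dacite, or ROLL installed. The following is a minimal standard-library sketch of that idea, not dacite's implementation. `WorkerStub`, `SFTConfigStub`, and `build` are hypothetical names, the worker fields (`name`, `world_size`) are invented for illustration, and rejecting unknown keys is a choice made by this sketch.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any, Dict


@dataclass
class WorkerStub:
    """Hypothetical stand-in for WorkerConfig; field names are illustrative only."""
    name: str = "sft_train"
    world_size: int = 1


@dataclass
class SFTConfigStub:
    """Hypothetical stand-in carrying a subset of the documented attributes."""
    pretrain: str
    prompt_key: str = "instruction"
    response_key: str = "output"
    max_grad_norm: float = 1.0
    sft_train: WorkerStub = field(default_factory=WorkerStub)


def build(cls, data: Dict[str, Any]):
    """Recursively build a dataclass from a plain dict, rejecting unknown keys."""
    known = {f.name: f for f in fields(cls)}
    unknown = set(data) - set(known)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    kwargs = {}
    for name, value in data.items():
        ftype = known[name].type
        # Nested dicts become nested dataclasses when the field type is one.
        if is_dataclass(ftype) and isinstance(value, dict):
            value = build(ftype, value)
        kwargs[name] = value
    return cls(**kwargs)


# Stands in for the dict returned by OmegaConf.to_container(cfg, resolve=True).
plain = {
    "pretrain": "path/to/model",
    "max_grad_norm": 0.5,
    "sft_train": {"world_size": 8},
}
config = build(SFTConfigStub, plain)
```

Keys omitted from the dictionary fall back to the dataclass defaults, so `config.prompt_key` stays `"instruction"` while `config.sft_train` is a `WorkerStub` built from the nested mapping.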
Related Pages
Implements Principle
Requires Environment
Environment Dependencies
This implementation requires the following environment constraints:
Heuristics Applied
No specific heuristics apply to this implementation.