Implementation:Hiyouga LLaMA Factory V1 Launcher
| Knowledge Sources | |
|---|---|
| Domains | CLI, Distributed Training, DevOps |
| Last Updated | 2026-02-06 19:00 GMT |
Overview
launcher.py is the top-level CLI entry point and distributed training launcher for the LLaMA-Factory v1 system, managing command routing, multi-GPU detection, and torchrun-based distributed execution.
Description
The launcher module exposes two functions, launch() and main(). launch() is the initial CLI entry point: it parses the command (sft, dpo, rm, chat, help, version, or env), detects multi-GPU setups, and for distributed training re-launches the process through torchrun. Multi-node training is configured through the NNODES, NODE_RANK, MASTER_ADDR, and MASTER_PORT environment variables; elastic launch is configured through RDZV_ID, MIN_NNODES, and MAX_NNODES. main() runs inside each torchrun-spawned worker process and routes the command to the matching trainer (SFT, DPO, or RM). When OPTIM_TORCH is enabled, the launcher also applies environment optimizations for CUDA memory allocation and NCCL.
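The choice between running in-process and re-launching under torchrun can be sketched as follows. This is a minimal illustration, not the module's actual code: should_use_torchrun and is_env_enabled are hypothetical helpers, the set of truthy values accepted for FORCE_TORCHRUN is an assumption, and the GPU count is passed in rather than detected.

```python
from typing import Mapping

# Copied from the documented _DIST_TRAIN_COMMANDS constant.
DIST_TRAIN_COMMANDS = ("train", "sft", "dpo", "rm")


def is_env_enabled(env: Mapping[str, str], name: str, default: str = "0") -> bool:
    """Assumed convention: "1", "true", or "y" (any case) means enabled."""
    return env.get(name, default).strip().lower() in ("1", "true", "y")


def should_use_torchrun(command: str, gpu_count: int, env: Mapping[str, str]) -> bool:
    """Return True when a training command should be re-launched under torchrun."""
    if command not in DIST_TRAIN_COMMANDS:
        return False  # chat, help, version, and env stay in the current process
    return gpu_count > 1 or is_env_enabled(env, "FORCE_TORCHRUN")
```

Passing the environment as a mapping instead of reading os.environ directly keeps the decision easy to test without touching the real process environment.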
Usage
Invoke the launcher through the llamafactory-cli (or lmf) command-line tool. To train, run a command such as llamafactory-cli sft config.yaml: on a single GPU the launcher runs the trainer directly, and with multiple GPUs it automatically wraps execution with torchrun. For chat, use llamafactory-cli chat.
Code Reference
Source Location
- Repository: Hiyouga_LLaMA_Factory
- File: src/llamafactory/v1/launcher.py
- Lines: 1-179
Signature
def launch() -> None: ...
def main() -> None: ...
# Module-level constants
USAGE: str # Help text string
_DIST_TRAIN_COMMANDS: tuple # ("train", "sft", "dpo", "rm")
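One way argv[1] could be mapped to a handler is sketched below. route_command and the handler table are hypothetical names, and falling back to help for a missing or unknown command is an assumption; the real launch() and main() may behave differently.

```python
from typing import Callable, Dict, List, Sequence

Handler = Callable[[List[str]], None]


def route_command(argv: Sequence[str], handlers: Dict[str, Handler]) -> str:
    """Dispatch argv[1] to its handler and return the command that ran."""
    command = argv[1] if len(argv) > 1 else "help"
    if command not in handlers:
        command = "help"  # assumed fallback for unknown commands
    handlers[command](list(argv[2:]))  # remaining arguments go to the handler
    return command
```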
Import
from llamafactory.v1.launcher import launch, main
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| sys.argv | list[str] | Yes | Command-line arguments. argv[1] is the command (sft/dpo/rm/chat/help/version/env), followed by config or args. |
| NNODES | env var | No | Number of nodes for multi-node training (default: "1"). |
| NODE_RANK | env var | No | Rank of the current node (default: "0"). |
| NPROC_PER_NODE | env var | No | Number of processes per node (default: GPU count). |
| MASTER_ADDR | env var | No | Master node address (default: "127.0.0.1"). |
| MASTER_PORT | env var | No | Master node port (default: auto-detected available port). |
| FORCE_TORCHRUN | env var | No | Force launching through torchrun even on a single GPU. |
| OPTIM_TORCH | env var | No | Enable CUDA/NCCL optimizations (default: "1"). |
| MAX_RESTARTS | env var | No | Maximum restarts for elastic launch (default: "0"). |
| RDZV_ID | env var | No | Rendezvous ID for elastic launch. When set, enables elastic job mode. |
| MIN_NNODES | env var | No | Minimum number of nodes for elastic scaling. |
| MAX_NNODES | env var | No | Maximum number of nodes for elastic scaling. |
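The defaults in the table above could be turned into a torchrun command line roughly as follows. This is a sketch under stated assumptions, not the launcher's actual code: build_torchrun_args and find_free_port are hypothetical helpers, and in elastic mode the c10d rendezvous backend, the use of MASTER_ADDR:MASTER_PORT as the rendezvous endpoint, and MIN_NNODES/MAX_NNODES defaulting to NNODES are all assumptions. The torchrun flags themselves are standard.

```python
import socket
from typing import List, Mapping, Sequence


def find_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("", 0))
        return sock.getsockname()[1]


def build_torchrun_args(
    env: Mapping[str, str], gpu_count: int, script_args: Sequence[str]
) -> List[str]:
    """Assemble a torchrun argument list from the documented env vars."""
    nnodes = env.get("NNODES", "1")
    master_addr = env.get("MASTER_ADDR", "127.0.0.1")
    master_port = env.get("MASTER_PORT") or str(find_free_port())
    args = [
        "torchrun",
        "--nproc_per_node", env.get("NPROC_PER_NODE", str(gpu_count)),
        "--max_restarts", env.get("MAX_RESTARTS", "0"),
    ]
    rdzv_id = env.get("RDZV_ID")
    if rdzv_id is not None:
        # Elastic mode: torchrun accepts --nnodes as MIN:MAX.
        min_nodes = env.get("MIN_NNODES", nnodes)
        max_nodes = env.get("MAX_NNODES", nnodes)
        args += [
            "--nnodes", f"{min_nodes}:{max_nodes}",
            "--rdzv_id", rdzv_id,
            "--rdzv_backend", "c10d",  # assumed backend
            "--rdzv_endpoint", f"{master_addr}:{master_port}",
        ]
    else:
        # Static mode: every node must agree on the master address and port.
        args += [
            "--nnodes", nnodes,
            "--node_rank", env.get("NODE_RANK", "0"),
            "--master_addr", master_addr,
            "--master_port", master_port,
        ]
    return args + list(script_args)
```

Because an auto-detected port is chosen independently on each machine, a multi-node run in this sketch only works if MASTER_PORT is set to the same value on every node.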
Outputs
| Name | Type | Description |
|---|---|---|
| Process exit code | int | 0 on success, non-zero on failure. For distributed training, returns the torchrun exit code. |
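Forwarding a child process's exit code, as described for the torchrun case, can be illustrated with the standard library. run_and_forward is a hypothetical helper, and a short Python one-liner stands in for torchrun here so the sketch runs anywhere.

```python
import subprocess
import sys
from typing import Sequence


def run_and_forward(cmd: Sequence[str]) -> int:
    """Run a child process and return its exit code unchanged."""
    # check=False: a non-zero exit is reported to the caller, not raised.
    return subprocess.run(list(cmd), check=False).returncode


# Stand-in child that exits with status 3; a launcher would run torchrun instead
# and pass the returned code to sys.exit().
code = run_and_forward([sys.executable, "-c", "raise SystemExit(3)"])
print(code)  # prints 3
```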
Usage Examples
# CLI usage (shell commands)
# Single-GPU SFT training
# llamafactory-cli sft config.yaml
# Multi-GPU auto-detected distributed training
# llamafactory-cli sft config.yaml (auto-launches torchrun if >1 GPU)
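# Multi-node training (run on every node with its own NODE_RANK; set MASTER_PORT explicitly so all nodes use the same port)
# NNODES=2 NODE_RANK=0 MASTER_ADDR=10.0.0.1 MASTER_PORT=29500 llamafactory-cli sft config.yaml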
# Elastic launch
# RDZV_ID=my_job MIN_NNODES=1 MAX_NNODES=4 llamafactory-cli sft config.yaml
# Interactive chat
# llamafactory-cli chat --model path/to/model
# Direct Python usage (launch() reads the command and its arguments from sys.argv)
from llamafactory.v1.launcher import launch
launch()
Related Pages
- Hiyouga_LLaMA_Factory_V1_Base_Trainer - The trainer classes that the launcher routes training commands to.
- Hiyouga_LLaMA_Factory_V1_Model_Engine - Model initialization triggered during training.
- Hiyouga_LLaMA_Factory_V1_Data_Engine - Data loading triggered during training.
- Hiyouga_LLaMA_Factory_V1_Base_Sampler - The sampler used for the chat command.