Implementation:Hiyouga LLaMA Factory Launcher
| Knowledge Sources | |
|---|---|
| Domains | CLI, Distributed Training |
| Last Updated | 2026-02-06 19:00 GMT |
Overview
Central CLI command dispatcher and distributed training launcher for the entire LlamaFactory framework, routing user commands to the appropriate subsystem.
Description
The launch function is the main entry point for the llamafactory-cli (or lmf) command-line tool. It reads the first CLI argument and dispatches to one of several subsystems: API server, chat interface, model export, training, web chat, web UI, environment info, or version display. With no argument it falls back to help. For the train command, when multiple GPUs are detected or FORCE_TORCHRUN is set, it launches distributed training by running torchrun as a subprocess. Two launch modes are supported: standard multi-node training, and elastic launch with rendezvous-based fault tolerance, selected by RDZV_ID. The module also applies PyTorch CUDA memory optimizations when OPTIM_TORCH is enabled, which is the default.
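The dispatch step described above can be sketched as a lookup table keyed by the first argument. This is a minimal illustration, not the code in launcher.py: the handler functions, the `COMMANDS` table, and the handling of unknown commands are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List


def _make_handler(name: str) -> Callable[[], str]:
    """Build a hypothetical stand-in for a real subsystem entry point."""
    def handler() -> str:
        return f"running {name}"
    return handler


# Command names are taken from this page; the handlers are placeholders.
COMMANDS: Dict[str, Callable[[], str]] = {
    name: _make_handler(name)
    for name in ("api", "chat", "export", "train", "webchat", "webui", "env", "version", "help")
}


def dispatch(argv: List[str]) -> str:
    """Pick a handler from argv[1], falling back to "help" when it is absent."""
    command = argv[1] if len(argv) > 1 else "help"
    if command not in COMMANDS:
        # Assumption: unknown commands are reported rather than ignored.
        return f"unknown command: {command}"
    return COMMANDS[command]()
```

Calling `dispatch(["lmf", "train", "config.yaml"])` selects the train handler, while `dispatch(["lmf"])` selects help.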
Usage
Use the launch function as the primary CLI entry point. It is invoked via llamafactory-cli <command> or lmf <command>. For distributed training, configure environment variables such as NNODES, NODE_RANK, NPROC_PER_NODE, MASTER_ADDR, MASTER_PORT, RDZV_ID, MIN_NNODES, and MAX_NNODES to control the distributed topology.
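To show how these variables could map onto a torchrun invocation, the sketch below builds an argument list for both the standard and the elastic case. It is an approximation under stated assumptions: `build_torchrun_args`, the `train.py` script name, the fixed fallback port 29500, and the choice of the `c10d` rendezvous backend are illustrative and not confirmed by this page. The torchrun flags themselves are standard.

```python
from typing import List, Mapping


def build_torchrun_args(
    env: Mapping[str, str],
    script: str,
    script_args: List[str],
    detected_gpus: int,
) -> List[str]:
    """Translate the environment variables listed above into torchrun arguments."""
    nnodes = env.get("NNODES", "1")
    node_rank = env.get("NODE_RANK", "0")
    nproc_per_node = env.get("NPROC_PER_NODE", str(detected_gpus))
    master_addr = env.get("MASTER_ADDR", "127.0.0.1")
    # Assumption: 29500 stands in for the automatically selected free port.
    master_port = env.get("MASTER_PORT", "29500")
    rdzv_id = env.get("RDZV_ID")

    if rdzv_id is not None:
        # Elastic mode: torchrun accepts a MIN:MAX range for --nnodes.
        min_nnodes, max_nnodes = env.get("MIN_NNODES"), env.get("MAX_NNODES")
        if min_nnodes is not None and max_nnodes is not None:
            nnodes = f"{min_nnodes}:{max_nnodes}"
        topology = [
            "--nnodes", nnodes,
            "--nproc_per_node", nproc_per_node,
            "--rdzv_id", rdzv_id,
            "--rdzv_backend", "c10d",  # assumed backend
            "--rdzv_endpoint", f"{master_addr}:{master_port}",
        ]
    else:
        # Standard mode: a fixed node count with an explicit rank per node.
        topology = [
            "--nnodes", nnodes,
            "--node_rank", node_rank,
            "--nproc_per_node", nproc_per_node,
            "--master_addr", master_addr,
            "--master_port", master_port,
        ]
    return ["torchrun", *topology, script, *script_args]
```

Passing a plain mapping instead of reading `os.environ` directly keeps the function easy to test; a caller would typically pass `os.environ` itself.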
Code Reference
Source Location
- Repository: Hiyouga_LLaMA_Factory
- File: src/llamafactory/launcher.py
- Lines: 1-185
Signature
def launch() -> None:
    """CLI command dispatcher and distributed training launcher."""
    ...
Import
from llamafactory.launcher import launch
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| sys.argv[1] | str | No (defaults to "help") | CLI command: api, chat, export, train, webchat, webui, env, version, or help |
| FORCE_TORCHRUN | env var | No | Force torchrun-based distributed launch for training |
| NNODES | env var | No (default: "1") | Number of nodes for distributed training |
| NODE_RANK | env var | No (default: "0") | Rank of the current node |
| NPROC_PER_NODE | env var | No | Number of processes per node (defaults to detected GPU count) |
| MASTER_ADDR | env var | No (default: "127.0.0.1") | Master node address for distributed training |
| MASTER_PORT | env var | No | Master node port (auto-selected if not set) |
| RDZV_ID | env var | No | Rendezvous ID for elastic launch mode |
| MIN_NNODES | env var | No | Minimum number of nodes for elastic training |
| MAX_NNODES | env var | No | Maximum number of nodes for elastic training |
| OPTIM_TORCH | env var | No (default: "1") | Enable PyTorch CUDA memory optimizations |
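As an illustration of an opt-out flag such as OPTIM_TORCH, the sketch below parses a boolean environment variable with a default of "1" and prepares a modified copy of the environment for a child process. The helper names, the set of accepted truthy strings, and the specific setting applied are assumptions. `PYTORCH_CUDA_ALLOC_CONF` is a real PyTorch allocator variable, but this page does not state which settings the launcher actually applies.

```python
from typing import Dict, Mapping


def is_enabled(env: Mapping[str, str], name: str, default: str = "0") -> bool:
    """Return True for "1", "true", or "y" in any case (assumed truthy set)."""
    return env.get(name, default).lower() in ("1", "true", "y")


def child_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Copy the environment and add memory settings unless OPTIM_TORCH is off."""
    child = dict(env)  # copy so the caller's mapping is left untouched
    if is_enabled(env, "OPTIM_TORCH", default="1"):
        # Assumed example setting; replaces any existing value in the copy.
        child["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
    return child
```

Because the default is "1", the optimization applies when the variable is unset, and setting `OPTIM_TORCH=0` turns it off.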
Outputs
| Name | Type | Description |
|---|---|---|
| (side effect) | None | Dispatches to the appropriate subsystem or launches a torchrun subprocess |
| sys.exit code | int | Exit code from the torchrun subprocess (for distributed training) |
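Propagating the subprocess exit code, as the second row describes, might look like the following sketch. The `run_and_get_exit_code` helper is hypothetical, and a short Python one-liner stands in for the torchrun command so that the example runs anywhere.

```python
import subprocess
import sys
from typing import List


def run_and_get_exit_code(cmd: List[str]) -> int:
    """Run a command to completion and return its exit code without raising."""
    process = subprocess.run(cmd, check=False)
    return process.returncode


if __name__ == "__main__":
    # Stand-in for the torchrun command line; this child exits with status 3.
    stand_in = [sys.executable, "-c", "import sys; sys.exit(3)"]
    # Exiting with the child's code lets shells and schedulers see its failures.
    sys.exit(run_and_get_exit_code(stand_in))
```

Passing the return code through `sys.exit` means a failed training run makes the CLI itself exit non-zero, which matters for job schedulers and CI scripts that check the status.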
Usage Examples
# From the command line (standard usage)
# llamafactory-cli train config.yaml
# llamafactory-cli api --model_name_or_path meta-llama/Llama-3
# llamafactory-cli chat --model_name_or_path meta-llama/Llama-3
# llamafactory-cli export --model_name_or_path meta-llama/Llama-3
# Multi-GPU distributed training (auto-detected)
# CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train config.yaml
# Multi-node distributed training
# NNODES=2 NODE_RANK=0 MASTER_ADDR=10.0.0.1 MASTER_PORT=29500 llamafactory-cli train config.yaml
# Elastic launch with fault tolerance
# RDZV_ID=my_job NNODES=2 MIN_NNODES=1 MAX_NNODES=4 llamafactory-cli train config.yaml
Related Pages
- Hiyouga_LLaMA_Factory_Training_Args - Training arguments consumed during the train command
- Hiyouga_LLaMA_Factory_Model_Loader - Model loading invoked by the train and inference commands