Implementation:Hiyouga LLaMA Factory Launcher

Knowledge Sources	Hiyouga_LLaMA_Factory
Domains	CLI, Distributed Training
Last Updated	2026-02-06 19:00 GMT

Overview

The launcher is the central CLI command dispatcher and distributed training launcher for the LlamaFactory framework: it routes each user command to the appropriate subsystem.

Description

The launch function is the main entry point for the llamafactory-cli (or lmf) command-line tool. It reads the first CLI argument and dispatches to one of several subsystems: API server, chat interface, model export, training, web chat, web UI, environment info, version display, or help (the default when no command is given). When the command is train and either multiple GPUs are detected or FORCE_TORCHRUN is set, it launches distributed training by running torchrun as a subprocess. Two launch modes are supported: a standard launch across one or more nodes, and an elastic launch, identified by RDZV_ID, that relies on torchrun's rendezvous mechanism for fault tolerance. The module also applies optional PyTorch CUDA memory optimizations when OPTIM_TORCH is enabled, which is the default.
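The dispatch step described above can be sketched as a lookup table keyed by command name. This is a minimal illustration, not the code in launcher.py: the handler functions, the COMMANDS table, and the reaction to an unknown command are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List


def _make_handler(name: str) -> Callable[[], str]:
    """Build a hypothetical stand-in for a real subsystem entry point."""
    def handler() -> str:
        return f"running {name}"
    return handler


# Command names are taken from the inputs table; the handlers are placeholders.
COMMANDS: Dict[str, Callable[[], str]] = {
    name: _make_handler(name)
    for name in (
        "api", "chat", "export", "train", "webchat",
        "webui", "env", "version", "help",
    )
}


def dispatch(argv: List[str]) -> str:
    """Pick a handler from the first CLI argument, defaulting to "help"."""
    command = argv[1] if len(argv) > 1 else "help"
    handler = COMMANDS.get(command)
    if handler is None:
        # Assumption: what the real launcher does for an unknown command
        # is not documented on this page.
        raise SystemExit(f"Unknown command: {command}")
    return handler()
```

Calling dispatch(["lmf", "train", "config.yaml"]) selects the train handler, while dispatch(["lmf"]) falls back to help.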

Usage

Use the launch function as the primary CLI entry point. It is invoked via llamafactory-cli <command> or lmf <command>. For distributed training, configure environment variables such as NNODES, NODE_RANK, NPROC_PER_NODE, MASTER_ADDR, MASTER_PORT, RDZV_ID, MIN_NNODES, and MAX_NNODES to control the distributed topology.
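To show how these variables might map onto a torchrun invocation, here is a hedged sketch of a command builder. The function name build_torchrun_cmd, the script name train_entry.py, and the choice of the c10d rendezvous backend are assumptions made for illustration; only the torchrun flags themselves are standard. The exact command the launcher assembles may differ.

```python
from typing import List, Optional


def build_torchrun_cmd(
    script: str,
    script_args: List[str],
    nnodes: str = "1",
    node_rank: str = "0",
    nproc_per_node: str = "1",
    master_addr: str = "127.0.0.1",
    master_port: str = "29500",
    rdzv_id: Optional[str] = None,
    min_nnodes: Optional[str] = None,
    max_nnodes: Optional[str] = None,
) -> List[str]:
    """Assemble a torchrun argument list for a standard or elastic launch."""
    cmd = ["torchrun", "--nproc_per_node", nproc_per_node]
    if rdzv_id is None:
        # Standard launch: fixed node count, explicit rank and master address.
        cmd += [
            "--nnodes", nnodes,
            "--node_rank", node_rank,
            "--master_addr", master_addr,
            "--master_port", master_port,
        ]
    else:
        # Elastic launch: torchrun accepts a MIN:MAX range for --nnodes.
        if min_nnodes is not None and max_nnodes is not None:
            nnodes = f"{min_nnodes}:{max_nnodes}"
        cmd += [
            "--nnodes", nnodes,
            "--rdzv_id", rdzv_id,
            "--rdzv_backend", "c10d",  # assumed backend for this sketch
            "--rdzv_endpoint", f"{master_addr}:{master_port}",
        ]
    return cmd + [script] + script_args
```

Passing rdzv_id="my_job", min_nnodes="1", and max_nnodes="4" yields an --nnodes value of 1:4, which lets the job keep running as nodes join or leave within that range.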

Code Reference

Source Location

Repository: Hiyouga_LLaMA_Factory
File: src/llamafactory/launcher.py
Lines: 1-185

Signature

def launch() -> None:
    """CLI command dispatcher and distributed training launcher."""
    ...

Import

from llamafactory.launcher import launch

I/O Contract

Inputs

Name	Type	Required	Description
sys.argv[1]	str	No (defaults to "help")	CLI command: api, chat, export, train, webchat, webui, env, version, or help
FORCE_TORCHRUN	env var	No	Force torchrun-based distributed launch for training
NNODES	env var	No (default: "1")	Number of nodes for distributed training
NODE_RANK	env var	No (default: "0")	Rank of the current node
NPROC_PER_NODE	env var	No	Number of processes per node (defaults to detected GPU count)
MASTER_ADDR	env var	No (default: "127.0.0.1")	Master node address for distributed training
MASTER_PORT	env var	No	Master node port (auto-selected if not set)
RDZV_ID	env var	No	Rendezvous ID for elastic launch mode
MIN_NNODES	env var	No	Minimum number of nodes for elastic training
MAX_NNODES	env var	No	Maximum number of nodes for elastic training
OPTIM_TORCH	env var	No (default: "1")	Enable PyTorch CUDA memory optimizations
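The defaults in the table above could be resolved along the following lines. This is a sketch under stated assumptions: the helper names (is_enabled, find_free_port, should_use_torchrun, resolve_settings), the set of strings treated as "enabled", and the port-selection strategy are hypothetical and are not taken from the launcher source.

```python
import socket
from typing import Dict, Mapping, Optional


def is_enabled(env: Mapping[str, str], name: str, default: str = "0") -> bool:
    """Treat "1", "true", or "y" (any case) as enabled; an assumed convention."""
    return env.get(name, default).lower() in ("1", "true", "y")


def find_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def should_use_torchrun(command: str, env: Mapping[str, str], gpu_count: int) -> bool:
    """Distributed launch applies to train when forced or when several GPUs exist."""
    return command == "train" and (is_enabled(env, "FORCE_TORCHRUN") or gpu_count > 1)


def resolve_settings(env: Mapping[str, str], gpu_count: int) -> Dict[str, Optional[str]]:
    """Apply the documented defaults to the distributed-training variables."""
    return {
        "nnodes": env.get("NNODES", "1"),
        "node_rank": env.get("NODE_RANK", "0"),
        "nproc_per_node": env.get("NPROC_PER_NODE", str(gpu_count)),
        "master_addr": env.get("MASTER_ADDR", "127.0.0.1"),
        # Auto-select a port only when the user did not provide one.
        "master_port": env.get("MASTER_PORT") or str(find_free_port()),
        # The elastic-launch variables have no default and stay None when unset.
        "rdzv_id": env.get("RDZV_ID"),
        "min_nnodes": env.get("MIN_NNODES"),
        "max_nnodes": env.get("MAX_NNODES"),
    }
```

In practice the env argument would be os.environ, and gpu_count would come from the framework's device detection. Passing both in explicitly keeps the sketch easy to test.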

Outputs

Name	Type	Description
(side effect)	None	Dispatches to the appropriate subsystem or launches a torchrun subprocess
sys.exit code	int	Exit code from the torchrun subprocess (for distributed training)
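Exit-code propagation from a subprocess can be illustrated with the standard library alone. In this sketch a short Python child process stands in for the real torchrun call, and run_and_get_exit_code is a hypothetical helper name.

```python
import subprocess
import sys
from typing import List


def run_and_get_exit_code(cmd: List[str]) -> int:
    """Run a child process to completion and return its exit code."""
    process = subprocess.run(cmd, check=False)
    return process.returncode


# Stand-in child that exits with status 3 instead of a real torchrun job.
child = [sys.executable, "-c", "import sys; sys.exit(3)"]
code = run_and_get_exit_code(child)

# A launcher would end with sys.exit(code) so the calling shell or job
# scheduler observes the same status as the training subprocess.
```

Forwarding the child's status this way lets wrappers such as shell scripts or cluster schedulers detect a failed training run without parsing logs.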

Usage Examples

# From the command line (standard usage)
llamafactory-cli train config.yaml
llamafactory-cli api --model_name_or_path meta-llama/Llama-3
llamafactory-cli chat --model_name_or_path meta-llama/Llama-3
llamafactory-cli export --model_name_or_path meta-llama/Llama-3

# Multi-GPU distributed training (auto-detected)
CUDA_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli train config.yaml

# Multi-node distributed training
NNODES=2 NODE_RANK=0 MASTER_ADDR=10.0.0.1 MASTER_PORT=29500 llamafactory-cli train config.yaml

# Elastic launch with fault tolerance
RDZV_ID=my_job NNODES=2 MIN_NNODES=1 MAX_NNODES=4 llamafactory-cli train config.yaml

Related Pages

Hiyouga_LLaMA_Factory_Training_Args - Training arguments consumed during the train command
Hiyouga_LLaMA_Factory_Model_Loader - Model loading invoked by the train and inference commands
