Environment:Hiyouga LLaMA Factory Core Python GPU Environment

Knowledge Sources	LLaMA-Factory pyproject.toml
Domains	Infrastructure, Deep_Learning
Last Updated	2026-02-06 20:00 GMT

Overview

Python 3.11+ environment with PyTorch 2.4+, Transformers 4.51+, and GPU acceleration (CUDA/NPU/XPU/MPS) for LLM fine-tuning and inference.

Description

This environment provides the core runtime for LLaMA Factory. It requires Python 3.11 or higher and PyTorch 2.4 or higher. The framework supports four accelerator backends (NVIDIA CUDA, Huawei Ascend NPU, Intel XPU, and Apple MPS) and falls back to CPU when none is detected. Device selection is automatic and relies on the availability checks in transformers.utils. HuggingFace Transformers, Datasets, Accelerate, PEFT, and TRL are mandatory dependencies, each constrained to a bounded version range rather than a single pinned release.

Usage

Use this environment for all LLaMA Factory operations: training (SFT, DPO, KTO, PPO, RM, PT), inference (CLI, API, WebUI), evaluation, and model export. This is the mandatory base prerequisite for every workflow in the repository.

System Requirements

Category	Requirement	Notes
OS	Linux (Ubuntu recommended)	Windows not officially supported; macOS via MPS
Python	>= 3.11.0	Supports 3.11, 3.12, 3.13
Hardware	GPU with >= 8GB VRAM	NVIDIA (CUDA), Huawei Ascend (NPU), Intel (XPU), or Apple (MPS)
Disk	>= 20GB SSD	For model weights and dataset caching
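
The Python and disk rows of the table can be checked before installation with a short preflight script. The sketch below uses only the standard library. The function names are illustrative and not part of LLaMA Factory, the disk threshold is interpreted as decimal gigabytes, and GPU memory is not checked because that would require an installed PyTorch build.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)      # minimum interpreter version from the table
MIN_FREE_DISK_GB = 20     # minimum free disk space from the table


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def disk_ok(free_bytes: int, min_gb: int = MIN_FREE_DISK_GB) -> bool:
    """Return True if free_bytes is at least min_gb decimal gigabytes."""
    return free_bytes >= min_gb * 1000**3


if __name__ == "__main__":
    free = shutil.disk_usage(".").free  # free space on the current volume
    print("Python version OK:", python_ok())
    print("Disk space OK:", disk_ok(free))
```

Both functions take their input as an argument, so the thresholds can be tested without depending on the machine running the check.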

Dependencies

Core Packages

torch >= 2.4.0
torchvision >= 0.19.0
torchaudio >= 2.4.0
transformers >= 4.51.0, <= 5.0.0 (excluding 4.52.0 and 4.57.0)
datasets >= 2.16.0, <= 4.0.0
accelerate >= 1.3.0, <= 1.11.0
peft >= 0.18.0, <= 0.18.1
trl >= 0.18.0, <= 0.24.0
torchdata >= 0.10.0, <= 0.11.0
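
To make the range semantics concrete, here is a simplified range check for constraints of the form shown above, including the excluded transformers releases. It is only an illustration: parse_release and in_range are hypothetical helpers, they compare the numeric release segment only and ignore pre-release and local version tags, and the project itself relies on the check_version helper quoted under Code Evidence.

```python
import re


def parse_release(version: str, width: int = 3) -> tuple:
    """Parse the leading numeric part of a version, e.g. '4.51.3' -> (4, 51, 3).

    Shorter versions are zero-padded so that '5.0' compares equal to '5.0.0'.
    """
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    return parts + (0,) * (width - len(parts))


def in_range(version: str, minimum: str, maximum: str, excluded=()) -> bool:
    """Check minimum <= version <= maximum, rejecting any excluded release."""
    current = parse_release(version)
    if current in {parse_release(v) for v in excluded}:
        return False
    return parse_release(minimum) <= current <= parse_release(maximum)


# The transformers constraint from the list above.
TRANSFORMERS = dict(minimum="4.51.0", maximum="5.0.0",
                    excluded=("4.52.0", "4.57.0"))
print(in_range("4.53.2", **TRANSFORMERS))
print(in_range("4.52.0", **TRANSFORMERS))
```

The first call falls inside the range and the second hits an excluded release, so a real checker would reject it even though it lies between the bounds.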

GUI and Visualization

gradio >= 4.38.0, <= 5.50.0
matplotlib >= 3.7.0
tyro < 0.9.0

Operations

einops
numpy
pandas
scipy

Model and Tokenizer

sentencepiece
tiktoken
modelscope
hf-transfer
safetensors

API Server

uvicorn
fastapi
sse-starlette

Python Utilities

av >= 10.0.0, <= 16.0.0
fire
omegaconf
packaging
protobuf
pyyaml
pydantic

Credentials

The following environment variables may be required depending on usage:

HF_TOKEN: HuggingFace API token for accessing gated models and datasets.
USE_MODELSCOPE_HUB: Set to 1 to download models from ModelScope instead of HuggingFace.
USE_OPENMIND_HUB: Set to 1 to download models from OpenMind Hub.
LLAMAFACTORY_VERBOSITY: Logging verbosity level (default: INFO).
DISABLE_VERSION_CHECK: Set to 1 to skip dependency version validation.
ALLOW_EXTRA_ARGS: Set to 1 to allow unrecognized CLI arguments.
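
One way to inspect these variables in a script is sketched below. read_settings is a hypothetical helper, not a LLaMA Factory function, and it assumes that only the literal value "1" enables a flag; the project may accept other truthy spellings.

```python
import os
from typing import Mapping, Optional


def read_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the documented variables from env (defaults to os.environ)."""
    env = os.environ if env is None else env

    def enabled(name: str) -> bool:
        # Assumption: only the literal value "1" counts as enabled.
        return env.get(name, "0") == "1"

    return {
        "hf_token": env.get("HF_TOKEN"),  # None when unset
        "use_modelscope_hub": enabled("USE_MODELSCOPE_HUB"),
        "use_openmind_hub": enabled("USE_OPENMIND_HUB"),
        "verbosity": env.get("LLAMAFACTORY_VERBOSITY", "INFO"),
        "disable_version_check": enabled("DISABLE_VERSION_CHECK"),
        "allow_extra_args": enabled("ALLOW_EXTRA_ARGS"),
    }


settings = read_settings({"USE_MODELSCOPE_HUB": "1"})
print(settings["use_modelscope_hub"], settings["verbosity"])
```

Passing a plain dictionary instead of os.environ keeps the helper easy to test and avoids printing a real HF_TOKEN by accident.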

Quick Install

pip install llamafactory

# Or install from source:
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .

Code Evidence

Dependency version checking from src/llamafactory/extras/misc.py:95-101:

def check_dependencies() -> None:
    r"""Check the version of the required packages."""
    check_version("transformers>=4.51.0,<=5.0.0")
    check_version("datasets>=2.16.0,<=4.0.0")
    check_version("accelerate>=1.3.0,<=1.11.0")
    check_version("peft>=0.18.0,<=0.18.1")
    check_version("trl>=0.18.0,<=0.24.0")

Device auto-detection from src/llamafactory/extras/misc.py:144-157:

def get_current_device() -> "torch.device":
    r"""Get the current available device."""
    if is_torch_xpu_available():
        device = "xpu:{}".format(os.getenv("LOCAL_RANK", "0"))
    elif is_torch_npu_available():
        device = "npu:{}".format(os.getenv("LOCAL_RANK", "0"))
    elif is_torch_mps_available():
        device = "mps:{}".format(os.getenv("LOCAL_RANK", "0"))
    elif is_torch_cuda_available():
        device = "cuda:{}".format(os.getenv("LOCAL_RANK", "0"))
    else:
        device = "cpu"
    return torch.device(device)
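
The excerpt above checks backends in a fixed order: XPU, then NPU, then MPS, then CUDA, with CPU as the fallback. The same priority can be illustrated without PyTorch installed. In the sketch below, pick_device and its availability dictionary are hypothetical stand-ins for the is_torch_*_available checks, and the function returns the device string rather than a torch.device.

```python
from typing import Mapping

# Same order as the if/elif chain in get_current_device.
BACKEND_PRIORITY = ("xpu", "npu", "mps", "cuda")


def pick_device(available: Mapping[str, bool], local_rank: str = "0") -> str:
    """Return '<backend>:<local_rank>' for the first available backend, else 'cpu'."""
    for backend in BACKEND_PRIORITY:
        if available.get(backend, False):
            return f"{backend}:{local_rank}"
    return "cpu"


print(pick_device({"cuda": True}))
print(pick_device({"cuda": True, "npu": True}, local_rank="2"))
print(pick_device({}))
```

When more than one backend reports as available, the earlier entry in the chain wins, which is why the second call selects the NPU rather than CUDA.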

Python version requirement from pyproject.toml:11:

requires-python = ">=3.11.0"

Common Errors

Error Message	Cause	Solution
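`transformers>=4.51.0,<=5.0.0 is required`	Installed transformers version is outside the supported range	`pip install "transformers>=4.51.0,<=5.0.0,!=4.52.0,!=4.57.0"` (the quotes stop the shell from treating `>` and `<` as redirections)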
`Please launch distributed training with llamafactory-cli or torchrun`	Single-process mode without KTransformers	Use `llamafactory-cli train` instead of direct `python`
`Please use FORCE_TORCHRUN=1 to launch DeepSpeed training`	DeepSpeed without torchrun	Set `FORCE_TORCHRUN=1` environment variable
`Please specify max_steps in streaming mode`	Streaming dataset without max_steps	Add `max_steps` to training config
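
For scripted launches, the fixes for the second and third rows can be combined: invoke llamafactory-cli train and set FORCE_TORCHRUN=1 in the child environment. The following is a minimal sketch, assuming the training config is passed as a positional path; build_launch, launch, and the config filename are hypothetical.

```python
import os
import subprocess
from typing import Mapping, Optional


def build_launch(config_path: str, base_env: Optional[Mapping[str, str]] = None):
    """Return (command, env) for a torchrun-forced llamafactory-cli training run."""
    env = dict(os.environ if base_env is None else base_env)
    env["FORCE_TORCHRUN"] = "1"  # fix for the DeepSpeed launch error
    command = ["llamafactory-cli", "train", config_path]
    return command, env


def launch(config_path: str) -> subprocess.CompletedProcess:
    """Run the training command; raises CalledProcessError on a non-zero exit."""
    command, env = build_launch(config_path)
    return subprocess.run(command, env=env, check=True)


# Inspect the command without running it (hypothetical config filename).
command, env = build_launch("train_config.yaml", base_env={})
print(" ".join(command), "| FORCE_TORCHRUN =", env["FORCE_TORCHRUN"])
```

Copying the environment before modifying it leaves the parent process untouched, so the flag applies only to the launched training job.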

Compatibility Notes

NVIDIA CUDA: Primary supported platform. Supports fp16 and bf16 mixed precision.
Huawei Ascend NPU: Supported with torch_npu. JIT compilation is disabled by default (set NPU_JIT_COMPILE=1 to enable). Uses spawn multiprocessing method for vLLM workers.
Intel XPU: Supported via torch.xpu.
Apple MPS: Basic support. bf16 availability depends on the hardware.
CPU: Fallback when no accelerator is detected. No mixed precision support.
