Environment:Hiyouga LLaMA Factory Core Python GPU Environment

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Deep_Learning
Last Updated: 2026-02-06 20:00 GMT

Overview

Python 3.11+ environment with PyTorch 2.4+, Transformers 4.51+, and GPU acceleration (CUDA/NPU/XPU/MPS) for LLM fine-tuning and inference.

Description

This environment provides the core runtime for LLaMA Factory. It requires Python 3.11 or higher and PyTorch 2.4 or higher. The framework supports four accelerator backends: NVIDIA CUDA, Huawei Ascend NPU, Intel XPU, and Apple MPS, with CPU as the fallback. The device is selected automatically through the availability checks in transformers.utils. HuggingFace Transformers, Datasets, Accelerate, PEFT, and TRL are mandatory dependencies, and each is validated against a bounded version range (see Code Evidence).

Usage

Use this environment for all LLaMA Factory operations: training (SFT, DPO, KTO, PPO, RM, PT), inference (CLI, API, WebUI), evaluation, and model export. This is the mandatory base prerequisite for every workflow in the repository.

System Requirements

Category | Requirement | Notes
OS | Linux (Ubuntu recommended) | Windows not officially supported; macOS via MPS
Python | >= 3.11.0 | Supports 3.11, 3.12, 3.13
Hardware | GPU with >= 8 GB VRAM | NVIDIA (CUDA), Huawei Ascend (NPU), Intel (XPU), or Apple (MPS)
Disk | >= 20 GB SSD | For model weights and dataset caching

Dependencies

Core Packages

  • torch >= 2.4.0
  • torchvision >= 0.19.0
  • torchaudio >= 2.4.0
  • transformers >= 4.51.0, <= 5.0.0 (excluding 4.52.0 and 4.57.0)
  • datasets >= 2.16.0, <= 4.0.0
  • accelerate >= 1.3.0, <= 1.11.0
  • peft >= 0.18.0, <= 0.18.1
  • trl >= 0.18.0, <= 0.24.0
  • torchdata >= 0.10.0, <= 0.11.0

GUI and Visualization

  • gradio >= 4.38.0, <= 5.50.0
  • matplotlib >= 3.7.0
  • tyro < 0.9.0

Operations

  • einops
  • numpy
  • pandas
  • scipy

Model and Tokenizer

  • sentencepiece
  • tiktoken
  • modelscope
  • hf-transfer
  • safetensors

API Server

  • uvicorn
  • fastapi
  • sse-starlette

Python Utilities

  • av >= 10.0.0, <= 16.0.0
  • fire
  • omegaconf
  • packaging
  • protobuf
  • pyyaml
  • pydantic

Credentials

The following environment variables may be required depending on usage:

  • HF_TOKEN: HuggingFace API token for accessing gated models and datasets.
  • USE_MODELSCOPE_HUB: Set to 1 to download models from ModelScope instead of HuggingFace.
  • USE_OPENMIND_HUB: Set to 1 to download models from OpenMind Hub.
  • LLAMAFACTORY_VERBOSITY: Logging verbosity level (default: INFO).
  • DISABLE_VERSION_CHECK: Set to 1 to skip dependency version validation.
  • ALLOW_EXTRA_ARGS: Set to 1 to allow unrecognized CLI arguments.
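
As a minimal sketch of how these variables might be set from Python before launching a LLaMA Factory process, the helper below builds an environment mapping. The function name build_llamafactory_env and its parameters are hypothetical and not part of LLaMA Factory; only the variable names and the "set to 1" convention come from the list above.

```python
import os
from typing import Mapping, Optional


def build_llamafactory_env(
    base: Optional[Mapping[str, str]] = None,
    hf_token: Optional[str] = None,
    use_modelscope: bool = False,
    verbosity: Optional[str] = None,
    skip_version_check: bool = False,
) -> dict:
    """Return a copy of `base` (default: os.environ) with the documented variables set.

    Hypothetical helper for illustration; it only assembles a dict and starts nothing.
    """
    env = dict(os.environ if base is None else base)
    if hf_token:
        env["HF_TOKEN"] = hf_token  # token for gated models and datasets
    if use_modelscope:
        env["USE_MODELSCOPE_HUB"] = "1"  # download from ModelScope instead of HuggingFace
    if verbosity:
        env["LLAMAFACTORY_VERBOSITY"] = verbosity  # passed through unchanged
    if skip_version_check:
        env["DISABLE_VERSION_CHECK"] = "1"  # skip dependency version validation
    return env
```

The returned mapping can be passed as the env argument of subprocess.run. Copying the base environment rather than mutating os.environ keeps the settings scoped to the child process.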

Quick Install

pip install llamafactory

# Or install from source:
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .

Code Evidence

Dependency version checking from src/llamafactory/extras/misc.py:95-101:

def check_dependencies() -> None:
    r"""Check the version of the required packages."""
    check_version("transformers>=4.51.0,<=5.0.0")
    check_version("datasets>=2.16.0,<=4.0.0")
    check_version("accelerate>=1.3.0,<=1.11.0")
    check_version("peft>=0.18.0,<=0.18.1")
    check_version("trl>=0.18.0,<=0.24.0")
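
The excerpt does not show how check_version is implemented. As an independent sketch, the same kind of range check can be written with the packaging library, which appears in the dependency list above. The helpers satisfies and check_installed are hypothetical names and are not LLaMA Factory functions. The requirement string in the usage note adds the two excluded transformers releases from the Core Packages list.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

from packaging.requirements import Requirement
from packaging.version import Version


def satisfies(installed: str, requirement: str) -> bool:
    """Return True if the version string `installed` falls inside `requirement`."""
    req = Requirement(requirement)
    # prereleases=True so that dev/rc builds are compared rather than rejected outright.
    return req.specifier.contains(Version(installed), prereleases=True)


def check_installed(requirement: str) -> Optional[bool]:
    """Check the installed package; return None if it is not installed at all."""
    name = Requirement(requirement).name
    try:
        return satisfies(version(name), requirement)
    except PackageNotFoundError:
        return None
```

For example, satisfies("4.52.0", "transformers>=4.51.0,<=5.0.0,!=4.52.0,!=4.57.0") returns False because that release is explicitly excluded, even though it lies inside the numeric range.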

Device auto-detection from src/llamafactory/extras/misc.py:144-157:

def get_current_device() -> "torch.device":
    r"""Get the current available device."""
    if is_torch_xpu_available():
        device = "xpu:{}".format(os.getenv("LOCAL_RANK", "0"))
    elif is_torch_npu_available():
        device = "npu:{}".format(os.getenv("LOCAL_RANK", "0"))
    elif is_torch_mps_available():
        device = "mps:{}".format(os.getenv("LOCAL_RANK", "0"))
    elif is_torch_cuda_available():
        device = "cuda:{}".format(os.getenv("LOCAL_RANK", "0"))
    else:
        device = "cpu"
    return torch.device(device)
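
The order of the branches in the excerpt matters: XPU is checked first and CUDA last, and the device index comes from LOCAL_RANK. The sketch below reproduces only that precedence logic as a pure function, so it can be reasoned about without torch or any accelerator installed. pick_device_string and the availability mapping are hypothetical stand-ins for the real detection functions.

```python
import os
from typing import Mapping, Optional

# Backend order copied from the excerpt: XPU, then NPU, then MPS, then CUDA.
PRIORITY = ("xpu", "npu", "mps", "cuda")


def pick_device_string(
    available: Mapping[str, bool], local_rank: Optional[str] = None
) -> str:
    """Return the device string for the first available backend, or "cpu"."""
    # Fall back to the LOCAL_RANK environment variable, defaulting to "0".
    rank = local_rank if local_rank is not None else os.getenv("LOCAL_RANK", "0")
    for backend in PRIORITY:
        if available.get(backend, False):
            return f"{backend}:{rank}"
    return "cpu"
```

Under this ordering, a host that reports both XPU and CUDA as available resolves to the XPU device.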

Python version requirement from pyproject.toml:11:

requires-python = ">=3.11.0"
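
pip enforces requires-python at install time. A script that wants to fail early with a clear message could apply the same bound itself. This is a minimal sketch; python_is_supported and MIN_PYTHON are illustrative names, not part of the project.

```python
import sys
from typing import Sequence

# Lower bound taken from requires-python = ">=3.11.0".
MIN_PYTHON = (3, 11, 0)


def python_is_supported(version_info: Sequence[int] = sys.version_info[:3]) -> bool:
    """Return True if the (major, minor, micro) triple meets the minimum version."""
    return tuple(version_info[:3]) >= MIN_PYTHON
```

Calling python_is_supported() with no arguments checks the running interpreter. Passing an explicit triple such as (3, 10, 14) makes the function easy to test.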

Common Errors

Error Message | Cause | Solution
transformers>=4.51.0,<=5.0.0 is required | Installed transformers version is outside the supported range | pip install "transformers>=4.51.0,<=5.0.0,!=4.52.0,!=4.57.0" (quote the specifier so the shell does not treat > and < as redirections)
Please launch distributed training with llamafactory-cli or torchrun | Single-process mode without KTransformers | Launch with llamafactory-cli train (or torchrun) instead of invoking python directly
Please use FORCE_TORCHRUN=1 to launch DeepSpeed training | DeepSpeed without torchrun | Set the FORCE_TORCHRUN=1 environment variable
Please specify max_steps in streaming mode | Streaming dataset without max_steps | Add max_steps to the training config
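
For the DeepSpeed case in the table above, a launcher script might assemble the command and environment as sketched below. deepspeed_launch_spec is a hypothetical helper, and passing a config file path to llamafactory-cli train is an assumption about how the command is invoked. The helper only builds the arguments and does not start anything.

```python
import os
from typing import List, Mapping, Optional, Tuple


def deepspeed_launch_spec(
    config_path: str, base_env: Optional[Mapping[str, str]] = None
) -> Tuple[List[str], dict]:
    """Return (command, environment) for a training run with FORCE_TORCHRUN=1."""
    env = dict(os.environ if base_env is None else base_env)
    env["FORCE_TORCHRUN"] = "1"  # required by the DeepSpeed error message above
    cmd = ["llamafactory-cli", "train", config_path]  # config_path is a placeholder
    return cmd, env
```

The pair can then be handed to subprocess.run(cmd, env=env, check=True). Using an argument list instead of a shell string avoids quoting problems with paths.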

Compatibility Notes

  • NVIDIA CUDA: Primary supported platform. Supports fp16 and bf16 mixed precision.
  • Huawei Ascend NPU: Supported with torch_npu. JIT compilation is disabled by default (set NPU_JIT_COMPILE=1 to enable). Uses spawn multiprocessing method for vLLM workers.
  • Intel XPU: Supported via torch.xpu.
  • Apple MPS: Basic support. bf16 availability depends on the hardware.
  • CPU: Fallback when no accelerator is detected. No mixed precision support.
