
Environment: ggml-org/llama.cpp Python Conversion Environment

From Leeroopedia
Knowledge Sources
Domains Infrastructure, Model_Conversion
Last Updated 2026-02-14 22:00 GMT

Overview

A Python 3.9+ environment with PyTorch, transformers, sentencepiece, and the gguf library, used for converting HuggingFace models to GGUF format and for converting LoRA adapters.

Description

This environment provides the Python runtime and package dependencies required for all model conversion scripts in llama.cpp. The primary scripts are convert_hf_to_gguf.py (11,934 lines, handles 100+ model architectures) and convert_lora_to_gguf.py (493 lines, converts LoRA adapters). The environment requires PyTorch for loading safetensors/bin model weights and transformers for model configuration parsing.

Usage

Use this environment for the HF-to-GGUF Model Conversion workflow and the LoRA Adapter Workflow. It is the mandatory prerequisite for running convert_hf_to_gguf.py, convert_lora_to_gguf.py, model inspection scripts, and logit comparison tools.

System Requirements

  • OS: Linux, macOS, Windows (Python scripts are cross-platform)
  • Python: >= 3.9 (defined in pyproject.toml)
  • RAM: 2x model size (models are loaded into RAM during conversion)
  • Disk: 2x model size (source model + output GGUF file)
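The 2x sizing guidance above can be sketched with simple arithmetic. The function name and the fp16 (2 bytes per parameter) assumption are illustrative, not from llama.cpp:

```python
# Rough sizing sketch: a model's in-memory footprint is approximately
# parameter_count * bytes_per_parameter; this page's guidance is to
# budget about 2x that for both RAM and disk.
def estimated_requirement_gib(param_count, bytes_per_param=2, factor=2):
    """Approximate RAM/disk budget in GiB, using the 2x-model-size rule."""
    model_bytes = param_count * bytes_per_param
    return factor * model_bytes / 2**30

# A 7B-parameter model stored in fp16 (2 bytes/param):
print(round(estimated_requirement_gib(7_000_000_000), 1))  # → 26.1
```

So a 7B fp16 model (~13 GiB on disk) calls for roughly 26 GiB of free RAM and disk before starting a conversion.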

Dependencies

System Packages

  • python3 >= 3.9
  • pip (Python package manager)

Python Packages

  • numpy ~= 1.26.4
  • sentencepiece >= 0.1.98, < 0.3.0
  • transformers >= 4.57.1, < 5.0.0
  • gguf >= 0.1.0
  • protobuf >= 4.21.0, < 5.0.0
  • torch ~= 2.6.0 (standard platforms)
  • torch >= 0.0.0.dev0 (s390x nightly builds only)
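The ~= ("compatible release") operator used in the numpy and torch pins can be approximated as below. This is an illustrative sketch for plain X.Y.Z versions, not pip's actual resolver, which also handles pre-releases, epochs, and wildcards:

```python
# Minimal illustration of PEP 440's "compatible release" (~=) semantics:
# ~=X.Y.Z means >=X.Y.Z while staying within the X.Y release series.
def parse(version):
    return tuple(int(part) for part in version.split("."))

def satisfies_compatible(version, pin):
    """True if `version` satisfies `~= pin` (simple X.Y.Z versions only)."""
    v, p = parse(version), parse(pin)
    return v >= p and v[: len(p) - 1] == p[: len(p) - 1]

print(satisfies_compatible("1.26.4", "1.26.4"))  # → True
print(satisfies_compatible("1.26.9", "1.26.4"))  # → True
print(satisfies_compatible("1.27.0", "1.26.4"))  # → False
```

So numpy~=1.26.4 accepts any 1.26.x at or above 1.26.4, and torch~=2.6.0 accepts 2.6.x but not 2.7.0.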

Credentials

The following environment variables may be needed for accessing gated or private models:

  • HF_TOKEN: HuggingFace API token (Read access) for downloading gated models like Llama
  • HUGGINGFACE_HUB_TOKEN: Alternative to HF_TOKEN (legacy compatibility)
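A minimal sketch of reading these variables with the legacy fallback. The helper name hf_auth_headers is ours, though the header shape mirrors the gguf-py code quoted under Code Evidence:

```python
import os

# Read HF_TOKEN first, falling back to the legacy HUGGINGFACE_HUB_TOKEN.
# hf_auth_headers is an illustrative helper, not an API from llama.cpp.
def hf_auth_headers():
    token = os.environ.get("HF_TOKEN") or os.environ.get("HUGGINGFACE_HUB_TOKEN")
    return {"Authorization": f"Bearer {token}"} if token else {}

os.environ["HF_TOKEN"] = "hf_example_token"  # placeholder, not a real token
print(hf_auth_headers())  # → {'Authorization': 'Bearer hf_example_token'}
```

With neither variable set, the helper returns an empty dict and requests proceed unauthenticated, which is enough for public models but not gated ones.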

Quick Install

# Install all required packages for model conversion
pip install numpy~=1.26.4 "sentencepiece>=0.1.98,<0.3.0" "transformers>=4.57.1,<5.0.0" "gguf>=0.1.0" "protobuf>=4.21.0,<5.0.0"

# Install PyTorch (CPU-only, sufficient for conversion)
pip install torch --index-url https://download.pytorch.org/whl/cpu

# Or install everything from requirements file
pip install -r requirements/requirements-convert_hf_to_gguf.txt

Code Evidence

Python version requirement from pyproject.toml:8:

[tool.poetry.dependencies]
python = ">=3.9"
numpy = "^1.25.0"

Core Python dependencies from requirements/requirements-convert_legacy_llama.txt:1-7:

numpy~=1.26.4
sentencepiece>=0.1.98,<0.3.0
transformers>=4.57.1,<5.0.0
gguf>=0.1.0
protobuf>=4.21.0,<5.0.0

Platform-specific PyTorch from requirements/requirements-convert_hf_to_gguf.txt:4-9:

## Embedding Gemma requires PyTorch 2.6.0 or later
torch~=2.6.0; platform_machine != "s390x"
# torch s390x packages can only be found from nightly builds
torch>=0.0.0.dev0; platform_machine == "s390x"
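pip evaluates these platform_machine environment markers at install time; a rough sketch of the same decision in Python:

```python
import platform

# pip resolves the marker against the running interpreter's machine type,
# e.g. "x86_64", "arm64", or "s390x".
machine = platform.machine()
torch_requirement = (
    "torch>=0.0.0.dev0" if machine == "s390x" else "torch~=2.6.0"
)
print(torch_requirement)
```

On anything other than IBM Z hardware this selects the stable torch~=2.6.0 pin.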

HuggingFace token usage from gguf-py/gguf/utility.py:268-269:

token = os.environ.get("HF_TOKEN")
headers = {"Authorization": f"Bearer {token}"} if token else {}

Common Errors

  • ModuleNotFoundError: No module named 'transformers'. Cause: missing Python dependencies. Solution: run pip install -r requirements/requirements-convert_hf_to_gguf.txt
  • ImportError: protobuf. Cause: protobuf version mismatch. Solution: set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python and reinstall protobuf
  • torch not compiled with CUDA. Cause: wrong PyTorch build. Solution: CPU-only PyTorch is sufficient for conversion; ignore this warning
  • Access denied on gated model. Cause: missing HuggingFace token. Solution: set the HF_TOKEN env var with a token that has access to the model
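A stdlib-only preflight check (our own sketch, not part of llama.cpp) can catch the ModuleNotFoundError case before launching the converter:

```python
# Check that the modules the conversion scripts import are installed.
# Uses only the standard library, so it runs even in a bare environment.
import importlib.util

REQUIRED = ["numpy", "sentencepiece", "transformers", "gguf", "torch"]

def missing_modules(names=REQUIRED):
    """Return the subset of `names` that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

missing = missing_modules()
if missing:
    print("Missing (run the Quick Install above):", ", ".join(missing))
else:
    print("All conversion dependencies found.")
```

Running this before convert_hf_to_gguf.py turns a mid-conversion traceback into an actionable one-line report.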

Compatibility Notes

  • s390x (IBM Z): Requires nightly PyTorch builds from https://download.pytorch.org/whl/nightly. Standard PyTorch wheels are not available.
  • CPU-only PyTorch: Sufficient for all conversion tasks. GPU PyTorch is not needed for converting models to GGUF.
  • NO_LOCAL_GGUF: Setting this environment variable skips the local gguf package import and uses the pip-installed version instead.
  • MODEL_ENDPOINT: Defaults to https://huggingface.co/. Can be overridden to point to alternative model repositories.
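A minimal sketch of the MODEL_ENDPOINT lookup with its documented default; the helper name is ours:

```python
import os

# MODEL_ENDPOINT overrides where model files are fetched from; when unset,
# the documented default https://huggingface.co/ is used.
def model_endpoint():
    return os.environ.get("MODEL_ENDPOINT", "https://huggingface.co/")

print(model_endpoint())  # the documented default if the variable is unset
os.environ["MODEL_ENDPOINT"] = "https://mirror.example.org/"
print(model_endpoint())  # now the override
```

Pointing MODEL_ENDPOINT at an internal mirror lets air-gapped or rate-limited environments reuse the same conversion scripts unchanged.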
