

Environment: ggml-org/llama.cpp CMake Build Environment

From Leeroopedia
Domains: Infrastructure, Build_System
Last updated: 2026-02-14 22:00 GMT

Overview

A CMake-based C/C++ build environment requiring CMake 3.14+, a C11 compiler, and a C++17 compiler to build llama.cpp from source.

Description

This environment provides the core build toolchain for compiling the llama.cpp library and all associated tools (server, quantize, perplexity, embedding, etc.). The build system uses CMake with support for multiple backends that can be enabled at configure time. The default build produces CPU-only binaries; GPU acceleration requires enabling the appropriate backend flag (GGML_CUDA, GGML_METAL, GGML_VULKAN, etc.).
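The full set of backend toggles can be inspected from the CMake cache of an already-configured tree; a minimal sketch, assuming a `build/` directory has been configured once with `cmake -B build`:

```shell
# List cached GGML_* backend options without reconfiguring
# (-N = view mode only, -L = list non-advanced cache entries).
cmake -N -L build | grep '^GGML_'
```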

Usage

Use this environment for any workflow that requires building llama.cpp from source. This includes building the quantization tool, the inference server, CLI tools, and the core library. It is the mandatory prerequisite for the Quantize_CMake_Build and Server_CMake_Build implementations.

System Requirements

  • OS: Linux, macOS, Windows, BSD, Android (cross-platform; see platform-specific notes)
  • CMake: >= 3.14 (required for add_link_options; the CUDA backend needs >= 3.18)
  • C compiler: C11 standard (GCC, Clang, or MSVC)
  • C++ compiler: C++17 standard (GCC, Clang, or MSVC)
  • Disk: ~500 MB (source, build artifacts, and compiled binaries)
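The CMake floor can be checked mechanically before configuring. A minimal sketch using `sort -V` for dotted-version comparison; the `version_ge` helper is illustrative and not part of llama.cpp:

```shell
# version_ge A B: succeed when dotted version A >= B.
version_ge() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Compare the installed CMake against the 3.14 minimum, if cmake is present.
if command -v cmake >/dev/null 2>&1; then
  v="$(cmake --version | head -n1 | awk '{print $3}')"
  if version_ge "$v" 3.14; then
    echo "cmake $v meets the 3.14 minimum"
  else
    echo "cmake $v is too old (need >= 3.14)" >&2
  fi
fi
```

The same helper works for the 3.18 CUDA floor or the GCC/Clang versions in Common Errors below.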

Dependencies

System Packages

  • cmake >= 3.14
  • make or ninja (build tool)
  • gcc / g++ or clang / clang++ (C11/C++17 support)
  • git (for cloning the repository)
  • ccache or sccache (optional, auto-detected for faster rebuilds)
  • libssl-dev (optional, for HTTPS/TLS features on Debian/Ubuntu)
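On Debian/Ubuntu the packages above map to roughly the following install line; a sketch, since exact package names vary by distribution and release:

```shell
# Core toolchain plus the optional extras (ccache, libssl-dev) on Debian/Ubuntu.
sudo apt-get update
sudo apt-get install -y cmake ninja-build build-essential git ccache libssl-dev
```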

Optional Backend Dependencies

  • CUDA Toolkit (for GGML_CUDA)
  • Vulkan SDK with glslc (for GGML_VULKAN)
  • ROCm >= 6.1 (for GGML_HIP)
  • Intel oneAPI Base Toolkit (for GGML_SYCL)
  • OpenCL headers and ICD loader (for GGML_OPENCL)
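Whether a backend flag is worth enabling can be probed by checking for its toolchain's entry-point command; a sketch where the `have` helper and the tool-to-flag mapping are illustrative:

```shell
# have CMD: succeed when CMD is on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

have nvcc  && echo "nvcc found:  -DGGML_CUDA=ON is viable"
have glslc && echo "glslc found: -DGGML_VULKAN=ON is viable"
have hipcc && echo "hipcc found: -DGGML_HIP=ON is viable"
have icpx  && echo "icpx found:  -DGGML_SYCL=ON is viable"
true  # absence of a tool should not abort the probe
```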

Credentials

No credentials are required for building from source. HuggingFace tokens are only needed at runtime for model download.

Quick Install

# Clone repository
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

# CPU-only build
cmake -B build
cmake --build build --config Release -j $(nproc)

# CUDA build
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j $(nproc)

# Metal build (macOS, auto-enabled)
cmake -B build
cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)
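Reconfiguring the same `build/` directory with different backend flags can leave stale cache entries; keeping one build tree per backend avoids that. A sketch of the convention, not an upstream requirement:

```shell
# One build tree per backend, so CMake caches never collide.
cmake -B build-cpu
cmake -B build-cuda -DGGML_CUDA=ON
cmake --build build-cpu  --config Release -j"$(nproc)"
cmake --build build-cuda --config Release -j"$(nproc)"
```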

Code Evidence

CMake minimum version requirement from CMakeLists.txt:1:

cmake_minimum_required(VERSION 3.14) # for add_link_options and implicit target directories.
project("llama.cpp" C CXX)

C/C++ standard requirements from ggml/src/CMakeLists.txt:

set(CMAKE_C_STANDARD 11)
set(CMAKE_C_STANDARD_REQUIRED true)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED true)

Backend selection from ggml/CMakeLists.txt:148-176:

option(GGML_CUDA    "ggml: use CUDA"    OFF)
option(GGML_HIP     "ggml: use HIP"     OFF)
option(GGML_VULKAN  "ggml: use Vulkan"  OFF)
option(GGML_SYCL    "ggml: use SYCL"    OFF)
option(GGML_OPENCL  "ggml: use OpenCL"  OFF)

Common Errors

  • "CMake Error: CMake 3.14 or higher is required": CMake version too old. Install CMake >= 3.14 from cmake.org or a package manager.
  • "error: use of undeclared identifier" with C++17 features: compiler too old. Upgrade to GCC >= 7 or Clang >= 5 for C++17 support.
  • "CMAKE_CUDA_ARCHITECTURES must be non-empty": CUDA backend enabled but no GPU architecture detected. Set -DCMAKE_CUDA_ARCHITECTURES=native or specify one explicitly (e.g., 86).
  • "Could NOT find Vulkan": Vulkan SDK not installed. Install the Vulkan SDK from LunarG or the system package manager.
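For the CMAKE_CUDA_ARCHITECTURES error, the architecture can be pinned at configure time; a sketch, where `86` is an example value for Ampere GPUs and should be replaced with the value matching your hardware:

```shell
# Let CMake detect the architecture of the locally installed GPU...
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=native
# ...or pin it explicitly (86 = Ampere, e.g. RTX 30xx cards).
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=86
```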

Compatibility Notes

  • Windows: Requires MSVC or Clang with C++17 support. The /utf-8 and /bigobj flags are automatically added.
  • Windows ARM64: Clang compiler recommended (-DCMAKE_CXX_COMPILER=clang++).
  • macOS: Metal backend is enabled by default. Accelerate framework is auto-linked for BLAS.
  • MinGW: Shared libraries are disabled by default. Static linking recommended.
  • Emscripten: Special WebAssembly build with 64-bit memory support enabled by default.
  • Cross-compilation: GGML_NATIVE auto-detection is disabled when cross-compiling.
