Environment: ggml-org/llama.cpp CMake Build Environment
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Build_System |
| Last Updated | 2026-02-14 22:00 GMT |
Overview
CMake-based C/C++ build environment requiring CMake 3.14+, a C11 compiler, and a C++17 compiler for building llama.cpp from source.
Description
This environment provides the core build toolchain for compiling the llama.cpp library and all associated tools (server, quantize, perplexity, embedding, etc.). The build system uses CMake with support for multiple backends that can be enabled at configure time. The default build produces CPU-only binaries; GPU acceleration requires enabling the appropriate backend flag (GGML_CUDA, GGML_METAL, GGML_VULKAN, etc.).
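To see which backends a configured build tree has enabled, the GGML options can be listed from the CMake cache; this is plain CMake, shown here as a sketch against an existing build directory named build:

```sh
# List GGML_* options and their current values from an existing build
# directory; -L prints cache entries, -H adds their help strings.
cmake -LH build | grep -B1 '^GGML_'
```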
Usage
Use this environment for any workflow that requires building llama.cpp from source. This includes building the quantization tool, the inference server, CLI tools, and the core library. It is the mandatory prerequisite for the Quantize_CMake_Build and Server_CMake_Build implementations.
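Individual tools can be built without compiling everything by passing --target. A minimal sketch, assuming a recent checkout where the targets are named llama-quantize and llama-server (older trees used plain quantize and server):

```sh
# Configure once, then build only the targets you need.
cmake -B build
cmake --build build --config Release --target llama-quantize -j $(nproc)
cmake --build build --config Release --target llama-server -j $(nproc)
```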
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, Windows, BSD, Android | Cross-platform; see platform-specific notes |
| CMake | >= 3.14 | Required for add_link_options; CUDA backend needs >= 3.18 |
| C Compiler | C11 standard | GCC, Clang, or MSVC |
| C++ Compiler | C++17 standard | GCC, Clang, or MSVC |
| Disk | ~500 MB | For source, build artifacts, and compiled binaries |
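A quick way to verify the toolchain against these requirements before configuring (output will vary by platform):

```sh
# Sanity-check the build prerequisites listed above.
cmake --version   # expect >= 3.14 (>= 3.18 for the CUDA backend)
cc --version      # C compiler with C11 support
c++ --version     # C++ compiler with C++17 support (GCC >= 7 / Clang >= 5)
git --version
```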
Dependencies
System Packages
- cmake >= 3.14
- make or ninja (build tool)
- gcc/g++ or clang/clang++ (C11/C++17 support)
- git (for cloning the repository)
- ccache or sccache (optional, auto-detected for faster rebuilds)
- libssl-dev (optional, for HTTPS/TLS features on Debian/Ubuntu)
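On Debian/Ubuntu, one way to install all of the above in a single step (package names are distribution-specific; build-essential provides gcc/g++ and make):

```sh
sudo apt update
sudo apt install -y cmake build-essential git ccache libssl-dev ninja-build
```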
Optional Backend Dependencies
- CUDA Toolkit (for GGML_CUDA)
- Vulkan SDK with glslc (for GGML_VULKAN)
- ROCm >= 6.1 (for GGML_HIP)
- Intel oneAPI Base Toolkit (for GGML_SYCL)
- OpenCL headers and ICD loader (for GGML_OPENCL)
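Before enabling a GPU backend it is worth confirming the matching toolchain is on PATH, since a missing SDK only surfaces as a configure-time error. A quick spot-check:

```sh
nvcc --version    # CUDA Toolkit (GGML_CUDA)
glslc --version   # Vulkan shader compiler (GGML_VULKAN)
hipcc --version   # ROCm (GGML_HIP)
icpx --version    # Intel oneAPI DPC++ compiler (GGML_SYCL)
```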
Credentials
No credentials are required for building from source. HuggingFace tokens are only needed at runtime for model download.
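For completeness, a runtime-only sketch of supplying a token when downloading a gated model; the HF_TOKEN variable and --hf-repo/--hf-file flags match current llama.cpp tools but should be treated as version-dependent:

```sh
# Not needed for the build itself; only for gated model downloads at runtime.
export HF_TOKEN=<your-token>
./build/bin/llama-cli --hf-repo <org>/<repo> --hf-file <model>.gguf -p "hello"
```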
Quick Install
```sh
# Clone repository
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

# CPU-only build
cmake -B build
cmake --build build --config Release -j $(nproc)

# CUDA build
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j $(nproc)

# Metal build (macOS, auto-enabled)
cmake -B build
cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)
```
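After a successful build, binaries land in build/bin. A minimal smoke test (binary names assume a recent checkout):

```sh
./build/bin/llama-cli --version
./build/bin/llama-server --help | head -n 5
```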
Code Evidence
CMake minimum version requirement from CMakeLists.txt:1:
```cmake
cmake_minimum_required(VERSION 3.14) # for add_link_options and implicit target directories.
project("llama.cpp" C CXX)
```
C/C++ standard requirements from ggml/src/CMakeLists.txt:
```cmake
set(CMAKE_C_STANDARD 11)
set(CMAKE_C_STANDARD_REQUIRED true)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED true)
```
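If C++17 support is in doubt, a standalone smoke test (hypothetical, not part of the build) fails fast before a long configure run:

```sh
# Compile a C++17-only construct (structured bindings + CTAD); failure
# here means the compiler cannot build llama.cpp.
cat > /tmp/cxx17_check.cpp <<'EOF'
#include <tuple>
int main() {
    auto [a, b] = std::tuple{1, 2}; // requires C++17
    return (a + b == 3) ? 0 : 1;
}
EOF
c++ -std=c++17 /tmp/cxx17_check.cpp -o /tmp/cxx17_check && /tmp/cxx17_check && echo "C++17 OK"
```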
Backend selection from ggml/CMakeLists.txt:148-176:
```cmake
option(GGML_CUDA   "ggml: use CUDA"   OFF)
option(GGML_HIP    "ggml: use HIP"    OFF)
option(GGML_VULKAN "ggml: use Vulkan" OFF)
option(GGML_SYCL   "ggml: use SYCL"   OFF)
option(GGML_OPENCL "ggml: use OpenCL" OFF)
```
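Because these are ordinary CMake options, a backend can be toggled on an existing build directory and the tree reconfigured in place:

```sh
cmake -B build -DGGML_VULKAN=ON   # reconfigure with Vulkan enabled
cmake --build build --config Release -j
```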
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `CMake Error: CMake 3.14 or higher is required` | CMake version too old | Install CMake >= 3.14 from cmake.org or a package manager |
| `error: use of undeclared identifier` with C++17 features | Compiler too old | Upgrade to GCC >= 7 or Clang >= 5 for C++17 support |
| `CMAKE_CUDA_ARCHITECTURES must be non-empty` | CUDA backend enabled but no GPU architecture detected | Set `-DCMAKE_CUDA_ARCHITECTURES=native` or specify it explicitly (e.g., `86`) |
| `Could NOT find Vulkan` | Vulkan SDK not installed | Install the Vulkan SDK from LunarG or a system package manager |
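As a concrete example of the CUDA architecture fix, targeting compute capability 8.6 (an RTX 30-series GPU, chosen here purely for illustration):

```sh
# 'native' requires CMake >= 3.24 to auto-detect the installed GPU;
# an explicit value such as 86 also works on older CMake versions.
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=86
```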
Compatibility Notes
- Windows: Requires MSVC or Clang with C++17 support. The `/utf-8` and `/bigobj` flags are automatically added.
- Windows ARM64: Clang compiler recommended (`-DCMAKE_CXX_COMPILER=clang++`).
- macOS: Metal backend is enabled by default. Accelerate framework is auto-linked for BLAS.
- MinGW: Shared libraries are disabled by default. Static linking recommended.
- Emscripten: Special WebAssembly build with 64-bit memory support enabled by default.
- Cross-compilation: `GGML_NATIVE` auto-detection is disabled when cross-compiling.
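Related to the cross-compilation note: for binaries meant to run on machines other than the build host, disabling native CPU tuning keeps the default CPU backend portable:

```sh
# GGML_NATIVE=OFF avoids -march=native style tuning for the host CPU.
cmake -B build -DGGML_NATIVE=OFF
cmake --build build --config Release -j $(nproc)
```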