

Environment: ggml-org/llama.cpp CMake Build Environment

From Leeroopedia
Domains: Infrastructure, Build_System
Last updated: 2026-02-14 22:00 GMT

Overview

A CMake-based C/C++ build environment requiring CMake 3.14+, a C11 compiler, and a C++17 compiler to build llama.cpp from source.

Description

This environment provides the core build toolchain for compiling the llama.cpp library and all associated tools (server, quantize, perplexity, embedding, etc.). The build system uses CMake with support for multiple backends that can be enabled at configure time. The default build produces CPU-only binaries; GPU acceleration requires enabling the appropriate backend flag (GGML_CUDA, GGML_METAL, GGML_VULKAN, etc.).
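The full set of backend toggles can be inspected from the CMake cache of an already-configured tree; a minimal sketch, assuming a `build/` directory has been configured once with `cmake -B build`:

```shell
# List cached GGML_* backend options without reconfiguring
# (-N = view mode only, -L = list non-advanced cache entries).
cmake -N -L build | grep '^GGML_'
```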

Usage

Use this environment for any workflow that requires building llama.cpp from source. This includes building the quantization tool, the inference server, CLI tools, and the core library. It is the mandatory prerequisite for the Quantize_CMake_Build and Server_CMake_Build implementations.

System Requirements

  • OS: Linux, macOS, Windows, BSD, Android (cross-platform; see platform-specific notes)
  • CMake: >= 3.14 (required for add_link_options; the CUDA backend needs >= 3.18)
  • C compiler: C11 standard (GCC, Clang, or MSVC)
  • C++ compiler: C++17 standard (GCC, Clang, or MSVC)
  • Disk: ~500 MB (source, build artifacts, and compiled binaries)
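The CMake floor can be checked mechanically before configuring. A minimal sketch using `sort -V` for dotted-version comparison; the `version_ge` helper is illustrative and not part of llama.cpp:

```shell
# version_ge A B: succeed when dotted version A >= B.
version_ge() {
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Compare the installed CMake against the 3.14 minimum, if cmake is present.
if command -v cmake >/dev/null 2>&1; then
  v="$(cmake --version | head -n1 | awk '{print $3}')"
  if version_ge "$v" 3.14; then
    echo "cmake $v meets the 3.14 minimum"
  else
    echo "cmake $v is too old (need >= 3.14)" >&2
  fi
fi
```

The same helper works for the 3.18 CUDA floor or the GCC/Clang versions in Common Errors below.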

Dependencies

System Packages

  • cmake >= 3.14
  • make or ninja (build tool)
  • gcc / g++ or clang / clang++ (C11/C++17 support)
  • git (for cloning the repository)
  • ccache or sccache (optional, auto-detected for faster rebuilds)
  • libssl-dev (optional, for HTTPS/TLS features on Debian/Ubuntu)
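On Debian/Ubuntu the packages above map to roughly the following install line; a sketch, since exact package names vary by distribution and release:

```shell
# Core toolchain plus the optional extras (ccache, libssl-dev) on Debian/Ubuntu.
sudo apt-get update
sudo apt-get install -y cmake ninja-build build-essential git ccache libssl-dev
```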

Optional Backend Dependencies

  • CUDA Toolkit (for GGML_CUDA)
  • Vulkan SDK with glslc (for GGML_VULKAN)
  • ROCm >= 6.1 (for GGML_HIP)
  • Intel oneAPI Base Toolkit (for GGML_SYCL)
  • OpenCL headers and ICD loader (for GGML_OPENCL)
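Whether a backend flag is worth enabling can be probed by checking for its toolchain's entry-point command; a sketch where the `have` helper and the tool-to-flag mapping are illustrative:

```shell
# have CMD: succeed when CMD is on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

have nvcc  && echo "nvcc found:  -DGGML_CUDA=ON is viable"
have glslc && echo "glslc found: -DGGML_VULKAN=ON is viable"
have hipcc && echo "hipcc found: -DGGML_HIP=ON is viable"
have icpx  && echo "icpx found:  -DGGML_SYCL=ON is viable"
true  # absence of a tool should not abort the probe
```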

Credentials

No credentials are required for building from source. HuggingFace tokens are only needed at runtime for model download.

Quick Install

# Clone repository
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

# CPU-only build
cmake -B build
cmake --build build --config Release -j $(nproc)

# CUDA build
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j $(nproc)

# Metal build (macOS, auto-enabled)
cmake -B build
cmake --build build --config Release -j $(sysctl -n hw.logicalcpu)
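Reconfiguring the same `build/` directory with different backend flags can leave stale cache entries; keeping one build tree per backend avoids that. A sketch of the convention, not an upstream requirement:

```shell
# One build tree per backend, so CMake caches never collide.
cmake -B build-cpu
cmake -B build-cuda -DGGML_CUDA=ON
cmake --build build-cpu  --config Release -j"$(nproc)"
cmake --build build-cuda --config Release -j"$(nproc)"
```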

Code Evidence

CMake minimum version requirement from CMakeLists.txt:1:

cmake_minimum_required(VERSION 3.14) # for add_link_options and implicit target directories.
project("llama.cpp" C CXX)

C/C++ standard requirements from ggml/src/CMakeLists.txt:

set(CMAKE_C_STANDARD 11)
set(CMAKE_C_STANDARD_REQUIRED true)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED true)

Backend selection from ggml/CMakeLists.txt:148-176:

option(GGML_CUDA    "ggml: use CUDA"    OFF)
option(GGML_HIP     "ggml: use HIP"     OFF)
option(GGML_VULKAN  "ggml: use Vulkan"  OFF)
option(GGML_SYCL    "ggml: use SYCL"    OFF)
option(GGML_OPENCL  "ggml: use OpenCL"  OFF)

Common Errors

  • "CMake Error: CMake 3.14 or higher is required": CMake version too old. Install CMake >= 3.14 from cmake.org or a package manager.
  • "error: use of undeclared identifier" with C++17 features: compiler too old. Upgrade to GCC >= 7 or Clang >= 5 for C++17 support.
  • "CMAKE_CUDA_ARCHITECTURES must be non-empty": CUDA backend enabled but no GPU architecture detected. Set -DCMAKE_CUDA_ARCHITECTURES=native or specify one explicitly (e.g., 86).
  • "Could NOT find Vulkan": Vulkan SDK not installed. Install the Vulkan SDK from LunarG or the system package manager.
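For the CMAKE_CUDA_ARCHITECTURES error, the architecture can be pinned at configure time; a sketch, where `86` is an example value for Ampere GPUs and should be replaced with the value matching your hardware:

```shell
# Let CMake detect the architecture of the locally installed GPU...
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=native
# ...or pin it explicitly (86 = Ampere, e.g. RTX 30xx cards).
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=86
```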

Compatibility Notes

  • Windows: Requires MSVC or Clang with C++17 support. The /utf-8 and /bigobj flags are automatically added.
  • Windows ARM64: Clang compiler recommended (-DCMAKE_CXX_COMPILER=clang++).
  • macOS: Metal backend is enabled by default. Accelerate framework is auto-linked for BLAS.
  • MinGW: Shared libraries are disabled by default. Static linking recommended.
  • Emscripten: Special WebAssembly build with 64-bit memory support enabled by default.
  • Cross-compilation: GGML_NATIVE auto-detection is disabled when cross-compiling.
