Environment: ggml-org llama.cpp Vulkan GPU Environment

Knowledge Sources	llama.cpp Build Documentation
Domains	Infrastructure, GPU_Acceleration
Last Updated	2026-02-14 22:00 GMT

Overview

Cross-platform Vulkan environment for GPU-accelerated inference on NVIDIA, AMD, and Intel GPUs. Building it requires the Vulkan SDK, including the glslc shader compiler.

Description

This environment enables GPU-accelerated inference via the Vulkan graphics/compute API. Unlike CUDA (NVIDIA-only) or Metal (Apple-only), Vulkan is cross-platform and supports GPUs from multiple vendors. The backend compiles GLSL compute shaders using glslc and supports cooperative matrix extensions for tensor operations. It is a good alternative when CUDA is not available (e.g., AMD GPUs on Linux/Windows).

Usage

Use this environment for GPU-accelerated inference on non-Apple, non-NVIDIA hardware, or whenever a vendor-neutral backend is preferred. Enable it by configuring the build with -DGGML_VULKAN=ON. It works on Linux and Windows with any GPU whose driver supports Vulkan 1.1 or later.

System Requirements

Category	Requirement	Notes
OS	Linux or Windows	macOS uses Metal instead
GPU	Any Vulkan 1.1+ compatible GPU	NVIDIA, AMD, or Intel
Vulkan SDK	LunarG Vulkan SDK with glslc	Version 1.3.283.0+ recommended
CMake	>= 3.19	For Vulkan shader compilation support

Dependencies

System Packages

Vulkan SDK (from LunarG or system package manager)
glslc shader compiler (included in Vulkan SDK)
Vulkan ICD loader and driver for your GPU
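The dependencies above can be checked before configuring the build. The following is a minimal preflight sketch, not part of llama.cpp: the helper names version_ge, check_tool, and preflight are hypothetical, the version comparison assumes GNU sort with -V support, and vulkaninfo is assumed to be installed separately (it usually ships with the SDK or a vulkan-tools package).

```shell
#!/usr/bin/env bash
# Preflight sketch: report which Vulkan build prerequisites are on PATH.
# All function names here are illustrative, not part of llama.cpp.

# version_ge A B: succeeds when dotted version A >= B (uses GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_tool NAME: print "ok" or "missing" for an executable on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok       $1"
    else
        echo "missing  $1"
    fi
}

# preflight: check the tools listed under System Packages and the CMake minimum.
preflight() {
    check_tool cmake
    check_tool glslc
    check_tool vulkaninfo   # optional diagnostic tool, not needed for the build
    if command -v cmake >/dev/null 2>&1; then
        # First line of `cmake --version` reads "cmake version X.Y.Z".
        cmake_have="$(cmake --version | head -n1 | awk '{print $3}')"
        if version_ge "$cmake_have" "3.19"; then
            echo "ok       cmake $cmake_have >= 3.19"
        else
            echo "too old  cmake $cmake_have < 3.19"
        fi
    fi
}

preflight
```

A "missing glslc" line here corresponds to the `glslc not found` configure error listed under Common Errors.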

Credentials

No credentials are required.

Quick Install

# Install Vulkan SDK (Ubuntu 22.04 "jammy"; use the matching list file for other releases)
wget -qO- https://packages.lunarg.com/lunarg-signing-key-pub.asc | sudo tee /etc/apt/trusted.gpg.d/lunarg.asc > /dev/null
sudo wget -qO /etc/apt/sources.list.d/lunarg-vulkan-jammy.list https://packages.lunarg.com/vulkan/lunarg-vulkan-jammy.list
sudo apt update && sudo apt install vulkan-sdk

# Build with Vulkan support
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j $(nproc)

Code Evidence

Vulkan backend configuration from ggml/src/ggml-vulkan/CMakeLists.txt:

find_package(Vulkan COMPONENTS glslc REQUIRED)
option(GGML_VULKAN              "ggml: use Vulkan"             OFF)
option(GGML_VULKAN_CHECK_RESULTS "ggml: run Vulkan op checks"  OFF)
option(GGML_VULKAN_DEBUG        "ggml: enable Vulkan debug"    OFF)
option(GGML_VULKAN_VALIDATE     "ggml: enable Vulkan validate" OFF)

Shader extension support from ggml/src/ggml-vulkan/CMakeLists.txt:

GL_KHR_cooperative_matrix
GL_NV_cooperative_matrix2
GL_EXT_integer_dot_product
GL_EXT_bfloat16

Runtime environment variables from ggml/src/ggml-vulkan/ggml-vulkan.cpp:

// Device selection
getenv("GGML_VK_VISIBLE_DEVICES")
// Memory configuration
getenv("GGML_VK_PREFER_HOST_MEMORY")
getenv("GGML_VK_DISABLE_HOST_VISIBLE_VIDMEM")
getenv("GGML_VK_ALLOW_SYSMEM_FALLBACK")
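Because these variables are read with getenv at runtime, they can be set per process rather than exported globally. The sketch below shows one way to scope GGML_VK_VISIBLE_DEVICES to a single command. The wrapper name run_with_vk_devices is hypothetical, and the value format (a comma-separated list of device indices) is an assumption, since the excerpt above shows only that the variable is read.

```shell
#!/usr/bin/env bash
# Sketch: scope Vulkan device selection to a single command.
# run_with_vk_devices is an illustrative helper, not part of llama.cpp.

# run_with_vk_devices DEVICES CMD [ARGS...]
# Runs CMD with GGML_VK_VISIBLE_DEVICES=DEVICES set only for that process.
run_with_vk_devices() {
    _vk_devices="$1"
    shift
    GGML_VK_VISIBLE_DEVICES="$_vk_devices" "$@"
}

# Demonstration with a child shell that echoes what it received.
run_with_vk_devices 0 sh -c 'echo "devices=$GGML_VK_VISIBLE_DEVICES"'

# Intended use (binary path and model file are assumptions):
#   run_with_vk_devices 0 ./build/bin/llama-cli -m model.gguf -ngl 99
```

Setting the variable as a command prefix keeps it out of the parent shell, so different runs can target different GPUs without interfering with each other.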

Common Errors

Error Message	Cause	Solution
`Could NOT find Vulkan`	Vulkan SDK not installed	Install LunarG Vulkan SDK
`glslc not found`	Shader compiler missing	Install `glslc` from Vulkan SDK or `shaderc` package
`VK_ERROR_OUT_OF_DEVICE_MEMORY`	Model too large for GPU VRAM	Reduce layers offloaded or use smaller quantization
`Failed to create Vulkan instance`	Missing Vulkan ICD driver	Install GPU vendor Vulkan driver (nvidia-vulkan-icd, mesa-vulkan-drivers)

Compatibility Notes

NVIDIA GPUs: Vulkan works but CUDA is generally faster due to optimized cuBLAS kernels.
AMD GPUs: Vulkan is the recommended backend on Linux. ROCm/HIP is an alternative requiring ROCm >= 6.1.
Intel GPUs: Vulkan supported via Mesa drivers. SYCL backend may be faster for Intel discrete GPUs.
Multi-GPU: Device selection via GGML_VK_VISIBLE_DEVICES environment variable.
Cooperative Matrix: Automatically detected if supported by GPU/driver. Can be disabled with GGML_VK_DISABLE_COOPMAT.
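To see whether cooperative matrix support helps on a particular GPU and driver, the same benchmark can be run with and without GGML_VK_DISABLE_COOPMAT. This is a rough sketch under stated assumptions: the function name compare_coopmat is hypothetical, the llama-bench path and model file are placeholders, and setting the variable to 1 is assumed to be sufficient to disable the feature.

```shell
#!/usr/bin/env bash
# Sketch: A/B benchmark with cooperative matrix auto-detected vs. disabled.
# BENCH and MODEL are placeholders; override them for a real run.
BENCH="${BENCH:-./build/bin/llama-bench}"   # assumed location after the build
MODEL="${MODEL:-model.gguf}"                # hypothetical GGUF model file

# compare_coopmat: run the benchmark twice, toggling only the env variable.
compare_coopmat() {
    echo "== cooperative matrix: auto-detected =="
    "$BENCH" -m "$MODEL"
    echo "== cooperative matrix: disabled =="
    GGML_VK_DISABLE_COOPMAT=1 "$BENCH" -m "$MODEL"
}

# Only run when both the binary and the model actually exist.
if [ -x "$BENCH" ] && [ -f "$MODEL" ]; then
    compare_coopmat
else
    echo "skipping: point BENCH and MODEL at an existing llama-bench binary and GGUF file"
fi
```

Keeping every other setting identical between the two runs makes any throughput difference attributable to the cooperative matrix path alone.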
