Environment:Ggml org Llama cpp Vulkan GPU Environment
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, GPU_Acceleration |
| Last Updated | 2026-02-14 22:00 GMT |
Overview
Cross-platform Vulkan GPU acceleration environment. It requires the Vulkan SDK with the glslc shader compiler and provides GPU-accelerated inference on NVIDIA, AMD, and Intel GPUs.
Description
This environment enables GPU-accelerated inference via the Vulkan graphics/compute API. Unlike CUDA (NVIDIA-only) or Metal (Apple-only), Vulkan is cross-platform and supports GPUs from multiple vendors. The backend compiles GLSL compute shaders using glslc and supports cooperative matrix extensions for tensor operations. It is a good alternative when CUDA is not available (e.g., AMD GPUs on Linux/Windows).
Usage
Use this environment for GPU-accelerated inference on non-Apple, non-NVIDIA hardware or when a vendor-neutral solution is preferred. Enable by building with -DGGML_VULKAN=ON. Works on Linux and Windows with any Vulkan-capable GPU.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux or Windows | macOS uses Metal instead |
| GPU | Any Vulkan 1.1+ compatible GPU | NVIDIA, AMD, or Intel |
| Vulkan SDK | LunarG Vulkan SDK with glslc | Version 1.3.283.0+ recommended |
| CMake | >= 3.19 | For Vulkan shader compilation support |
Dependencies
System Packages
- Vulkan SDK (from LunarG or system package manager)
- glslc shader compiler (included in Vulkan SDK)
- Vulkan ICD loader and driver for your GPU
Credentials
No credentials are required.
Quick Install
# Install Vulkan SDK (Ubuntu/Debian; the "jammy" list below targets Ubuntu 22.04)
wget -qO- https://packages.lunarg.com/lunarg-signing-key-pub.asc | sudo tee /etc/apt/trusted.gpg.d/lunarg.asc
sudo wget -qO /etc/apt/sources.list.d/lunarg-vulkan-jammy.list https://packages.lunarg.com/vulkan/lunarg-vulkan-jammy.list
sudo apt update && sudo apt install vulkan-sdk
# Build with Vulkan support
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j $(nproc)
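Before configuring, it can help to confirm that the tools the build depends on are actually on PATH. The following is a minimal sketch, not part of the project: `check_tools` is a hypothetical helper name, and it only tests whether each command can be found, not whether its version is new enough.

```shell
#!/bin/sh
# Hypothetical helper: report which of the named tools are missing from PATH.
# Prints one "missing: <tool>" line per absent tool and returns 1 if any
# tool was not found, 0 otherwise.
check_tools() {
    check_rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            check_rc=1
        fi
    done
    return "$check_rc"
}
```

A possible use is `check_tools cmake glslc && cmake -B build -DGGML_VULKAN=ON`, so that the configure step only runs when both tools were found.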
Code Evidence
Vulkan backend configuration from ggml/src/ggml-vulkan/CMakeLists.txt:
find_package(Vulkan COMPONENTS glslc REQUIRED)
option(GGML_VULKAN "ggml: use Vulkan" OFF)
option(GGML_VULKAN_CHECK_RESULTS "ggml: run Vulkan op checks" OFF)
option(GGML_VULKAN_DEBUG "ggml: enable Vulkan debug" OFF)
option(GGML_VULKAN_VALIDATE "ggml: enable Vulkan validate" OFF)
Shader extension support from ggml/src/ggml-vulkan/CMakeLists.txt:
GL_KHR_cooperative_matrix
GL_NV_cooperative_matrix2
GL_EXT_integer_dot_product
GL_EXT_bfloat16
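Whether a given glslc release accepts one of these extensions can be probed by compiling a tiny shader that requires it. The sketch below is an illustration only: `probe_shader` and `glslc_supports` are hypothetical names, the `vulkan1.3` target environment is an assumption, and the way the project's own CMake logic performs this check is not shown in this page.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal compute shader that requires
# the GLSL extension named in $1.
probe_shader() {
    printf '#version 460\n'
    printf '#extension %s : require\n' "$1"
    printf 'layout(local_size_x = 1) in;\n'
    printf 'void main() {}\n'
}

# Hypothetical helper: return 0 if the installed glslc compiles the probe
# shader for extension $1, non-zero otherwise (including when glslc is absent).
glslc_supports() {
    probe_file=$(mktemp) || return 1
    probe_shader "$1" > "$probe_file"
    if glslc -fshader-stage=compute --target-env=vulkan1.3 \
            "$probe_file" -o "$probe_file.spv" >/dev/null 2>&1; then
        probe_rc=0
    else
        probe_rc=1
    fi
    rm -f "$probe_file" "$probe_file.spv"
    return "$probe_rc"
}
```

For example, `glslc_supports GL_KHR_cooperative_matrix && echo "coopmat shaders can be compiled"` reports only compiler support; the GPU and driver must still expose the feature at runtime.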
Runtime environment variables from ggml/src/ggml-vulkan/ggml-vulkan.cpp:
// Device selection
getenv("GGML_VK_VISIBLE_DEVICES")
// Memory configuration
getenv("GGML_VK_PREFER_HOST_MEMORY")
getenv("GGML_VK_DISABLE_HOST_VISIBLE_VIDMEM")
getenv("GGML_VK_ALLOW_SYSMEM_FALLBACK")
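These variables are read from the process environment, so they can be set for a single run without exporting them globally. The wrapper below is a minimal sketch: `with_vk_devices` is a hypothetical name, and the value format (a comma-separated list of device indices such as `0,2`) is an assumption that should be checked against ggml-vulkan.cpp.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with GGML_VK_VISIBLE_DEVICES set
# for that command only.
# Usage: with_vk_devices "<device list>" <command> [args...]
# The "<device list>" format (comma-separated indices) is an assumption.
with_vk_devices() {
    vk_devices=$1
    shift
    GGML_VK_VISIBLE_DEVICES=$vk_devices "$@"
}
```

A possible invocation is `with_vk_devices 0 ./build/bin/llama-cli -m model.gguf -ngl 99`, where `model.gguf` is a placeholder path. The same pattern applies to the memory-related variables listed above.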
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| Could NOT find Vulkan | Vulkan SDK not installed | Install LunarG Vulkan SDK |
| glslc not found | Shader compiler missing | Install glslc from Vulkan SDK or shaderc package |
| VK_ERROR_OUT_OF_DEVICE_MEMORY | Model too large for GPU VRAM | Reduce layers offloaded or use smaller quantization |
| Failed to create Vulkan instance | Missing Vulkan ICD driver | Install GPU vendor Vulkan driver (nvidia-vulkan-icd, mesa-vulkan-drivers) |
Compatibility Notes
- NVIDIA GPUs: Vulkan works but CUDA is generally faster due to optimized cuBLAS kernels.
- AMD GPUs: Vulkan is the recommended backend on Linux. ROCm/HIP is an alternative requiring ROCm >= 6.1.
- Intel GPUs: Vulkan supported via Mesa drivers. SYCL backend may be faster for Intel discrete GPUs.
- Multi-GPU: Device selection via the GGML_VK_VISIBLE_DEVICES environment variable.
- Cooperative Matrix: Automatically detected if supported by GPU/driver. Can be disabled with GGML_VK_DISABLE_COOPMAT.