Implementation:Ggml org Ggml Cpu x86 cpu feats
Metadata
| Field | Value |
|---|---|
| Page Type | Implementation (Architecture-Specific SIMD) |
| Knowledge Sources | GGML |
| Domains | ML_Infrastructure, CPU_Feature_Detection, SIMD_Optimization |
| Last Updated | 2025-05-15 12:00 GMT |
Overview
Runtime x86-64 CPU feature detection via CPUID and compatibility scoring for dynamic backend selection on Intel and AMD processors.
Description
arch/x86/cpu-feats.cpp provides the most detailed CPU feature detection in the GGML codebase, enabling fine-grained runtime selection among x86 backend variants spanning from basic SSE to AVX-512 and AMX acceleration.
The file defines a cpuid_x86 struct that executes CPUID instructions to populate bitset fields for a comprehensive set of x86 ISA extensions:
SSE family: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2
AVX family: AVX, AVX2, FMA, F16C, AVX-VNNI
AVX-512 family: AVX512F, AVX512CD, AVX512BW, AVX512VL, AVX512DQ, AVX512PF, AVX512ER, AVX512_VBMI, AVX512_VNNI, AVX512_FP16, AVX512_BF16
AMX family: AMX_TILE, AMX_INT8, AMX_FP16, AMX_BF16
Other: PCLMULQDQ, POPCNT, AES, BMI1, BMI2, LZCNT, RDRAND, RDSEED, SHA, and AMD-specific extensions (SSE4a, XOP, TBM, ABM, 3DNow!)
The CPUID execution uses __cpuid/__cpuidex on MSVC or inline assembly (cpuid instruction) on GCC/Clang. The struct identifies the vendor string to differentiate Intel-specific and AMD-specific features.
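As a rough illustration of how raw CPUID register values turn into vendor and feature queries, here is a minimal sketch. The names `cpuid_regs`, `vendor_from_leaf0`, and `x86_feature_bits` are hypothetical and are not taken from cpu-feats.cpp. The register values are passed in as plain integers, so the sketch does not depend on how CPUID is invoked. The bit positions follow the standard CPUID layout for leaf 1 (ECX) and leaf 7, subleaf 0 (EBX).

```cpp
#include <array>
#include <bitset>
#include <cstdint>
#include <initializer_list>
#include <string>

// Raw register values from one CPUID query, in EAX, EBX, ECX, EDX order.
// (Hypothetical type for this sketch.)
using cpuid_regs = std::array<std::uint32_t, 4>;

// Leaf 0 returns the 12-byte vendor string in EBX, EDX, ECX order,
// four ASCII characters per register, lowest byte first.
inline std::string vendor_from_leaf0(const cpuid_regs & r) {
    std::string vendor;
    for (std::uint32_t reg : {r[1], r[3], r[2]}) {
        for (int i = 0; i < 4; ++i) {
            vendor.push_back(static_cast<char>((reg >> (8 * i)) & 0xFF));
        }
    }
    return vendor;
}

// Hypothetical holder for two of the feature registers, exposing a few
// accessors in the style of the cpuid_x86 struct described above.
struct x86_feature_bits {
    std::bitset<32> leaf1_ecx; // CPUID leaf 1, ECX
    std::bitset<32> leaf7_ebx; // CPUID leaf 7 (subleaf 0), EBX

    bool SSE3()    const { return leaf1_ecx[0];  }
    bool FMA()     const { return leaf1_ecx[12]; }
    bool SSE42()   const { return leaf1_ecx[20]; }
    bool AVX()     const { return leaf1_ecx[28]; }
    bool F16C()    const { return leaf1_ecx[29]; }
    bool AVX2()    const { return leaf7_ebx[5];  }
    bool AVX512F() const { return leaf7_ebx[16]; }
};
```

Storing each register as a `std::bitset<32>` keeps every accessor a one-line bit lookup, which matches the bitset-based description above.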
The ggml_backend_cpu_x86_score function implements a compatibility scoring system. For each compile-time feature flag that is defined (e.g., GGML_AVX2, GGML_AVX512), it checks whether runtime detection reports the corresponding feature. If any required feature is missing, it returns 0 (incompatible). Otherwise, it returns the sum of the weights of the enabled flags. Each weight is a distinct power of two (1 << n), and higher-tier features carry larger weights, which lets the dynamic loader select the most capable compatible backend variant.
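The scoring rule can be sketched as follows. This is a model under stated assumptions, not the code from cpu-feats.cpp: the real function uses preprocessor flags, while this sketch passes the variant's required features and the host's detected features as bitmasks so the logic can be exercised at runtime. The names `x86_tier`, `tier_bit`, and `variant_score` are hypothetical, the tier order follows the Score Weights table on this page, and the baseline of 1 is an assumption of the sketch so that a variant with no optional flags still scores above 0.

```cpp
#include <cstdint>

// Tiers in the order of the Score Weights table; the enumerator value is
// the shift amount, so TIER_FMA contributes 1 << 0 and TIER_AMX_INT8 1 << 11.
// (Hypothetical names for this sketch.)
enum x86_tier : unsigned {
    TIER_FMA = 0,
    TIER_F16C,
    TIER_SSE42,
    TIER_BMI2,
    TIER_AVX,
    TIER_AVX2,
    TIER_AVX_VNNI,
    TIER_AVX512,
    TIER_AVX512_VBMI,
    TIER_AVX512_BF16,
    TIER_AVX512_VNNI,
    TIER_AMX_INT8
};

constexpr std::uint32_t tier_bit(x86_tier t) { return 1u << t; }

// required: features the variant was compiled with (stands in for the
//           compile-time GGML_* flags).
// host:     features reported by runtime detection.
// Returns 0 if any required feature is missing, otherwise the assumed
// baseline plus the sum of the weights of the required features.
inline int variant_score(std::uint32_t required, std::uint32_t host) {
    int score = 1; // assumed baseline, see lead-in
    for (unsigned t = 0; t <= TIER_AMX_INT8; ++t) {
        const std::uint32_t bit = 1u << t;
        if ((required & bit) == 0) {
            continue;             // variant does not use this feature
        }
        if ((host & bit) == 0) {
            return 0;             // required feature missing: incompatible
        }
        score += static_cast<int>(bit);
    }
    return score;
}
```

Under these assumptions, a variant that requires FMA through AVX2 scores 1 + 63 = 64 on a host that has all six features, and 0 on a host that lacks any one of them.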
The score is exported via the GGML_BACKEND_DL_SCORE_IMPL macro for use by the backend dynamic loading system.
Usage
This file is compiled into each x86 CPU backend variant (one per SIMD tier). At runtime, the dynamic backend loader calls the score function of each available variant. A score of 0 marks a variant as incompatible with the host CPU, and among the remaining variants the loader selects the one with the highest score.
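The selection step can be sketched as below. This is an illustrative model only: `variant_candidate`, `score_fn_t`, and `pick_best_variant` are hypothetical names, and the sketch assumes the loader has already resolved each variant's exported score function to a plain function pointer. It does not show how the real loader enumerates or opens the libraries.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Signature of a variant's exported score function (no parameters, int result).
using score_fn_t = int (*)();

// One loadable backend variant. (Hypothetical type for this sketch.)
struct variant_candidate {
    std::string name;     // e.g. a library file name
    score_fn_t  score_fn; // resolved score function, or nullptr if missing
};

// Returns the index of the compatible variant with the highest score,
// or -1 if no variant reports a score above 0.
inline int pick_best_variant(const std::vector<variant_candidate> & candidates) {
    int best_index = -1;
    int best_score = 0;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (candidates[i].score_fn == nullptr) {
            continue; // no score function available for this variant
        }
        const int score = candidates[i].score_fn();
        if (score > best_score) { // a score of 0 never beats best_score
            best_score = score;
            best_index = static_cast<int>(i);
        }
    }
    return best_index;
}
```

Because incompatible variants return 0 and the comparison is strict, they are skipped without a separate compatibility check.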
Code Reference
Source Location
GGML repo, file: src/ggml-cpu/arch/x86/cpu-feats.cpp (327 lines).
Key Signatures
struct cpuid_x86 {
bool SSE3(void);
bool AVX(void);
bool AVX2(void);
bool AVX512F(void);
bool AVX512BW(void);
bool AVX512VL(void);
bool AVX512_VNNI(void);
bool AMX_INT8(void);
// ... and many more
};
static int ggml_backend_cpu_x86_score();
GGML_BACKEND_DL_SCORE_IMPL(ggml_backend_cpu_x86_score)
Import
#include "ggml-backend-impl.h"
#include <cstring>
#include <vector>
#include <bitset>
#include <array>
#include <string>
I/O Contract
Inputs
| Parameter | Type | Description |
|---|---|---|
| (none) | -- | The score function takes no parameters. It reads CPU feature bits directly via the CPUID instruction and checks them against compile-time flags. |
Outputs
| Output | Type | Description |
|---|---|---|
| Score | int | Returns 0 if the compiled backend variant requires features not present on the host CPU. Returns a positive integer score (cumulative bit-shifted value) indicating the capability level of the backend variant. Higher scores indicate more advanced SIMD support. |
Score Weights
| Feature Flag | Score Contribution |
|---|---|
| GGML_FMA | +1 |
| GGML_F16C | +2 |
| GGML_SSE42 | +4 |
| GGML_BMI2 | +8 |
| GGML_AVX | +16 |
| GGML_AVX2 | +32 |
| GGML_AVX_VNNI | +64 |
| GGML_AVX512 | +128 |
| GGML_AVX512_VBMI | +256 |
| GGML_AVX512_BF16 | +512 |
| GGML_AVX512_VNNI | +1024 |
| GGML_AMX_INT8 | +2048 |
Usage Examples
// The score function is called internally by the GGML backend loader.
// It is not typically called directly by user code.
//
// The macro GGML_BACKEND_DL_SCORE_IMPL exports the function as:
// extern "C" int ggml_backend_score(void);
//
// The backend loader enumerates all .so/.dll variants and calls each
// one's score function to select the best match for the host CPU.
Related Pages
- Principle:Ggml_org_Ggml_Architecture_Specific_SIMD_Quantization
- Implementation:Ggml_org_Ggml_Cpu_x86_quants -- x86 quantization routines selected by this feature detection
- Implementation:Ggml_org_Ggml_Cpu_x86_repack -- x86 repack/GEMM routines selected by this feature detection