Implementation:Ggml org Ggml Cpu backend api

From Leeroopedia


Metadata

Field | Value
Page Type | Implementation (API Header)
Knowledge Sources | GGML
Domains | ML_Infrastructure, Tensor_Computing
Last Updated | 2026-02-10 12:00 GMT

Overview

Declares the CPU backend interface, including compute plan creation, thread pool management, NUMA configuration, CPU feature detection, and data type conversion functions.

Description

ggml-cpu.h (151 lines) defines the API of the CPU backend, which is always available and serves as the default execution backend. The header provides:

Compute plan (ggml_cplan):

  • work_size / work_data -- scratch buffer; the size is computed by ggml_graph_plan() and the memory is allocated by the caller
  • n_threads / threadpool -- parallelism configuration
  • abort_callback -- allows early termination of computation
  • use_ref -- forces reference implementations for testing
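
The early-termination idea behind abort_callback can be sketched without ggml at all. The block below is a minimal model, assuming the callback has the shape bool (*)(void *) and is polled between units of work. The names toy_plan, toy_compute, and budget_exhausted are hypothetical stand-ins for illustration, not part of ggml-cpu.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the callback fields of a compute plan.
 * This is NOT the real ggml_cplan. */
typedef bool (*toy_abort_callback)(void * data);

struct toy_plan {
    toy_abort_callback abort_callback;      /* return true to stop early */
    void *             abort_callback_data; /* handed back to the callback */
};

/* Runs up to n_steps units of work, polling the callback before each one.
 * Returns the number of steps actually completed. */
int toy_compute(const struct toy_plan * plan, int n_steps) {
    int done = 0;
    for (int i = 0; i < n_steps; i++) {
        if (plan->abort_callback != NULL &&
            plan->abort_callback(plan->abort_callback_data)) {
            break; /* the caller asked to stop */
        }
        done++; /* real work would happen here */
    }
    return done;
}

/* Example callback: abort once a caller-owned budget reaches zero. */
bool budget_exhausted(void * data) {
    int * budget = (int *) data;
    if (*budget == 0) {
        return true;
    }
    (*budget)--;
    return false;
}
```

With a budget of 3 and 10 requested steps, toy_compute() stops after 3 steps. With no callback set, it runs all steps.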

NUMA configuration:

  • ggml_numa_strategy enum -- DISABLED, DISTRIBUTE, ISOLATE, NUMACTL, MIRROR
  • ggml_numa_init() / ggml_is_numa() -- detect and configure NUMA topology

Thread pool management:

  • ggml_threadpool_new/free/pause/resume/get_n_threads -- lifecycle and control

Graph computation:

  • ggml_graph_plan() -- compute required work buffer size
  • ggml_graph_compute() -- execute the computation graph
  • ggml_graph_compute_with_ctx() -- convenience wrapper using context memory

CPU feature detection (SIMD):

  • x86: SSE3, SSSE3, AVX, AVX2, AVX-VNNI, AVX-512, AVX-512-VBMI/VNNI/BF16, AMX-INT8, BMI2, F16C, FMA
  • ARM: NEON, ARM FMA, FP16 VA, DOTPROD, MATMUL_INT8, SVE, SME
  • Other: RISC-V V, VSX (POWER), VXE (s390x), WASM SIMD, and llamafile (a kernel-availability flag rather than a SIMD extension)
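
One simple way to answer this kind of feature query is to report the instruction sets the binary was compiled for, using the compiler's predefined macros. The sketch below takes that approach. It is an assumption for illustration, not a statement of how the ggml_cpu_has_* functions are implemented, and the toy_* names are hypothetical. It reports build-time flags, not a runtime CPUID probe.

```c
#include <stdio.h>

/* Hypothetical helpers; named toy_* to avoid confusion with ggml_cpu_has_*. */
int toy_cpu_has_avx2(void) {
#if defined(__AVX2__)
    return 1; /* translation unit was built with AVX2 enabled */
#else
    return 0;
#endif
}

int toy_cpu_has_neon(void) {
#if defined(__ARM_NEON)
    return 1; /* translation unit was built for a NEON-capable ARM target */
#else
    return 0;
#endif
}

/* Formats a short feature summary into buf.
 * Returns the number of characters written (snprintf semantics). */
int toy_cpu_features(char * buf, size_t size) {
    return snprintf(buf, size, "AVX2 = %d | NEON = %d",
                    toy_cpu_has_avx2(), toy_cpu_has_neon());
}
```

Each helper returns 0 or 1, matching the int feature-flag convention used in the header.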

Type traits:

  • ggml_type_traits_cpu -- per-type from_float, vec_dot, vec_dot_type, and nrows
  • ggml_get_type_traits_cpu() -- retrieves CPU-specific type traits
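
A per-type traits lookup of this kind is typically a table of function pointers indexed by a type enum. The sketch below models that pattern with two toy types. Everything in it (toy_type, toy_type_traits, toy_get_type_traits, and the two kernels) is hypothetical and much simpler than the real ggml_type_traits_cpu; it only shows how a caller can fetch a vec_dot kernel by type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type enum; not ggml's enum ggml_type. */
enum toy_type { TOY_TYPE_F32, TOY_TYPE_I8, TOY_TYPE_COUNT };

/* Dot product over n elements of raw storage, returned as float. */
typedef float (*toy_vec_dot_t)(int n, const void * x, const void * y);

struct toy_type_traits {
    toy_vec_dot_t vec_dot;      /* kernel for this type */
    enum toy_type vec_dot_type; /* type the second operand must be in */
};

float toy_vec_dot_f32(int n, const void * x, const void * y) {
    const float * a = (const float *) x;
    const float * b = (const float *) y;
    float sum = 0.0f;
    for (int i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

float toy_vec_dot_i8(int n, const void * x, const void * y) {
    const int8_t * a = (const int8_t *) x;
    const int8_t * b = (const int8_t *) y;
    int32_t sum = 0; /* accumulate in a wider integer type */
    for (int i = 0; i < n; i++) {
        sum += (int32_t) a[i] * (int32_t) b[i];
    }
    return (float) sum;
}

static const struct toy_type_traits toy_traits[TOY_TYPE_COUNT] = {
    [TOY_TYPE_F32] = { toy_vec_dot_f32, TOY_TYPE_F32 },
    [TOY_TYPE_I8]  = { toy_vec_dot_i8,  TOY_TYPE_I8  },
};

/* Returns the traits entry for a valid toy_type value. */
const struct toy_type_traits * toy_get_type_traits(enum toy_type type) {
    return &toy_traits[type];
}
```

A caller picks the kernel once, for example toy_get_type_traits(TOY_TYPE_F32)->vec_dot, and then calls it without branching on the type again.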

Data conversion:

  • ggml_cpu_fp32_to_fp16/bf16/i32 -- convert arrays of fp32 values to the target type; reverse conversions back to fp32 are provided for fp16 and bf16
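
To make the fp32/bf16 conversion concrete: bf16 keeps the upper 16 bits of an IEEE-754 fp32 value, so conversion is a rounded truncation in one direction and a shift in the other. The sketch below uses round-to-nearest-even. It is a generic illustration with hypothetical toy_* names and a plain uint16_t in place of ggml_bf16_t, and it is not claimed to match ggml's own routines bit for bit.

```c
#include <stdint.h>
#include <string.h>

typedef uint16_t toy_bf16; /* hypothetical stand-in, not ggml_bf16_t */

/* fp32 -> bf16 with round-to-nearest-even on the dropped 16 bits. */
toy_bf16 toy_fp32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits); /* type-pun safely */
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        /* NaN: set a high mantissa bit so the result stays a NaN */
        return (toy_bf16) ((bits >> 16) | 0x0040u);
    }
    bits += 0x7fffu + ((bits >> 16) & 1u); /* ties go to the even result */
    return (toy_bf16) (bits >> 16);
}

/* bf16 -> fp32 is exact: the 16 bits become the high half of the float. */
float toy_bf16_to_fp32(toy_bf16 h) {
    uint32_t bits = (uint32_t) h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Array form, mirroring the (src, dst, count) shape of bulk converters. */
void toy_fp32_to_bf16_row(const float * src, toy_bf16 * dst, int64_t n) {
    for (int64_t i = 0; i < n; i++) {
        dst[i] = toy_fp32_to_bf16(src[i]);
    }
}
```

Values that fit in bf16's 8 bits of significand precision, such as 1.0f or -2.5f, round-trip exactly. Others lose their low mantissa bits.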

Usage

Include this header in any code that needs to run GGML computation graphs on the CPU, configure threading, detect CPU SIMD capabilities, or manage compute plans.

Code Reference

Source Location

GGML repo, file: include/ggml-cpu.h, 151 lines.

Signature

// CPU backend lifecycle
GGML_BACKEND_API ggml_backend_t ggml_backend_cpu_init(void);
GGML_BACKEND_API bool ggml_backend_is_cpu(ggml_backend_t backend);
GGML_BACKEND_API void ggml_backend_cpu_set_n_threads(ggml_backend_t backend_cpu,
                                                      int n_threads);
GGML_BACKEND_API ggml_backend_reg_t ggml_backend_cpu_reg(void);

// Graph computation
GGML_BACKEND_API struct ggml_cplan ggml_graph_plan(
    const struct ggml_cgraph * cgraph, int n_threads,
    struct ggml_threadpool * threadpool);
GGML_BACKEND_API enum ggml_status ggml_graph_compute(
    struct ggml_cgraph * cgraph, struct ggml_cplan * cplan);

// CPU feature detection (representative subset)
GGML_BACKEND_API int ggml_cpu_has_avx2(void);
GGML_BACKEND_API int ggml_cpu_has_neon(void);
GGML_BACKEND_API int ggml_cpu_has_sve(void);

// Type traits
GGML_BACKEND_API const struct ggml_type_traits_cpu *
    ggml_get_type_traits_cpu(enum ggml_type type);

Import

#include "ggml-cpu.h"

Dependencies

  • ggml.h -- core GGML types
  • ggml-backend.h -- backend abstraction types

I/O Contract

Inputs

Parameter | Type | Required | Description
cgraph | const ggml_cgraph * | Yes (for ggml_graph_plan and ggml_graph_compute) | Computation graph to plan or execute.
n_threads | int | Yes (for ggml_graph_plan) | Number of threads for parallel execution.
threadpool | ggml_threadpool * | No | Optional thread pool (NULL for default).
cplan | ggml_cplan * | Yes (for ggml_graph_compute) | Pre-computed plan with allocated work buffer.

Outputs

Output | Type | Description
Backend handle | ggml_backend_t | Initialized CPU backend instance.
Compute plan | ggml_cplan | Plan struct with required work_size for the given graph.
Status | ggml_status | Success or failure of graph computation.
Feature flag | int | 1 if the CPU supports the queried SIMD feature, 0 otherwise.

Usage Examples

CPU Backend with Graph Computation

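#include <stdlib.h>
#include "ggml-cpu.h"

// `graph` is assumed to be a struct ggml_cgraph * built elsewhere.
const int n_threads = 4;

// Initialize CPU backend
ggml_backend_t cpu = ggml_backend_cpu_init();
ggml_backend_cpu_set_n_threads(cpu, n_threads);

// Plan the graph; plan.work_size is the scratch size the caller must provide
struct ggml_cplan plan = ggml_graph_plan(graph, n_threads, NULL);
if (plan.work_size > 0) {
    plan.work_data = malloc(plan.work_size);
}

// Compute, check the status, then release the scratch buffer and backend
enum ggml_status status = ggml_graph_compute(graph, &plan);
free(plan.work_data);
ggml_backend_free(cpu);
if (status != GGML_STATUS_SUCCESS) { /* handle failure or abort */ }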

Feature Detection

#include <stdio.h>
#include "ggml-cpu.h"

// Each query returns 1 if the feature is supported, 0 otherwise
if (ggml_cpu_has_avx2()) {
    printf("AVX2 available\n");
}
if (ggml_cpu_has_neon()) {
    printf("ARM NEON available\n");
}
