Implementation:Ggml org Ggml Cpu vec api
Metadata
| Field | Value |
|---|---|
| Page Type | Implementation (Vector API Header) |
| Knowledge Sources | GGML |
| Domains | ML_Infrastructure, Tensor_Computing, CPU_Backend, SIMD |
| Last Updated | 2025-05-15 12:00 GMT |
Overview
Header that declares the CPU backend's fundamental vector operations (dot products, element-wise arithmetic, GELU, softmax) and provides inline, SIMD-optimized implementations of the most common ones.
Description
vec.h is the central vectorized math library for the CPU backend, providing the building blocks used by ops.cpp, binary-ops.cpp, and the matrix multiplication routines. It includes:
- Extern function declarations: ggml_vec_dot_f32, ggml_vec_dot_bf16, ggml_vec_dot_f16, ggml_vec_silu_f32, ggml_vec_cvar_f32, ggml_vec_soft_max_f32, ggml_vec_log_soft_max_f32 (implemented in vec.cpp).
- Inline element-wise operations: A large set of inline functions for common math operations across f32, f16, and bf16 types:
  - Arithmetic: ggml_vec_add_f32 (AVX2-optimized), ggml_vec_sub_f32, ggml_vec_mul_f32, ggml_vec_div_f32, and their f16 variants.
  - Assignment: ggml_vec_set_f32, ggml_vec_cpy_f32, ggml_vec_neg_f32.
  - Scaling/MAD: ggml_vec_scale_f32, ggml_vec_scale_f16, ggml_vec_mad_f32 (multiply-accumulate with SIMD unrolling).
  - Norms: ggml_vec_norm_f32 (L2 norm), ggml_vec_norm_inv_f32.
  - Reductions: ggml_vec_sum_f32, ggml_vec_sum_f32_ggf, ggml_vec_sum_f16_ggf, ggml_vec_max_f32, ggml_vec_argmax_f32.
- GELU functions: ggml_vec_gelu_f32 and ggml_vec_gelu_quick_f32 using precomputed f16 lookup tables for fast evaluation.
- Unroll constants: GGML_SOFT_MAX_UNROLL (4), GGML_VEC_DOT_UNROLL (2), GGML_VEC_MAD_UNROLL (32).
- Apple Accelerate integration: When GGML_USE_ACCELERATE is defined, several functions are redirected to Apple's vDSP routines.
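The unrolling mentioned in the list above can be illustrated with a plain-C sketch. This is not the code in vec.h: the function name sketch_vec_dot_f32 and the constant SKETCH_UNROLL are hypothetical, and the real header relies on SIMD macros rather than a scalar inner loop. The sketch only shows the general idea of keeping several independent accumulators in a main loop and handling the leftover elements in a tail loop.

```c
#include <stddef.h>

/* Hypothetical unroll factor for this sketch; not a constant taken from vec.h. */
#define SKETCH_UNROLL 4

/* Scalar dot product with SKETCH_UNROLL independent partial sums.
 * Independent accumulators break the dependency chain between iterations,
 * which is the same reason SIMD kernels keep several vector accumulators. */
static float sketch_vec_dot_f32(int n, const float *x, const float *y) {
    double acc[SKETCH_UNROLL] = {0.0};
    const int n_main = n - (n % SKETCH_UNROLL);

    /* Main loop: each lane j accumulates its own partial sum. */
    for (int i = 0; i < n_main; i += SKETCH_UNROLL) {
        for (int j = 0; j < SKETCH_UNROLL; ++j) {
            acc[j] += (double)x[i + j] * (double)y[i + j];
        }
    }

    /* Tail loop: leftover elements when n is not a multiple of the unroll factor. */
    for (int i = n_main; i < n; ++i) {
        acc[0] += (double)x[i] * (double)y[i];
    }

    /* Final horizontal reduction of the partial sums. */
    double sum = 0.0;
    for (int j = 0; j < SKETCH_UNROLL; ++j) {
        sum += acc[j];
    }
    return (float)sum;
}
```

Accumulating in double is a choice made for this sketch to reduce rounding error; whether a given code path in the header does the same is not stated on this page.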
Usage
Include vec.h in any CPU backend source file that needs access to vectorized math operations. Most functions are inline and will be compiled directly into the call site.
Code Reference
Source Location
GGML repo, file: src/ggml-cpu/vec.h (1585 lines).
Signature
// Dot products (extern, implemented in vec.cpp)
void ggml_vec_dot_f32(int n, float * s, size_t bs, const float * x, size_t bx,
const float * y, size_t by, int nrc);
// Inline element-wise arithmetic
inline static void ggml_vec_add_f32(const int n, float * z, const float * x, const float * y);
inline static void ggml_vec_sub_f32(const int n, float * z, const float * x, const float * y);
inline static void ggml_vec_mul_f32(const int n, float * z, const float * x, const float * y);
inline static void ggml_vec_div_f32(const int n, float * z, const float * x, const float * y);
inline static void ggml_vec_scale_f32(const int n, float * y, const float v);
inline static void ggml_vec_mad_f32(const int n, float * y, const float * x, const float v);
// GELU activation via lookup table
inline static void ggml_vec_gelu_f32(const int n, float * y, const float * x);
inline static void ggml_vec_gelu_quick_f32(const int n, float * y, const float * x);
// Reductions
inline static void ggml_vec_sum_f32(const int n, float * s, const float * x);
inline static void ggml_vec_max_f32(const int n, float * s, const float * x);
inline static void ggml_vec_argmax_f32(const int n, int * s, const float * x);
Import
#include "vec.h"
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| n | int | Yes | Number of elements in the input/output vectors. |
| x, y | const float * | Yes | Input vectors. |
| v | float | Conditional | Scalar value for scale/mad/set operations. |
Outputs
| Output | Type | Description |
|---|---|---|
| z or y | float * | Element-wise result vector. |
| s | float * | Scalar reduction result (sum, max, dot product). |
Usage Examples
Element-wise Vector Operations
#include "vec.h"
float a[256] = {0}, b[256] = {0}, c[256]; // fill a and b with real data before use
// c = a + b
ggml_vec_add_f32(256, c, a, b);
// c = a * b
ggml_vec_mul_f32(256, c, a, b);
// a *= 0.5
ggml_vec_scale_f32(256, a, 0.5f);
// b += a * 2.0 (multiply-accumulate)
ggml_vec_mad_f32(256, b, a, 2.0f);
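The calls above can be read against the following scalar reference loops. The ref_ functions are hypothetical helpers written for this page; they mirror the argument order of the signatures listed earlier but are plain loops, a sketch of the intended semantics rather than the SIMD code in the header.

```c
/* z[i] = x[i] + y[i] */
static void ref_vec_add_f32(int n, float *z, const float *x, const float *y) {
    for (int i = 0; i < n; ++i) {
        z[i] = x[i] + y[i];
    }
}

/* z[i] = x[i] * y[i] */
static void ref_vec_mul_f32(int n, float *z, const float *x, const float *y) {
    for (int i = 0; i < n; ++i) {
        z[i] = x[i] * y[i];
    }
}

/* In-place scaling: y[i] *= v */
static void ref_vec_scale_f32(int n, float *y, float v) {
    for (int i = 0; i < n; ++i) {
        y[i] *= v;
    }
}

/* Multiply-accumulate: y[i] += x[i] * v */
static void ref_vec_mad_f32(int n, float *y, const float *x, float v) {
    for (int i = 0; i < n; ++i) {
        y[i] += x[i] * v;
    }
}
```

Note that scale and mad update their first vector argument in place, so y acts as both input and output for those two operations.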
GELU Activation
#include "vec.h"
float input[1024] = {0}, output[1024]; // fill input with real data before use
// Apply GELU using precomputed f16 table
ggml_vec_gelu_f32(1024, output, input);
Related Pages
- Ggml_org_Ggml_Cpu_vec_ops -- Implementation file for the extern functions declared here.
- Ggml_org_Ggml_Cpu_simd_mappings -- SIMD macros that these inline functions depend on.
- Ggml_org_Ggml_Cpu_tensor_ops -- Tensor operations that use these vector primitives.
- Ggml_org_Ggml_Cpu_unary_ops -- Unary operations that complement these vector functions.