Implementation:Ggml org Ggml Cpu vec ops

From Leeroopedia


Metadata

| Field | Value |
|---|---|
| Page Type | Implementation (Vectorized Operations) |
| Knowledge Sources | GGML |
| Domains | ML_Infrastructure, Tensor_Computing, CPU_Backend, SIMD |
| Last Updated | 2025-05-15 12:00 GMT |

Overview

Implements SIMD-optimized vector dot products, SiLU activation, softmax, variance, and precomputed GELU lookup tables for the CPU backend.

Description

vec.cpp provides the performance-critical vector operations used heavily in matrix multiplication, attention computation, and activation functions. Key components include:

  1. Precomputed lookup tables: ggml_table_gelu_f16 (128 KB) and ggml_table_gelu_quick_f16 (128 KB) for fast GELU/quick-GELU evaluation via f16 table lookup.
  2. Vector dot products: Architecture-specific SIMD implementations of:
    • ggml_vec_dot_f32 -- f32 dot product with ARM SVE (8-way unrolled FMA with predicated tail), RISC-V vector intrinsics (variable-length vectorization), and generic SIMD via GGML_F32_VEC macros.
    • ggml_vec_dot_bf16 -- bf16 dot product with AVX-512 BF16 (_mm512_dpbf16_ps), AVX2 with bf16-to-f32 shift, NEON, and scalar fallbacks.
    • ggml_vec_dot_f16 -- f16 dot product with architecture-specific paths.
  3. Activation functions: ggml_vec_silu_f32 computes SiLU (Sigmoid Linear Unit) element-wise.
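  4. Statistical operations: ggml_vec_cvar_f32 writes the mean-centered values x[i] - mean to the output vector and returns their variance; ggml_vec_soft_max_f32 writes the max-shifted exponentials exp(x[i] - max) that underlie a numerically stable softmax and returns their sum, leaving the division by that sum to the caller; and ggml_vec_log_soft_max_f32 computes log-softmax.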
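The "bf16-to-f32 shift" mentioned for the AVX2 path relies on a bfloat16 value being the upper 16 bits of an IEEE-754 float, so widening is a 16-bit left shift. The following is a minimal scalar sketch of that idea, not the code in vec.cpp: the `bf16_bits` struct is a hypothetical stand-in for ggml_bf16_t, and `bf16_to_f32`, `f32_to_bf16_trunc`, and `dot_bf16_scalar` are illustrative names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for ggml_bf16_t: the raw 16 bits of a bfloat16 value.
struct bf16_bits { uint16_t bits; };

// Widen bf16 to f32 by placing the 16 stored bits in the high half of a 32-bit word.
static float bf16_to_f32(bf16_bits h) {
    uint32_t u = static_cast<uint32_t>(h.bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Truncating f32 -> bf16 conversion (no rounding); used here only to build inputs.
static bf16_bits f32_to_bf16_trunc(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return bf16_bits{ static_cast<uint16_t>(u >> 16) };
}

// Scalar reference dot product: widen each element, accumulate in double.
static float dot_bf16_scalar(size_t n, const bf16_bits * x, const bf16_bits * y) {
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += static_cast<double>(bf16_to_f32(x[i])) *
               static_cast<double>(bf16_to_f32(y[i]));
    }
    return static_cast<float>(sum);
}
```

A SIMD path would perform the same widening on a whole register of elements at once; the scalar form is mainly useful as a correctness reference.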

Usage

These functions are called by ops.cpp tensor operations and by matrix multiplication routines. They are not typically called directly by user code.

Code Reference

Source Location

GGML repo, file: src/ggml-cpu/vec.cpp (630 lines).

Signature

void ggml_vec_dot_f32(int n, float * GGML_RESTRICT s, size_t bs,
    const float * GGML_RESTRICT x, size_t bx,
    const float * GGML_RESTRICT y, size_t by, int nrc);

void ggml_vec_dot_bf16(int n, float * GGML_RESTRICT s, size_t bs,
    ggml_bf16_t * GGML_RESTRICT x, size_t bx,
    ggml_bf16_t * GGML_RESTRICT y, size_t by, int nrc);

void ggml_vec_dot_f16(int n, float * GGML_RESTRICT s, size_t bs,
    ggml_fp16_t * GGML_RESTRICT x, size_t bx,
    ggml_fp16_t * GGML_RESTRICT y, size_t by, int nrc);

void ggml_vec_silu_f32(const int n, float * y, const float * x);

ggml_float ggml_vec_cvar_f32(const int n, float * y, const float * x, const float mean);

ggml_float ggml_vec_soft_max_f32(const int n, float * y, const float * x, float max);

ggml_float ggml_vec_log_soft_max_f32(const int n, float * y, const float * x, float max);
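The element-wise signatures above take the output pointer y before the input pointer x. As an illustration of that convention, here is a minimal scalar sketch of the SiLU formula, y[i] = x[i] * sigmoid(x[i]). The function name `silu_f32_scalar` is hypothetical, and the sketch makes no attempt to reproduce the SIMD paths in vec.cpp.

```cpp
#include <cmath>

// Scalar reference for SiLU: y[i] = x[i] * sigmoid(x[i]) = x[i] / (1 + exp(-x[i])).
// Argument order (n, output y, input x) follows the signature listed above.
static void silu_f32_scalar(int n, float * y, const float * x) {
    for (int i = 0; i < n; ++i) {
        y[i] = x[i] / (1.0f + std::exp(-x[i]));
    }
}
```

For large positive inputs the result approaches x, and for large negative inputs it approaches 0, which makes those two regimes convenient spot checks.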

Import

#include "vec.h"

I/O Contract

Inputs

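| Parameter | Type | Required | Description |
|---|---|---|---|
| n | int | Yes | Number of elements in each vector. |
| x, y | const float * (or f16/bf16) | Yes | Input vectors for the dot products. In the element-wise functions, x is the input and y is the output (see Outputs). |
| nrc | int | Yes (dot) | Number of rows to compute (typically 1). |
| mean | float | Yes (cvar) | Mean of x, subtracted from each element before squaring. |
| max | float | Yes (softmax) | Maximum of x, subtracted from each element before exponentiation for numerical stability. |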

Outputs

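| Output | Type | Description |
|---|---|---|
| s | float * | Scalar dot product result. |
| y | float * | Element-wise output vector: SiLU values, exp(x[i] - max) for softmax, or centered values x[i] - mean for cvar. |
| Return value | ggml_float | Sum of the exponentials for softmax; variance for cvar. |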
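The contract for the statistical functions can be easier to read as code. The sketch below is a plain scalar restatement of what this page describes (centered output plus variance for cvar, shifted exponentials plus their sum for softmax). The names `cvar_f32_scalar` and `soft_max_f32_scalar` are hypothetical, `double` stands in for ggml_float as an assumption, and none of this is the vectorized code in vec.cpp.

```cpp
#include <cmath>

// Writes the centered values x[i] - mean into y and returns the mean of
// their squares, i.e. the variance around the supplied mean.
static double cvar_f32_scalar(int n, float * y, const float * x, float mean) {
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        const float d = x[i] - mean;
        y[i] = d;
        sum += static_cast<double>(d) * static_cast<double>(d);
    }
    return sum / n;
}

// Writes exp(x[i] - max) into y and returns the sum of those values.
// The caller divides by the returned sum to get normalized probabilities.
static double soft_max_f32_scalar(int n, float * y, const float * x, float max) {
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        const float e = std::exp(x[i] - max);
        y[i] = e;
        sum += static_cast<double>(e);
    }
    return sum;
}
```

Returning the sum rather than normalizing in place lets a caller fuse the final scaling with other work, at the cost of one extra pass when plain probabilities are needed.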

Usage Examples

Computing a Float Dot Product

#include "vec.h"

float x[256] = { /* ... */ };
float y[256] = { /* ... */ };
float result;

// Single dot product: nrc is 1 and the bs, bx, by arguments are passed as 0.
ggml_vec_dot_f32(256, &result, 0, x, 0, y, 0, 1);
// result now contains the dot product of x and y
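To cross-check a result like the one above without linking ggml, a portable scalar reference can help. The sketch below loosely mirrors the shape the Description attributes to the SIMD paths (several independent accumulators followed by a tail loop) but uses no intrinsics. The function name `dot_f32_unrolled` and the choice of four accumulators are assumptions for illustration, not details taken from vec.cpp.

```cpp
#include <cstddef>

// Scalar dot product with four independent accumulators and a tail loop.
static float dot_f32_unrolled(size_t n, const float * x, const float * y) {
    float acc[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    size_t i = 0;
    // Main loop: process four elements per iteration.
    for (; i + 4 <= n; i += 4) {
        acc[0] += x[i + 0] * y[i + 0];
        acc[1] += x[i + 1] * y[i + 1];
        acc[2] += x[i + 2] * y[i + 2];
        acc[3] += x[i + 3] * y[i + 3];
    }
    // Reduce the partial sums, then handle the remaining 0-3 elements.
    float sum = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Because floating-point addition is not associative, a reference like this can differ from a SIMD result in the last few bits, so comparisons should use a tolerance rather than exact equality.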

Computing Softmax

#include "vec.h"

float logits[1024] = { /* ... */ };
float probs[1024];
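
// Subtracting the largest logit keeps expf() from overflowing.
float max_val = logits[0];
for (int i = 1; i < 1024; ++i) {
    if (logits[i] > max_val) max_val = logits[i];
}

ggml_float sum = ggml_vec_soft_max_f32(1024, probs, logits, max_val);
// probs[] now holds exp(logits[i] - max_val) and sum is their total;
// divide each element by sum to obtain normalized probabilities.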
