Implementation:Ggml org Ggml Cpu unary ops

Metadata

Field	Value
Page Type	Implementation (Unary Operations)
Knowledge Sources	GGML
Domains	ML_Infrastructure, Tensor_Computing, CPU_Backend
Last Updated	2025-05-15 12:00 GMT

Overview

Implements element-wise unary tensor operations (abs, neg, relu, sigmoid, tanh, exp, sqrt, sin, cos, log, etc.) with type-generic template support.

Description

unary-ops.cpp provides the activation functions and element-wise math operations needed for neural network inference on the CPU. The implementation is built from four parts, with a two-level template dispatch (apply_unary_op and unary_op) at its core:

Scalar operation functions: about two dozen inline functions, each applying one mathematical operation to a single float, including op_abs, op_sgn, op_neg, op_step, op_tanh, op_elu, op_relu, op_sigmoid, op_hardsigmoid, op_exp, op_hardswish, op_sqr, op_sqrt, op_xielu, op_sin, op_cos, op_log, op_expm1, op_softplus, op_floor, op_ceil, op_round, op_trunc.
Template dispatch (apply_unary_op): A C++ template apply_unary_op<op, src0_t, dst_t> handles multi-threaded row partitioning using get_thread_range(), and type conversion via type_conversion_table to support f32, f16, and bf16 inputs/outputs.
Type routing (unary_op): A second template unary_op<op> dispatches on source and destination tensor types, selecting the appropriate apply_unary_op instantiation.
Exported functions: Each operation has a public function (e.g., ggml_compute_forward_relu) that calls unary_op<op_relu>.
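
The layering above can be illustrated with a small, self-contained sketch. It is not the code from unary-ops.cpp: bf16_t, to_f32, from_f32 and apply_row are hypothetical stand-ins invented for this example, and the bf16 conversion truncates for brevity rather than rounding. The point it shows is that the scalar op is passed as a non-type template parameter, and every element is widened to float, transformed, then narrowed to the destination type.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical 16-bit storage type (upper half of an IEEE-754 float).
struct bf16_t { uint16_t bits; };

// Scalar operations: plain functions on a single float.
static inline float op_relu(float x)    { return x > 0.0f ? x : 0.0f; }
static inline float op_sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// Assumed conversion helpers: widen to float before the op, narrow afterwards.
static inline float to_f32(float x) { return x; }
static inline float to_f32(bf16_t x) {
    const uint32_t u = static_cast<uint32_t>(x.bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

template <typename T> T from_f32(float x);
template <> inline float from_f32<float>(float x) { return x; }
template <> inline bf16_t from_f32<bf16_t>(float x) {
    uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return bf16_t{static_cast<uint16_t>(u >> 16)};  // truncation, for brevity only
}

// One instantiation per (op, source type, destination type) combination.
template <float (*op)(float), typename src_t, typename dst_t>
static void apply_row(const std::vector<src_t> & src, std::vector<dst_t> & dst) {
    assert(src.size() == dst.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = from_f32<dst_t>(op(to_f32(src[i])));
    }
}
```

Calling apply_row<op_relu, bf16_t, float>(src, dst) would read bf16 values and write f32 results. Because the op is a compile-time parameter, the compiler can inline it into the loop instead of making an indirect call per element.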

Usage

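The compute engine's dispatch table calls these functions for the corresponding GGML_OP_* operations, or GGML_UNARY_OP_* sub-operations in the case of activations grouped under GGML_OP_UNARY. They are internal to the CPU backend and are not meant to be called directly by application code.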

Code Reference

Source Location

GGML repo, file: src/ggml-cpu/unary-ops.cpp (337 lines).

Signature

// Representative exported functions (all follow the same pattern):
void ggml_compute_forward_abs(const ggml_compute_params * params, ggml_tensor * dst);
void ggml_compute_forward_neg(const ggml_compute_params * params, ggml_tensor * dst);
void ggml_compute_forward_relu(const ggml_compute_params * params, ggml_tensor * dst);
void ggml_compute_forward_sigmoid(const ggml_compute_params * params, ggml_tensor * dst);
void ggml_compute_forward_tanh(const ggml_compute_params * params, ggml_tensor * dst);
void ggml_compute_forward_exp(const ggml_compute_params * params, ggml_tensor * dst);
void ggml_compute_forward_sqrt(const ggml_compute_params * params, ggml_tensor * dst);
void ggml_compute_forward_sin(const ggml_compute_params * params, ggml_tensor * dst);
void ggml_compute_forward_cos(const ggml_compute_params * params, ggml_tensor * dst);
void ggml_compute_forward_log(const ggml_compute_params * params, ggml_tensor * dst);

Import

#include "unary-ops.h"

I/O Contract

Inputs

Parameter	Type	Required	Description
`params`	`const ggml_compute_params *`	Yes	Thread index, thread count, and work buffer for parallel execution.
`dst`	`ggml_tensor *`	Yes	Destination tensor; source tensor is `dst->src[0]`. Must be contiguous along dimension 1. Supported types: f32, f16, bf16.

Outputs

Output	Type	Description
`dst->data`	`void *`	Element-wise result: `dst[i] = op(src0[i])` for each element.
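
To make the contract concrete, the sketch below shows one plausible way the rows of a tensor could be divided among threads so that, once every thread has run, dst[i] = op(src0[i]) holds for all elements. thread_row_range and unary_rows are hypothetical names for this example; the ceil-division split is an assumption about how a helper such as get_thread_range() might behave, not a copy of it. For simplicity the op is passed as a runtime function pointer here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: half-open row range [first, last) for thread ith of nth.
static std::pair<int64_t, int64_t> thread_row_range(int64_t nrows, int64_t ith, int64_t nth) {
    assert(nth > 0 && ith >= 0 && ith < nth);
    const int64_t per_thread = (nrows + nth - 1) / nth;  // ceil(nrows / nth)
    const int64_t first = std::min(per_thread * ith, nrows);
    const int64_t last  = std::min(first + per_thread, nrows);
    return {first, last};
}

static float op_neg(float x) { return -x; }

// Work done by a single thread: apply op to every element of its own rows.
// src and dst are row-major with ncols elements per row.
static void unary_rows(float (*op)(float),
                       const std::vector<float> & src, std::vector<float> & dst,
                       int64_t ncols, int64_t ith, int64_t nth) {
    assert(ncols > 0 && src.size() == dst.size());
    const int64_t nrows = static_cast<int64_t>(src.size()) / ncols;
    const std::pair<int64_t, int64_t> range = thread_row_range(nrows, ith, nth);
    for (int64_t r = range.first; r < range.second; ++r) {
        for (int64_t c = 0; c < ncols; ++c) {
            dst[r * ncols + c] = op(src[r * ncols + c]);
        }
    }
}
```

With 5 rows and 3 threads, this split assigns rows [0, 2), [2, 4) and [4, 5). The ranges are disjoint, so threads never write to the same element and no locking is needed; a thread whose range is empty simply does nothing.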

Usage Examples

How Unary Ops Are Dispatched (Internal)

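// Inside the compute engine dispatch (simplified). These activations are
// sub-operations of GGML_OP_UNARY, selected with ggml_get_unary_op():
case GGML_OP_UNARY:
    switch (ggml_get_unary_op(node)) {
        case GGML_UNARY_OP_RELU:
            ggml_compute_forward_relu(params, node);
            break;
        case GGML_UNARY_OP_SIGMOID:
            ggml_compute_forward_sigmoid(params, node);
            break;
        case GGML_UNARY_OP_EXP:
            ggml_compute_forward_exp(params, node);
            break;
        default:
            break;
    }
    break;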

Template Architecture (Internal)

// Each exported function follows this pattern:
void ggml_compute_forward_relu(const ggml_compute_params * params, ggml_tensor * dst) {
    unary_op<op_relu>(params, dst);
}

// unary_op reads the source and destination types and dispatches on them:
template <float (*op)(float)>
static void unary_op(const ggml_compute_params * params, ggml_tensor * dst) {
    const ggml_tensor * src0 = dst->src[0];  // the source tensor of this node

    if (src0->type == GGML_TYPE_F32 && dst->type == GGML_TYPE_F32) {
        // all f32
        apply_unary_op<op, float, float>(params, dst);
    } else if (src0->type == GGML_TYPE_F16 && dst->type == GGML_TYPE_F16) {
        // all f16
        apply_unary_op<op, ggml_fp16_t, ggml_fp16_t>(params, dst);
    }
    // ... bf16 and mixed-type variants
}
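
The elided bf16 and mixed-type branches follow the same shape. The standalone sketch below illustrates the routing idea only: toy_type, toy_tensor, apply and route are hypothetical names, and double is used purely as a second storage type to keep the example short (GGML routes over f32, f16 and bf16). Unlike this sketch, which returns false for an unsupported pair, a real backend may choose to abort instead.

```cpp
#include <cassert>
#include <cstddef>

enum class toy_type { f32, f64 };  // hypothetical runtime type tags

struct toy_tensor {                // hypothetical, far simpler than ggml_tensor
    toy_type    type;
    void *      data;
    std::size_t n;
};

static float op_abs(float x) { return x < 0.0f ? -x : x; }

// Inner level: element types are fixed at compile time.
template <float (*op)(float), typename src_t, typename dst_t>
static void apply(const toy_tensor & src, toy_tensor & dst) {
    assert(src.n == dst.n);
    const src_t * s = static_cast<const src_t *>(src.data);
    dst_t *       d = static_cast<dst_t *>(dst.data);
    for (std::size_t i = 0; i < src.n; ++i) {
        d[i] = static_cast<dst_t>(op(static_cast<float>(s[i])));
    }
}

// Outer level: inspect the runtime tags once, then pick an instantiation.
template <float (*op)(float)>
static bool route(const toy_tensor & src, toy_tensor & dst) {
    if (src.n != dst.n) {
        return false;
    }
    if (src.type == toy_type::f32 && dst.type == toy_type::f32) {
        apply<op, float, float>(src, dst);
    } else if (src.type == toy_type::f64 && dst.type == toy_type::f64) {
        apply<op, double, double>(src, dst);
    } else if (src.type == toy_type::f64 && dst.type == toy_type::f32) {
        apply<op, double, float>(src, dst);  // mixed-type variant
    } else {
        return false;                        // unsupported combination
    }
    return true;
}
```

The type check happens once per call rather than once per element, so the inner loop stays free of branches on the tensor type. The cost is one template instantiation per supported (op, source, destination) triple.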

Related Pages

Ggml_org_Ggml_Cpu_tensor_ops -- Higher-level tensor operations that complement these unary ops.
Ggml_org_Ggml_Cpu_vec_api -- Vectorized math functions at the vector level (e.g., ggml_vec_silu_f32).
Ggml_org_Ggml_Cpu_compute_engine -- The graph compute engine that dispatches to these operations.
