Implementation:Ggml org Ggml Hexagon softmax ops

From Leeroopedia


Implementation Metadata
File Name src/ggml-hexagon/htp/softmax-ops.c
Repository ggml-org/ggml
Lines 395
Language C
Domain Tags ML_Infrastructure, DSP_Computing, Normalization
Status Active
Last Updated 2025-05-15 12:00 GMT
Knowledge Sources ggml-org/ggml repository

Overview

softmax-ops.c is the DSP-side implementation of the softmax operation for the Hexagon HVX vector processor, with support for attention bias (ALiBi) and optional FP16 output. In transformer attention layers, softmax normalizes each row of attention scores into a probability distribution.

Description

The file uses a softmax_th_ctx struct to hold the scale, max_bias, head count, and precomputed ALiBi slope bases (m0, m1). The init_softmax_ctx function extracts scale and max_bias from op_params and computes n_head_log2, the largest power of two not exceeding the head count, from which m0 and m1 are derived.

The htp_softmax_preamble3 macro handles both src0 and optional src1 (mask/bias) tensors with null-safe dimension extraction (defaults to 1 when src1 is absent).
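The null-safe extraction can be pictured with a small helper. This is only a sketch of the idea, not the macro itself: tensor_sketch and dim_or_one are hypothetical names, and the real HTP tensor type carries more fields than shown here.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical minimal tensor descriptor (assumption: four dimensions, as in ggml).
struct tensor_sketch {
    uint32_t ne[4];  // number of elements per dimension
};

// Returns dimension i of t, or 1 when the tensor is absent (t == NULL).
static uint32_t dim_or_one(const struct tensor_sketch * t, int i) {
    return t != NULL ? t->ne[i] : 1u;
}
```

Defaulting to 1 rather than 0 keeps later size and broadcast arithmetic well defined, for example an index computed as `row % dim_or_one(src1, 2)` never divides by zero.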

The vectorized softmax implementation hvx_fast_softmax_prep_f32 performs:

  1. Scaled input preparation with optional mask and slope
  2. Row-wise max computation
  3. Exponentiation (vectorized exp)
  4. Sum reduction
  5. Normalization (divide by sum)

Multi-threaded execution distributes rows across HVX threads.
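One common way to distribute rows is contiguous blocking, where each thread takes an equal-sized slice of rows. The sketch below assumes that scheme; the exact partitioning in softmax-ops.c is not shown on this page, and row_range and rows_for_thread are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

// Half-open row range [start, end) assigned to one thread.
struct row_range {
    uint32_t start;
    uint32_t end;
};

// Splits n_rows into n_threads contiguous blocks and returns the block for thread ith.
static struct row_range rows_for_thread(uint32_t n_rows, uint32_t n_threads, uint32_t ith) {
    assert(n_threads > 0 && ith < n_threads);

    // Ceiling division so that every row is covered by some thread.
    const uint32_t per_thread = (n_rows + n_threads - 1) / n_threads;

    struct row_range r;
    r.start = per_thread * ith;
    r.end   = r.start + per_thread;

    // Clamp: trailing threads may receive a short or empty range.
    if (r.start > n_rows) {
        r.start = n_rows;
    }
    if (r.end > n_rows) {
        r.end = n_rows;
    }
    return r;
}
```

Because softmax rows are independent, threads working on disjoint row ranges need no synchronization beyond the final join.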

Usage

This implementation is dispatched from the DSP-side message loop to handle GGML_OP_SOFT_MAX operations.

Code Reference

Source Location

Repository ggml-org/ggml
File src/ggml-hexagon/htp/softmax-ops.c
Lines 395

Key Signatures

struct softmax_th_ctx {
    bool     use_f16;
    bool     use_src1;
    uint32_t n_head;
    uint32_t n_head_log2;
    float    scale;
    float    max_bias;
    float    m0;
    float    m1;
    struct htp_ops_context * octx;
};

static void init_softmax_ctx(struct softmax_th_ctx * softmax_ctx, struct htp_ops_context * octx);

static void hvx_fast_softmax_prep_f32(const uint8_t * restrict src, uint8_t * restrict dst,
    const int num_elems, float scale, const uint8_t * restrict mask, float slope);

I/O Contract

Inputs

  • src0 -- Input logits tensor
  • src1 -- Optional mask/bias tensor (may be FP16 or FP32)
  • op_params -- Contains scale and max_bias parameters

Outputs

  • dst -- Normalized softmax output (probabilities summing to 1.0 per row)

Usage Examples

Softmax with ALiBi support:

// Initialize the softmax context; this also derives the ALiBi slope bases
struct softmax_th_ctx ctx;
init_softmax_ctx(&ctx, octx);
// ctx.m0 = powf(2.0f, -max_bias / n_head_log2)
// ctx.m1 = powf(2.0f, -(max_bias / 2.0f) / n_head_log2)

// Perform vectorized softmax on one row: scale -> mask -> exp -> sum -> normalize
// (slope is the per-head ALiBi slope derived from ctx.m0 / ctx.m1)
hvx_fast_softmax_prep_f32(src, dst, num_elems, scale, mask, slope);
