Implementation:Ggml org Ggml Zendnn backend


Implementation Metadata
File Name src/ggml-zendnn/ggml-zendnn.cpp
Repository ggml-org/ggml
Lines 469
Language C++
Domain Tags ML_Infrastructure, Hardware_Abstraction, AMD_CPU
Status Active
Last Updated 2025-05-15 12:00 GMT
Knowledge Sources ggml-org/ggml repository

Overview

ggml-zendnn.cpp implements the ZenDNN backend, which accelerates matrix multiplication, the dominant workload, on AMD Zen CPUs. This focused backend (469 lines) calls into the ZenDNN library through its "lowoha" (Low-Overhead Hardware Abstraction) path.

Description

The ggml_backend_zendnn_context holds thread count and a work buffer. A template function ggml_to_zendnn_type maps C++ types (float, ggml_bf16_t) to zendnnl::common::data_type_t.
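One way to picture this type mapping is a function template with one explicit specialization per supported element type. The sketch below is illustrative only: bf16_sketch, dtype_sketch, and to_dtype_sketch are hypothetical stand-ins for ggml_bf16_t, zendnnl::common::data_type_t, and ggml_to_zendnn_type, chosen so the example compiles without ggml or ZenDNN. The backend's actual specializations may be written differently.

```cpp
#include <cstdint>

// Hypothetical stand-in for ggml_bf16_t (a 16-bit storage type).
struct bf16_sketch { uint16_t bits; };

// Hypothetical stand-in for zendnnl::common::data_type_t.
enum class dtype_sketch { none, f32, bf16 };

// Primary template: any type without a specialization maps to "none".
template <typename T>
constexpr dtype_sketch to_dtype_sketch() { return dtype_sketch::none; }

// One explicit specialization per supported element type.
template <>
constexpr dtype_sketch to_dtype_sketch<float>() { return dtype_sketch::f32; }

template <>
constexpr dtype_sketch to_dtype_sketch<bf16_sketch>() { return dtype_sketch::bf16; }
```

Because the mapping is resolved at compile time, a templated matmul wrapper can ask for the tag of each of its operand types without any runtime branching.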

The core ggml_zendnn_matmul template function uses zendnnl::lowoha::matmul_direct with:

  • Row-major layout
  • Transposed weights (the transpose flag is set to true, so the column-major weights are read as row-major)
  • alpha=1.0, beta=0.0
  • is_weights_const=true for weight transformation caching across calls
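To make the layout settings above concrete, here is a naive reference loop for the F32 case. It is a sketch under stated assumptions, not the backend's code: ref_matmul_sketch is a hypothetical helper that does none of the blocking, weight caching, or threading performed by matmul_direct. It assumes the weights A form a k-by-m matrix stored column by column (each column is k contiguous values, leading dimension lda), that the input B (n-by-k) and output C (n-by-m) are row-major, and that alpha=1.0 and beta=0.0, so C is simply overwritten.

```cpp
#include <cstdint>

// Hypothetical reference implementation of C = B * A for the layout above.
//   A: k x m, column-major, element (l, j) at A[j*lda + l]
//   B: n x k, row-major,    element (i, l) at B[i*ldb + l]
//   C: n x m, row-major,    element (i, j) at C[i*ldc + j]
static void ref_matmul_sketch(int64_t m, int64_t n, int64_t k,
                              const float * A, int64_t lda,
                              const float * B, int64_t ldb,
                              float * C, int64_t ldc) {
    for (int64_t i = 0; i < n; ++i) {          // rows of B and C
        for (int64_t j = 0; j < m; ++j) {      // columns of A and C
            float acc = 0.0f;
            for (int64_t l = 0; l < k; ++l) {  // shared dimension
                acc += B[i*ldb + l] * A[j*lda + l];
            }
            C[i*ldc + j] = acc;                // beta = 0: overwrite, do not accumulate
        }
    }
}
```

For example, with k=2, m=3, n=2, A = {1,2, 3,4, 5,6} (three columns of two values) and B = {1,0, 1,1}, the loop writes C = {1,3,5, 3,7,11}. Note that the inner loop walks both A and B contiguously, which is why a column-major weight matrix can be handed to a row-major routine with the transpose flag set.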

The ggml_zendnn_sgemm dispatcher selects type-specific instantiations:

  • F32 x F32 -> F32
  • BF16 x BF16 -> BF16
  • BF16 x BF16 -> F32
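The selection logic for these three combinations can be sketched as a small lookup on the runtime type tags. Everything named here is hypothetical: type_sketch stands in for the ggml type enum (its values are not the real ggml_type values), and kernel_sketch merely labels which templated instantiation would be chosen. The real dispatcher calls the matching ggml_zendnn_matmul instantiation directly and returns a bool.

```cpp
// Hypothetical type tags; not the actual ggml_type values.
enum type_sketch { TYPE_F32_SKETCH, TYPE_BF16_SKETCH, TYPE_OTHER_SKETCH };

// Labels for the three supported instantiations, plus a fallback.
enum class kernel_sketch { f32_f32_f32, bf16_bf16_bf16, bf16_bf16_f32, unsupported };

// Pick the instantiation for a given (A, B, C) type triple.
static kernel_sketch select_kernel_sketch(int Atype, int Btype, int Ctype) {
    if (Atype == TYPE_F32_SKETCH && Btype == TYPE_F32_SKETCH && Ctype == TYPE_F32_SKETCH) {
        return kernel_sketch::f32_f32_f32;
    }
    if (Atype == TYPE_BF16_SKETCH && Btype == TYPE_BF16_SKETCH) {
        if (Ctype == TYPE_BF16_SKETCH) return kernel_sketch::bf16_bf16_bf16;
        if (Ctype == TYPE_F32_SKETCH)  return kernel_sketch::bf16_bf16_f32;
    }
    // Any other combination (for example mixed F32/BF16 inputs) is not handled.
    return kernel_sketch::unsupported;
}
```

Returning an explicit "unsupported" result mirrors the boolean status in the real signature: a caller can fall back to another backend when no instantiation matches.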

Usage

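#include "ggml-backend.h"

int main(void) {
    // Load all available backends; ZenDNN registers on AMD Zen systems with ZenDNN installed
    ggml_backend_load_all();
    // Returns the highest-priority available device, which is not necessarily ZenDNN
    ggml_backend_t backend = ggml_backend_init_best();
    if (backend == NULL) { return 1; }  // no usable backend found
    // ...
    ggml_backend_free(backend);
    return 0;
}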

Code Reference

Source Location

Repository File Lines
ggml-org/ggml src/ggml-zendnn/ggml-zendnn.cpp 469

Key Signatures

struct ggml_backend_zendnn_context {
    int n_threads = GGML_DEFAULT_N_THREADS;
    std::unique_ptr<char[]> work_data;
    size_t work_size = 0;
};

template<typename T>
zendnnl::common::data_type_t ggml_to_zendnn_type();

template <typename TA, typename TB, typename TC>
static bool ggml_zendnn_matmul(ggml_backend_zendnn_context * ctx,
    int64_t m, int64_t n, int64_t k,
    const TA * A, int64_t lda,
    const TB * B, int64_t ldb,
    TC * C, int64_t ldc);

static bool ggml_zendnn_sgemm(ggml_backend_zendnn_context * ctx,
    int64_t m, int64_t n, int64_t k,
    const void * A, int64_t lda,
    const void * B, int64_t ldb,
    void * C, int64_t ldc,
    int Atype, int Btype, int Ctype);

I/O Contract

Inputs

  • A (weights) -- Weight matrix, shape (k, m), column-major
  • B (input) -- Input matrix, shape (n, k), row-major
  • Type parameters -- F32 or BF16 for each matrix

Outputs

  • C (output) -- Result matrix, shape (n, m), row-major
  • Boolean status -- true on success, false on failure

Usage Examples

Matrix multiplication with ZenDNN:

// ZenDNN computes C = B * A where:
//   A: weights [k, m] column-major
//   B: input   [n, k] row-major
//   C: output  [n, m] row-major
//
// The lowoha path provides:
// - Direct matmul execution with minimal API overhead
// - Weight transformation caching via is_weights_const=true
// - Multi-threaded execution via ctx->n_threads
