Implementation:Ggml org Ggml Hexagon backend

**Implementation Metadata**
File Name	`src/ggml-hexagon/ggml-hexagon.cpp`
Repository	ggml-org/ggml
Lines	3187
Language	C++
Domain Tags	ML_Infrastructure, Hardware_Abstraction, DSP_Computing
Status	Active
Last Updated	2025-05-15 12:00 GMT
Knowledge Sources	ggml-org/ggml repository

Overview

ggml-hexagon.cpp is the main host-side implementation of the GGML Hexagon backend. It implements the GGML backend interface for offloading tensor operations to Qualcomm Hexagon DSPs, connecting GGML's backend abstraction layer to the DSP hardware: supported tensor operations are dispatched from the host to the DSP for accelerated inference.

Description

The file defines ggml_hexagon_session, which manages communication with the DSP via FastRPC and dspqueue message queues. Each operation is serialized into an htp_general_req message containing tensor descriptors (shapes, strides, data pointers) and enqueued to the DSP. Runtime options configure the number of devices, the number of HVX threads, the architecture version, profiling, and an operation mask that selects which stages (queue, quantize, compute) are executed.
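
The serialization step described above can be pictured with a small sketch. Everything in it is hypothetical: sketch_tensor_desc and sketch_req are illustrative stand-ins, not the real htp_general_req layout, and the field names, field order, and four-dimension limit are assumptions made only for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical tensor descriptor (NOT the real htp_general_req field layout).
struct sketch_tensor_desc {
    uint32_t                type;  // element type id (assumed encoding)
    std::array<uint32_t, 4> ne;    // shape: number of elements per dimension
    std::array<uint32_t, 4> nb;    // strides in bytes per dimension
    uint64_t                data;  // offset of the data inside a shared buffer
};

// Hypothetical request message: one operation with one source and one destination.
struct sketch_req {
    uint32_t           op;     // operation id (assumed encoding)
    uint32_t           flags;  // request flags (assumed encoding)
    sketch_tensor_desc src0;
    sketch_tensor_desc dst;
};

static_assert(std::is_trivially_copyable<sketch_req>::value,
              "the message must be copyable byte-for-byte");

// Build a descriptor for a contiguous tensor: stride 0 is the element size,
// and each further stride is the previous stride times the previous extent.
inline sketch_tensor_desc sketch_make_desc(uint32_t type, uint32_t elem_size,
                                           std::array<uint32_t, 4> ne, uint64_t data) {
    sketch_tensor_desc d{};
    d.type  = type;
    d.ne    = ne;
    d.data  = data;
    d.nb[0] = elem_size;
    for (size_t i = 1; i < d.nb.size(); i++) {
        d.nb[i] = d.nb[i - 1] * ne[i - 1];
    }
    return d;
}

// Copy the message into a flat byte buffer, as would be handed to a queue write.
inline std::vector<uint8_t> sketch_serialize(const sketch_req & req) {
    std::vector<uint8_t> buf(sizeof(req));
    std::memcpy(buf.data(), &req, sizeof(req));
    return buf;
}
```

The point of the sketch is that the host sends descriptors and buffer offsets rather than tensor contents; the tensor data itself stays in memory that both sides can already reach.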

Key architectural features include:

Multi-stage operation pipeline -- controlled by opt_opmask with HTP_OPMASK_QUEUE, HTP_OPMASK_QUANTIZE, and HTP_OPMASK_COMPUTE stages
Shared memory management -- uses rpcmem for allocating memory shared between host and DSP
Debug/profiling infrastructure -- formats tensor dimensions, types, strides, and buffer names via op_desc
Configurable runtime options -- including opt_ndev, opt_nhvx, opt_arch, opt_etm, opt_verbose, opt_profile, and opt_hostbuf
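
As a rough illustration of how a stage mask such as opt_opmask could gate the pipeline, the sketch below decodes a bitmask into three stage switches. The SKETCH_OPMASK_* names mirror the HTP_OPMASK_* constants mentioned above, but the bit positions are assumptions, and sketch_decode_opmask is a hypothetical helper that does not appear in the source file.

```cpp
#include <cstdint>

// Bit values are assumptions; only the stage names come from the page.
enum : uint32_t {
    SKETCH_OPMASK_QUEUE    = 1u << 0,  // enqueue the request to the DSP
    SKETCH_OPMASK_QUANTIZE = 1u << 1,  // run the quantization stage
    SKETCH_OPMASK_COMPUTE  = 1u << 2,  // run the compute stage
};

// Which pipeline stages are enabled for a given mask.
struct sketch_stages {
    bool queue;
    bool quantize;
    bool compute;
};

// Decode the mask into per-stage switches.
inline sketch_stages sketch_decode_opmask(uint32_t opmask) {
    sketch_stages s{};
    s.queue    = (opmask & SKETCH_OPMASK_QUEUE)    != 0;
    s.quantize = (opmask & SKETCH_OPMASK_QUANTIZE) != 0;
    s.compute  = (opmask & SKETCH_OPMASK_COMPUTE)  != 0;
    return s;
}
```

A mask like this makes it possible to disable one stage at a time, for example to measure queueing overhead without running the compute stage.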

Usage

The Hexagon backend is loaded automatically by ggml_backend_load_all() on systems with Qualcomm Hexagon DSP hardware. It requires the Hexagon SDK runtime libraries (libcdsprpc.so) to be present:

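#include "ggml-backend.h"

int main(void) {
    // Load all available backends; Hexagon registers itself if DSP hardware is available
    ggml_backend_load_all();
    ggml_backend_t backend = ggml_backend_init_best();
    if (!backend) {
        return 1; // no usable backend could be initialized
    }
    // ...
    ggml_backend_free(backend);
    return 0;
}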

Code Reference

Source Location

Repository	File	Lines
ggml-org/ggml	`src/ggml-hexagon/ggml-hexagon.cpp`	3187

Key Signatures

// Static configuration options
static size_t opt_ndev         = 1;
static size_t opt_nhvx         = 0; // use all
static int    opt_arch         = 0; // autodetect

// Debug helpers
static void ggml_hexagon_dump_op_exec(const std::string &sess_name, const ggml_tensor * op, const uint32_t req_flags);
static void ggml_hexagon_dump_op_supp(const std::string &sess_name, const struct ggml_tensor * op, bool supp);
static void ggml_hexagon_dump_op_prof(const std::string &sess_name, const ggml_tensor * op,
                                      uint32_t op_usec, uint32_t op_cycles, uint32_t op_pkts, uint64_t call_usec);

// Utility functions
static inline uint64_t hex_is_aligned(void * addr, uint32_t align);
static inline size_t hex_round_up(size_t n, size_t m);
static const char * status_to_str(uint32_t status);
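
The two utility signatures above suggest the usual alignment helpers. The following is a minimal sketch of what such helpers typically look like, not the file's actual bodies: the sketch_ names are hypothetical, the alignment check assumes a power-of-two alignment, and it returns bool where the listed signature returns uint64_t.

```cpp
#include <cstddef>
#include <cstdint>

// True if addr is a multiple of align. Assumes align is a power of two.
static inline bool sketch_is_aligned(const void * addr, uint32_t align) {
    return (reinterpret_cast<uintptr_t>(addr) & (static_cast<uintptr_t>(align) - 1)) == 0;
}

// Round n up to the next multiple of m. Assumes m is non-zero.
static inline size_t sketch_round_up(size_t n, size_t m) {
    return m * ((n + m - 1) / m);
}
```

Helpers of this kind are commonly used to pad buffer sizes and validate pointers before sharing memory with a DSP, where vector units tend to require aligned accesses.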

I/O Contract

Inputs

Tensor operations -- GGML tensor graph nodes dispatched through the backend interface
Configuration -- Runtime options via environment variables or static configuration (device count, HVX threads, architecture)
Model data -- Tensor data in shared memory buffers allocated via rpcmem
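
One plausible way to override static defaults such as opt_ndev from the environment is sketched below. The helper names are hypothetical, and no actual environment variable names are implied: the page does not list them, so any name passed to sketch_env_size is a placeholder.

```cpp
#include <cstddef>
#include <cstdlib>

// Parse a non-negative integer option; return fallback on missing or malformed text.
// Base 0 lets strtoull accept decimal, octal (leading 0), and hex (leading 0x).
inline size_t sketch_parse_size(const char * text, size_t fallback) {
    if (text == nullptr || *text == '\0') {
        return fallback;
    }
    char * end = nullptr;
    unsigned long long value = std::strtoull(text, &end, 0);
    if (end == text || *end != '\0') {
        return fallback;  // no digits consumed, or trailing garbage
    }
    return static_cast<size_t>(value);
}

// Read an option from the environment, keeping the static default when unset.
inline size_t sketch_env_size(const char * name, size_t fallback) {
    return sketch_parse_size(std::getenv(name), fallback);
}
```

Falling back to the compiled-in default on malformed input keeps a typo in the environment from silently turning an option into zero.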

Outputs

Computed tensors -- Results written back to shared memory buffers accessible by the host
Profiling data -- Optional timing information (microseconds, cycles, packets) per operation
Status codes -- HTP_STATUS_OK, HTP_STATUS_NO_SUPPORT, HTP_STATUS_INVAL_PARAMS, HTP_STATUS_VTCM_TOO_SMALL, HTP_STATUS_INTERNAL_ERR
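
A status-to-string mapping along the lines of status_to_str could look like the sketch below. The status names are taken from the list above, but the numeric values are assumptions chosen for illustration, and sketch_describe_status is a hypothetical logging helper that is not part of the source file.

```cpp
#include <cstdint>
#include <string>

// Names come from the page; the numeric values here are assumptions.
enum sketch_status : uint32_t {
    SKETCH_STATUS_OK             = 0,
    SKETCH_STATUS_NO_SUPPORT     = 1,
    SKETCH_STATUS_INVAL_PARAMS   = 2,
    SKETCH_STATUS_VTCM_TOO_SMALL = 3,
    SKETCH_STATUS_INTERNAL_ERR   = 4,
};

// Map a status code to its symbolic name for log output.
static const char * sketch_status_to_str(uint32_t status) {
    switch (status) {
        case SKETCH_STATUS_OK:             return "HTP_STATUS_OK";
        case SKETCH_STATUS_NO_SUPPORT:     return "HTP_STATUS_NO_SUPPORT";
        case SKETCH_STATUS_INVAL_PARAMS:   return "HTP_STATUS_INVAL_PARAMS";
        case SKETCH_STATUS_VTCM_TOO_SMALL: return "HTP_STATUS_VTCM_TOO_SMALL";
        case SKETCH_STATUS_INTERNAL_ERR:   return "HTP_STATUS_INTERNAL_ERR";
        default:                           return "unknown status";
    }
}

// Build a log-friendly description that keeps the raw code visible.
static std::string sketch_describe_status(uint32_t status) {
    return std::string(sketch_status_to_str(status)) + " (" + std::to_string(status) + ")";
}
```

Keeping the raw numeric code in the message helps when the DSP side returns a value the host-side table does not know about.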

Usage Examples

Backend initialization with Hexagon DSP:

#include <stdio.h>

#include "ggml-backend.h"
#include "ggml-hexagon.h"

int main(void) {
    ggml_backend_load_all();

    // The Hexagon backend registers automatically if hardware is detected.
    // List every registered device; a Hexagon device appears here when present.
    for (size_t i = 0; i < ggml_backend_dev_count(); i++) {
        ggml_backend_dev_t dev = ggml_backend_dev_get(i);
        printf("Device: %s\n", ggml_backend_dev_name(dev));
    }

    return 0;
}

Related Pages

Implements Principle

Principle:Ggml_org_Ggml_Hexagon_DSP_Computation

Related Implementations

Implementation:Ggml_org_Ggml_Hexagon_htp_driver -- Driver loading layer
Implementation:Ggml_org_Ggml_Hexagon_htp_main -- DSP-side message dispatcher
Implementation:Ggml_org_Ggml_Hexagon_matmul_ops -- Matrix multiplication operations
Implementation:Ggml_org_Ggml_Hexagon_flash_attn -- Flash attention operations
Implementation:Ggml_org_Ggml_Backend_impl_interface -- Backend interface contract
