Implementation:Ggml org Ggml Hexagon backend

From Leeroopedia


Implementation Metadata
File Name src/ggml-hexagon/ggml-hexagon.cpp
Repository ggml-org/ggml
Lines 3187
Language C++
Domain Tags ML_Infrastructure, Hardware_Abstraction, DSP_Computing
Status Active
Last Updated 2025-05-15 12:00 GMT
Knowledge Sources ggml-org/ggml repository

Overview

ggml-hexagon.cpp is the main host-side implementation of the GGML Hexagon backend. It implements the GGML backend interface for offloading tensor operations to Qualcomm Hexagon DSPs, connecting GGML's backend abstraction layer to the DSP hardware: operations routed to this backend are dispatched from the host to the DSP for accelerated inference.

Description

The file defines ggml_hexagon_session, which manages communication with the DSP via FastRPC and dspqueue message queues. Each operation is serialized into an htp_general_req message containing tensor descriptors (shapes, strides, data pointers) and enqueued to the DSP. Runtime options configure the number of devices, the number of HVX threads, the architecture version, profiling, and an operation mask that enables or disables the queue, quantize, and compute stages.
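The serialization step described above can be sketched as follows. This is a minimal illustration only: the struct layouts, field names, and helper functions (tensor_desc, request_msg, make_desc, serialize) are hypothetical and are not the real htp_general_req definition, which is not reproduced on this page.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical tensor descriptor (NOT the real htp_general_req layout).
struct tensor_desc {
    uint32_t                type;  // element type id
    std::array<uint32_t, 4> ne;    // shape: number of elements per dimension
    std::array<uint32_t, 4> nb;    // strides in bytes per dimension
    uint64_t                data;  // address or offset inside a shared buffer
};

// Hypothetical request message: one source tensor and one destination tensor.
struct request_msg {
    uint32_t    op;     // operation id
    uint32_t    flags;  // request flags
    tensor_desc src0;
    tensor_desc dst;
};

// Build a descriptor for a contiguous tensor: nb[0] is the element size and
// each further stride is the previous stride times the previous dimension.
inline tensor_desc make_desc(uint32_t type, uint32_t elem_size,
                             std::array<uint32_t, 4> ne, uint64_t data) {
    tensor_desc d{};
    d.type  = type;
    d.ne    = ne;
    d.data  = data;
    d.nb[0] = elem_size;
    for (size_t i = 1; i < 4; i++) {
        d.nb[i] = d.nb[i - 1] * ne[i - 1];
    }
    return d;
}

// Copy the fixed-size message into a byte buffer ready to be enqueued.
inline std::vector<uint8_t> serialize(const request_msg & req) {
    std::vector<uint8_t> buf(sizeof(req));
    std::memcpy(buf.data(), &req, sizeof(req));
    return buf;
}
```

A fixed-size, trivially copyable message keeps the host side cheap: only descriptors travel through the queue, while the tensor data itself stays in shared memory.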

Key architectural features include:

  • Multi-stage operation pipeline -- controlled by opt_opmask with HTP_OPMASK_QUEUE, HTP_OPMASK_QUANTIZE, and HTP_OPMASK_COMPUTE stages
  • Shared memory management -- uses rpcmem for allocating memory shared between host and DSP
  • Debug/profiling infrastructure -- formats tensor dimensions, types, strides, and buffer names via op_desc
  • Configurable runtime options -- including opt_ndev, opt_nhvx, opt_arch, opt_etm, opt_verbose, opt_profile, and opt_hostbuf
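The stage mask in the first bullet can be pictured as a plain bitmask. The sketch below reuses the HTP_OPMASK_* names from this page, but the numeric bit values and the helper functions (stage_enabled, describe_opmask) are assumptions made for illustration; the actual values are defined in the backend sources.

```cpp
#include <cstdint>
#include <string>

// Bit values are ASSUMED for illustration; check the backend headers for the real ones.
enum : uint32_t {
    HTP_OPMASK_QUEUE    = 1u << 0,  // enqueue requests to the DSP
    HTP_OPMASK_QUANTIZE = 1u << 1,  // run the quantize stage
    HTP_OPMASK_COMPUTE  = 1u << 2,  // run the compute stage
};

// Hypothetical helper: true if the given stage bit is set in the mask.
inline bool stage_enabled(uint32_t opmask, uint32_t stage) {
    return (opmask & stage) != 0;
}

// Hypothetical helper: human-readable list of the enabled stages.
inline std::string describe_opmask(uint32_t opmask) {
    std::string out;
    if (stage_enabled(opmask, HTP_OPMASK_QUEUE))    out += "queue ";
    if (stage_enabled(opmask, HTP_OPMASK_QUANTIZE)) out += "quantize ";
    if (stage_enabled(opmask, HTP_OPMASK_COMPUTE))  out += "compute ";
    if (!out.empty()) out.pop_back();  // drop the trailing space
    return out;
}
```

Masking out individual stages is mainly useful for isolating overhead, for example measuring queueing cost with the compute stage disabled.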

Usage

The Hexagon backend is loaded automatically by ggml_backend_load_all() on systems with Qualcomm Hexagon DSP hardware. It requires the Hexagon SDK runtime libraries (libcdsprpc.so) to be present:

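#include "ggml-backend.h"

int main(void) {
    // Load all available backends; Hexagon registers only if the DSP and its runtime are present
    ggml_backend_load_all();

    // Initializes the best available device, which is not necessarily the Hexagon DSP
    ggml_backend_t backend = ggml_backend_init_best();
    if (!backend) {
        return 1;
    }
    // ... build and compute graphs with the backend ...
    ggml_backend_free(backend);
    return 0;
}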

Code Reference

Source Location

Repository      File                                Lines
ggml-org/ggml   src/ggml-hexagon/ggml-hexagon.cpp   3187

Key Signatures

// Static configuration options
static size_t opt_ndev         = 1;
static size_t opt_nhvx         = 0; // use all
static int    opt_arch         = 0; // autodetect

// Debug helpers
static void ggml_hexagon_dump_op_exec(const std::string &sess_name, const ggml_tensor * op, const uint32_t req_flags);
static void ggml_hexagon_dump_op_supp(const std::string &sess_name, const struct ggml_tensor * op, bool supp);
static void ggml_hexagon_dump_op_prof(const std::string &sess_name, const ggml_tensor * op,
                                      uint32_t op_usec, uint32_t op_cycles, uint32_t op_pkts, uint64_t call_usec);

// Utility functions
static inline uint64_t hex_is_aligned(void * addr, uint32_t align);
static inline size_t hex_round_up(size_t n, size_t m);
static const char * status_to_str(uint32_t status);
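Only the signatures of the utility functions are listed above. As a rough sketch of what alignment helpers of this kind typically do (the bodies and the sketch_ names below are assumptions, not copied from ggml-hexagon.cpp):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed behaviour: true if addr is a multiple of align.
// align must be a power of two for the bit trick to be valid.
inline bool sketch_is_aligned(const void * addr, uint32_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (reinterpret_cast<uintptr_t>(addr) & (align - 1)) == 0;
}

// Assumed behaviour: round n up to the next multiple of m (m > 0).
inline size_t sketch_round_up(size_t n, size_t m) {
    assert(m != 0);
    return m * ((n + m - 1) / m);
}
```

Helpers like these are commonly used to pad buffer sizes and check pointers before handing memory to a DSP, where vector units expect aligned data.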

I/O Contract

Inputs

  • Tensor operations -- GGML tensor graph nodes dispatched through the backend interface
  • Configuration -- Runtime options via environment variables or static configuration (device count, HVX threads, architecture)
  • Model data -- Tensor data in shared memory buffers allocated via rpcmem
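One plausible way to populate the configuration inputs from the environment is sketched below. The environment variable names (GGML_HEXAGON_NDEV, GGML_HEXAGON_NHVX, GGML_HEXAGON_ARCH), the hexagon_opts struct, and both helper functions are assumptions for illustration; this page does not list the actual variable names.

```cpp
#include <cstddef>
#include <cstdlib>

// Defaults mirror the static options shown under Key Signatures.
struct hexagon_opts {
    size_t ndev = 1;  // number of devices
    size_t nhvx = 0;  // 0 = use all HVX threads
    int    arch = 0;  // 0 = autodetect
};

// Parse a decimal integer; return fallback for null, empty, or malformed input.
inline long parse_long(const char * s, long fallback) {
    if (s == nullptr || *s == '\0') {
        return fallback;
    }
    char * end = nullptr;
    long   v   = std::strtol(s, &end, 10);
    return (*end == '\0') ? v : fallback;
}

// Read options from the environment (variable names are HYPOTHETICAL).
// Negative values are ignored and the default is kept.
inline hexagon_opts load_opts_from_env() {
    hexagon_opts o;
    long ndev = parse_long(std::getenv("GGML_HEXAGON_NDEV"), (long) o.ndev);
    long nhvx = parse_long(std::getenv("GGML_HEXAGON_NHVX"), (long) o.nhvx);
    long arch = parse_long(std::getenv("GGML_HEXAGON_ARCH"), (long) o.arch);
    if (ndev >= 0) o.ndev = (size_t) ndev;
    if (nhvx >= 0) o.nhvx = (size_t) nhvx;
    if (arch >= 0) o.arch = (int) arch;
    return o;
}
```

Falling back to the defaults on malformed input keeps a mistyped variable from silently disabling the backend.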

Outputs

  • Computed tensors -- Results written back to shared memory buffers accessible by the host
  • Profiling data -- Optional timing information (microseconds, cycles, packets) per operation
  • Status codes -- HTP_STATUS_OK, HTP_STATUS_NO_SUPPORT, HTP_STATUS_INVAL_PARAMS, HTP_STATUS_VTCM_TOO_SMALL, HTP_STATUS_INTERNAL_ERR

Usage Examples

Backend initialization with Hexagon DSP:

#include <cstdio>

#include "ggml-backend.h"
#include "ggml-hexagon.h"

int main(void) {
    ggml_backend_load_all();

    // List every registered device; a Hexagon entry appears only if the hardware is detected
    for (size_t i = 0; i < ggml_backend_dev_count(); i++) {
        ggml_backend_dev_t dev = ggml_backend_dev_get(i);
        printf("Device: %s\n", ggml_backend_dev_name(dev));
    }

    return 0;
}
