Implementation:Ggml org Ggml Hexagon backend
| File Name | src/ggml-hexagon/ggml-hexagon.cpp |
| Repository | ggml-org/ggml |
| Lines | 3187 |
| Language | C++ |
| Domain Tags | ML_Infrastructure, Hardware_Abstraction, DSP_Computing |
| Status | Active |
| Last Updated | 2025-05-15 12:00 GMT |
| Knowledge Sources | ggml-org/ggml repository |
Overview
ggml-hexagon.cpp is the main host-side implementation of the GGML Hexagon backend. It implements the GGML backend interface for offloading tensor operations to Qualcomm Hexagon DSPs, connecting GGML's backend abstraction layer to the DSP hardware: the host dispatches supported tensor operations to the DSP for accelerated inference.
Description
The file defines ggml_hexagon_session, which manages DSP communication via FastRPC and dspqueue (message queues). Each operation is serialized into an htp_general_req message containing tensor descriptors (shapes, strides, data pointers) and enqueued to the DSP. Configurable options cover the number of devices, the number of HVX threads, the architecture version, profiling, and an operation mask that selects the queue, quantize, and compute stages.
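The idea of a request that carries tensor descriptors can be illustrated with a minimal sketch. The field layout of htp_general_req is not shown on this page, so sketch_tensor_desc, sketch_req, and sketch_describe below are hypothetical names introduced only for illustration; the stride rule assumes a densely packed, non-quantized tensor.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical descriptor: the real htp_general_req field layout is not
// shown on this page, so every name below is illustrative only.
struct sketch_tensor_desc {
    std::array<uint32_t, 4> ne;   // number of elements in each dimension
    std::array<uint32_t, 4> nb;   // stride in bytes for each dimension
    uint64_t                data; // location of the tensor data in shared memory
};

// Hypothetical request: one operation with one source and one destination.
struct sketch_req {
    uint32_t           op;    // operation code
    uint32_t           flags; // per-request flags
    sketch_tensor_desc src0;  // input tensor
    sketch_tensor_desc dst;   // output tensor
};

// Build a descriptor for a densely packed tensor: the innermost stride is the
// element size, and each outer stride is the previous stride times the
// previous dimension's element count.
inline sketch_tensor_desc sketch_describe(const std::array<uint32_t, 4> & ne,
                                          uint32_t elem_size, uint64_t data) {
    assert(elem_size > 0);
    sketch_tensor_desc d{};
    d.ne    = ne;
    d.nb[0] = elem_size;
    for (std::size_t i = 1; i < d.nb.size(); ++i) {
        d.nb[i] = d.nb[i - 1] * d.ne[i - 1];
    }
    d.data = data;
    return d;
}
```

Because the message is a plain struct with fixed-width fields, it can be copied into a queue buffer as-is, which is one plausible reason to prefer this shape for host-to-DSP messages.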
Key architectural features include:
- Multi-stage operation pipeline -- controlled by opt_opmask with HTP_OPMASK_QUEUE, HTP_OPMASK_QUANTIZE, and HTP_OPMASK_COMPUTE stages
- Shared memory management -- uses rpcmem for allocating memory shared between host and DSP
- Debug/profiling infrastructure -- formats tensor dimensions, types, strides, and buffer names via op_desc
- Configurable runtime options -- including opt_ndev, opt_nhvx, opt_arch, opt_etm, opt_verbose, opt_profile, and opt_hostbuf
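A stage mask of this kind is usually a set of bit flags that can be combined and tested independently. The sketch below shows that pattern only; the bit values and the SKETCH_ names are assumptions, and the real HTP_OPMASK_* constants may be defined differently.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical bit values; the real HTP_OPMASK_* constants may differ.
enum sketch_opmask : uint32_t {
    SKETCH_OPMASK_QUEUE    = 1u << 0, // enqueue the request to the DSP
    SKETCH_OPMASK_QUANTIZE = 1u << 1, // run the quantize stage
    SKETCH_OPMASK_COMPUTE  = 1u << 2, // run the compute stage
    SKETCH_OPMASK_ALL      = SKETCH_OPMASK_QUEUE | SKETCH_OPMASK_QUANTIZE | SKETCH_OPMASK_COMPUTE,
};

// True if the given stage bit is set in the mask.
inline bool sketch_stage_enabled(uint32_t mask, sketch_opmask stage) {
    return (mask & stage) != 0;
}

// Render the enabled stages as text, e.g. for a debug log line.
inline std::string sketch_enabled_stages(uint32_t mask) {
    // Reject bits that do not correspond to a known stage.
    assert((mask & ~static_cast<uint32_t>(SKETCH_OPMASK_ALL)) == 0);
    const struct { sketch_opmask bit; const char * name; } stages[] = {
        { SKETCH_OPMASK_QUEUE,    "queue"    },
        { SKETCH_OPMASK_QUANTIZE, "quantize" },
        { SKETCH_OPMASK_COMPUTE,  "compute"  },
    };
    std::string out;
    for (const auto & s : stages) {
        if (sketch_stage_enabled(mask, s.bit)) {
            if (!out.empty()) {
                out += '+';
            }
            out += s.name;
        }
    }
    if (out.empty()) {
        out = "none";
    }
    return out;
}
```

Clearing a single bit is a cheap way to isolate one stage while debugging or profiling, for example measuring queueing overhead with the compute stage disabled.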
Usage
The Hexagon backend is loaded automatically by ggml_backend_load_all() on systems with Qualcomm Hexagon DSP hardware. It requires the Hexagon SDK runtime libraries (libcdsprpc.so) to be present:
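#include "ggml-backend.h"
int main(void) {
    // Load all available backends; Hexagon registers itself if DSP hardware is available
    ggml_backend_load_all();
    // Initialize the best available backend (may be NULL if no device could be initialized)
    ggml_backend_t backend = ggml_backend_init_best();
    if (!backend) {
        return 1;
    }
    // ...
    ggml_backend_free(backend);
    return 0;
}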
Code Reference
Source Location
| Repository | File | Lines |
|---|---|---|
| ggml-org/ggml | src/ggml-hexagon/ggml-hexagon.cpp | 3187 |
Key Signatures
// Static configuration options
static size_t opt_ndev = 1;
static size_t opt_nhvx = 0; // use all
static int opt_arch = 0; // autodetect
// Debug helpers
static void ggml_hexagon_dump_op_exec(const std::string &sess_name, const ggml_tensor * op, const uint32_t req_flags);
static void ggml_hexagon_dump_op_supp(const std::string &sess_name, const struct ggml_tensor * op, bool supp);
static void ggml_hexagon_dump_op_prof(const std::string &sess_name, const ggml_tensor * op,
uint32_t op_usec, uint32_t op_cycles, uint32_t op_pkts, uint64_t call_usec);
// Utility functions
static inline uint64_t hex_is_aligned(void * addr, uint32_t align);
static inline size_t hex_round_up(size_t n, size_t m);
static const char * status_to_str(uint32_t status);
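Only the signatures of the utility functions are listed here, not their bodies. As a rough illustration of what helpers with these shapes typically do, the sketch below gives one possible implementation under different names (sketch_round_up, sketch_is_aligned); it assumes the alignment is a power of two and is not taken from the source file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round n up to the next multiple of m (m must be non-zero).
inline std::size_t sketch_round_up(std::size_t n, std::size_t m) {
    assert(m != 0);
    return m * ((n + m - 1) / m);
}

// Check whether addr is a multiple of align.
// Assumption: align is a power of two, so a bit mask can replace a modulo.
inline bool sketch_is_aligned(const void * addr, uint32_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(addr) & (align - 1)) == 0;
}
```

Helpers like these are commonly used to pad buffer sizes and validate pointers before handing memory to hardware with alignment requirements.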
I/O Contract
Inputs
- Tensor operations -- GGML tensor graph nodes dispatched through the backend interface
- Configuration -- Runtime options via environment variables or static configuration (device count, HVX threads, architecture)
- Model data -- Tensor data in shared memory buffers allocated via rpcmem
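Reading a numeric option from the environment with a static default as fallback could look like the sketch below. The helper names and the environment variable name in the usage note are hypothetical; this page does not list the variable names the backend actually reads.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>

// Parse a non-negative decimal integer; return fallback if text is missing,
// does not start with a digit, or has trailing non-digit characters.
inline std::size_t sketch_parse_size(const char * text, std::size_t fallback) {
    if (text == nullptr || !std::isdigit(static_cast<unsigned char>(text[0]))) {
        return fallback;
    }
    char * end = nullptr;
    const unsigned long long value = std::strtoull(text, &end, 10);
    if (*end != '\0') {
        return fallback;
    }
    return static_cast<std::size_t>(value);
}

// Look up an environment variable and parse it, keeping the default if unset.
inline std::size_t sketch_env_size(const char * name, std::size_t fallback) {
    assert(name != nullptr);
    return sketch_parse_size(std::getenv(name), fallback);
}
```

A static option such as opt_ndev could then be overridden at startup with a call like sketch_env_size("SKETCH_HEXAGON_NDEV", opt_ndev), where the variable name is a placeholder.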
Outputs
- Computed tensors -- Results written back to shared memory buffers accessible by the host
- Profiling data -- Optional timing information (microseconds, cycles, packets) per operation
- Status codes -- HTP_STATUS_OK, HTP_STATUS_NO_SUPPORT, HTP_STATUS_INVAL_PARAMS, HTP_STATUS_VTCM_TOO_SMALL, HTP_STATUS_INTERNAL_ERR
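Status codes like these are typically mapped to readable strings for logging, which is what the status_to_str signature suggests. The sketch below shows that mapping under assumed numeric values; the SKETCH_ enumerators mirror the names listed above, but their values and the sketch_status_message helper are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical numeric values; the real HTP_STATUS_* constants may differ.
enum sketch_status : uint32_t {
    SKETCH_STATUS_OK             = 0,
    SKETCH_STATUS_NO_SUPPORT     = 1,
    SKETCH_STATUS_INVAL_PARAMS   = 2,
    SKETCH_STATUS_VTCM_TOO_SMALL = 3,
    SKETCH_STATUS_INTERNAL_ERR   = 4,
};

// Map a status code to a short name; unrecognized codes map to "UNKNOWN".
inline const char * sketch_status_to_str(uint32_t status) {
    switch (status) {
        case SKETCH_STATUS_OK:             return "OK";
        case SKETCH_STATUS_NO_SUPPORT:     return "NO_SUPPORT";
        case SKETCH_STATUS_INVAL_PARAMS:   return "INVAL_PARAMS";
        case SKETCH_STATUS_VTCM_TOO_SMALL: return "VTCM_TOO_SMALL";
        case SKETCH_STATUS_INTERNAL_ERR:   return "INTERNAL_ERR";
        default:                           return "UNKNOWN";
    }
}

// Build a log-style message that pairs an operation name with its status.
inline std::string sketch_status_message(const std::string & op_name, uint32_t status) {
    assert(!op_name.empty());
    return op_name + ": " + sketch_status_to_str(status);
}
```

Taking a raw uint32_t rather than the enum type keeps the function safe to call on values received from the DSP that fall outside the known set.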
Usage Examples
Backend initialization with Hexagon DSP:
#include <stdio.h>
#include "ggml-backend.h"
#include "ggml-hexagon.h"
int main(void) {
    ggml_backend_load_all();
    // The Hexagon backend registers automatically if hardware is detected
    for (size_t i = 0; i < ggml_backend_dev_count(); i++) {
        ggml_backend_dev_t dev = ggml_backend_dev_get(i);
        printf("Device: %s\n", ggml_backend_dev_name(dev));
    }
    return 0;
}
Related Pages
Implements Principle
Related Implementations
- Implementation:Ggml_org_Ggml_Hexagon_htp_driver -- Driver loading layer
- Implementation:Ggml_org_Ggml_Hexagon_htp_main -- DSP-side message dispatcher
- Implementation:Ggml_org_Ggml_Hexagon_matmul_ops -- Matrix multiplication operations
- Implementation:Ggml_org_Ggml_Hexagon_flash_attn -- Flash attention operations
- Implementation:Ggml_org_Ggml_Backend_impl_interface -- Backend interface contract