

Implementation:Ggml org Ggml Cpu spacemit ime

From Leeroopedia


Metadata

Field | Value
Page Type | Implementation (SpacemiT IME Backend)
Knowledge Sources | GGML
Domains | ML_Infrastructure, Tensor_Computing, CPU_Backend, Quantized_Matrix_Multiplication
Last Updated | 2025-05-15 12:00 GMT

Overview

Implements the GGML backend integration for SpacemiT IME (Inference Matrix Engine) on RISC-V processors, providing hardware-accelerated quantized matrix multiplication.

Description

spacemit/ime.cpp enables hardware-accelerated quantized inference on SpacemiT RISC-V processors (e.g., SpacemiT K1/X60) that include a dedicated Inference Matrix Engine. Key components include:

  1. Build requirements: Requires RISC-V V extension (__riscv_v), Zfh extension (__riscv_zfh), and the RISCV64_SPACEMIT_IME1 build flag. Compilation fails with descriptive errors if requirements are unmet.
  2. GEMM arguments: qnbitgemm_spacemit_ime_args carries matrix pointers (a_ptr, packed_quant_b_data), strides (lda, ldc), quantization scales and zero points, bias, and output buffer.
  3. Int8-by-Int4 GEMM: sqnbitgemm_spacemit_ime_i8i4 performs the int8 activation by int4 weight quantized GEMM with block-wise scaling. It dispatches to sqnbitgemm_spacemit_ime::ime1::gemm_kernel_i8i4 for the hardware-accelerated inner kernel.
  4. Weight packing: Weights are stored in an interleaved block layout for efficient IME access. Each block<K, N> template struct holds N quantization blocks, with their deltas (scales) grouped together alongside the packed quants.
  5. Tensor traits: Implements custom tensor_traits_base and tensor_traits_common for work size calculation and compute dispatch.
  6. Extra buffer type: Provides ggml_backend_cpu_riscv64_spacemit_buffer_type() for registering SpacemiT IME as an accelerated backend.
  7. AI core detection: Estimates the number of available AI cores as std::thread::hardware_concurrency() / 2. This is a heuristic based on the reported hardware thread count, not a query of the hardware itself.
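
To make the block-wise scaling in item 3 concrete, the following is a minimal scalar reference sketch of an int8-by-int4 dot product. It is not the IME kernel and does not use the interleaved layout from ime.cpp. The struct names (ref_block_i8, ref_block_i4) and the function ref_dot_i8i4 are hypothetical, and the sketch assumes a q4_0-style symmetric encoding (4-bit values stored with an offset of 8, one float scale per block, no zero points).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified layouts (NOT the interleaved layout used by ime.cpp).
struct ref_block_i8 {
    float               scale;   // per-block activation scale
    std::vector<int8_t> quants;  // blk_len int8 values
};

struct ref_block_i4 {
    float                scale;   // per-block weight scale
    std::vector<uint8_t> quants;  // blk_len / 2 bytes, two 4-bit values per byte
};

// Dot product of one quantized activation row with one quantized weight column.
// Each block is accumulated in int32 and then scaled by (a.scale * b.scale).
inline float ref_dot_i8i4(const std::vector<ref_block_i8> & a,
                          const std::vector<ref_block_i4> & b,
                          size_t                            blk_len) {
    assert(a.size() == b.size());
    float acc = 0.0f;
    for (size_t blk = 0; blk < a.size(); ++blk) {
        assert(a[blk].quants.size() == blk_len);
        assert(b[blk].quants.size() == blk_len / 2);
        int32_t isum = 0;
        for (size_t i = 0; i < blk_len / 2; ++i) {
            // Assumed q4_0-style packing: the low nibble holds element i and the
            // high nibble holds element i + blk_len / 2, both offset by 8.
            const int lo = (b[blk].quants[i] & 0x0F) - 8;
            const int hi = (b[blk].quants[i] >> 4) - 8;
            isum += lo * a[blk].quants[i];
            isum += hi * a[blk].quants[i + blk_len / 2];
        }
        acc += static_cast<float>(isum) * a[blk].scale * b[blk].scale;
    }
    return acc;
}
```

A hardware kernel would compute the same per-block integer sums with matrix instructions over many rows and columns at once; the scalar loop only illustrates where the scales enter the result.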

Usage

SpacemiT IME acceleration is activated automatically on supported RISC-V hardware when the build includes GGML_USE_CPU_RISCV64_SPACEMIT. The backend registers itself as an extra buffer type.
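
As a rough illustration of the build requirements, the sketch below records which of the macros named on this page are defined and reports the first one that is missing. The constants and the helpers ime_missing_requirement and ime_print_build_report are hypothetical names introduced for this example; per the description above, the real source fails compilation instead of reporting at run time.

```cpp
#include <cstdio>

// Macros named on this page; each constant records whether it was defined.
#if defined(__riscv_v)
constexpr bool kHasRvv = true;
#else
constexpr bool kHasRvv = false;
#endif

#if defined(__riscv_zfh)
constexpr bool kHasZfh = true;
#else
constexpr bool kHasZfh = false;
#endif

#if defined(RISCV64_SPACEMIT_IME1)
constexpr bool kHasIme1 = true;
#else
constexpr bool kHasIme1 = false;
#endif

// Hypothetical helper: name of the first unmet requirement, or nullptr if all are met.
constexpr const char * ime_missing_requirement() {
    return !kHasRvv  ? "__riscv_v"
         : !kHasZfh  ? "__riscv_zfh"
         : !kHasIme1 ? "RISCV64_SPACEMIT_IME1"
                     : nullptr;
}

// Hypothetical helper: print a one-line report about the current build.
inline void ime_print_build_report() {
    const char * missing = ime_missing_requirement();
    if (missing == nullptr) {
        std::printf("IME build requirements met\n");
    } else {
        std::printf("IME disabled: %s is not defined\n", missing);
    }
}
```

Compiling this on a non-RISC-V host simply reports __riscv_v as missing, which makes it a convenient way to check a toolchain configuration before building the backend.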

Code Reference

Source Location

GGML repo, file: src/ggml-cpu/spacemit/ime.cpp (1025 lines).

Signature

// Backend buffer type registration
ggml_backend_buffer_type_t ggml_backend_cpu_riscv64_spacemit_buffer_type(void);

// Internal GEMM function
static void sqnbitgemm_spacemit_ime_i8i4(
    const size_t blk_len,
    const size_t gemm_k,
    const qnbitgemm_spacemit_ime_args * gemm_args,
    void * const per_gemm_ws,
    const size_t m_start, const size_t m_count,
    const size_t n_start, const size_t n_count);

Import

#include "spacemit/ime.h"

I/O Contract

Inputs

Parameter | Type | Required | Description
blk_len | size_t | Yes | Quantization block length (e.g., 32 for q4_0).
gemm_k | size_t | Yes | Inner dimension of the matrix multiplication.
gemm_args | const qnbitgemm_spacemit_ime_args * | Yes | Matrix pointers, strides, quantization parameters.
per_gemm_ws | void * | Yes | Per-GEMM workspace buffer for quantized activations.
m_start, m_count, n_start, n_count | size_t | Yes | Tile coordinates for the current thread's work partition.
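
The tile coordinates can be pictured with a small partitioning sketch. The helper below splits a dimension into contiguous, near-equal ranges, one per thread. It is an assumption made for illustration only: the names tile_range and split_range are hypothetical, and this page does not describe how ime.cpp actually assigns tiles to threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical (start, count) pair, e.g. (n_start, n_count) for one thread.
struct tile_range {
    size_t start;
    size_t count;
};

// Split `total` rows or columns into `nth` contiguous ranges and return the
// range owned by thread `ith`. Trailing threads may receive an empty range.
inline tile_range split_range(size_t total, size_t ith, size_t nth) {
    assert(nth > 0 && ith < nth);
    const size_t per_thread = (total + nth - 1) / nth;  // ceiling division
    const size_t start      = std::min(ith * per_thread, total);
    const size_t end        = std::min(start + per_thread, total);
    return { start, end - start };
}
```

Under this assumption, a caller could keep m_start = 0 and m_count = M and pass each thread's split_range(N, ith, nth) as n_start and n_count, so that every output column is written by exactly one thread.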

Outputs

Output | Type | Description
gemm_args->c_ptr | float * | Matrix multiplication result in f32 format.
Buffer type | ggml_backend_buffer_type_t | SpacemiT buffer type, or NULL if hardware is unsupported.

Usage Examples

Automatic SpacemiT IME Activation

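#include "ggml-backend.h"
#include "ggml-cpu.h"

int main(void) {
    // SpacemiT IME is enabled automatically when ggml is built with
    // GGML_USE_CPU_RISCV64_SPACEMIT and runs on SpacemiT K1/X60 hardware.
    // The CPU backend registers the SpacemiT buffer type as an extra buffer type.
    ggml_backend_t cpu = ggml_backend_cpu_init();
    if (cpu == NULL) {
        return 1;  // the CPU backend could not be initialized
    }

    // 4-bit quantized weights (e.g. q4_0) allocated through the SpacemiT buffer
    // use the IME-accelerated int8-by-int4 matrix multiplication; activations
    // are quantized to int8 at run time.
    // ... build and compute a graph here ...

    ggml_backend_free(cpu);
    return 0;
}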
