Implementation:Ggml org Ggml Cann backend api

Metadata

Field	Value
Page Type	Implementation (API Header)
Knowledge Sources	GGML
Domains	ML_Infrastructure, Tensor_Computing, NPU_Computing
Last Updated	2026-02-10 12:00 GMT

Overview

Public header declaring the CANN (Compute Architecture for Neural Networks) backend interface for running tensor operations on Huawei Ascend NPU devices.

Description

ggml-cann.h (123 lines) provides the complete public API for the CANN backend. It supports up to 16 Ascend NPU devices (GGML_CANN_MAX_DEVICES) and exposes:

Backend lifecycle: ggml_backend_cann_init(device) initializes a backend for a specific device; ggml_backend_is_cann() identifies CANN backends; ggml_backend_cann_reg() returns the registration handle
Device enumeration: ggml_backend_cann_get_device_count() returns the number of available Ascend devices
Device information: ggml_backend_cann_get_device_description() retrieves the SoC name; ggml_backend_cann_get_device_memory() queries free and total HBM memory
Buffer types: ggml_backend_cann_buffer_type(device) returns the device buffer type; ggml_backend_cann_host_buffer_type() returns a pinned host buffer type for faster CPU-NPU transfers

All functions are marked with GGML_BACKEND_API and enclosed in extern "C" for C/C++ compatibility.

Usage

Include this header in application code to initialize CANN backends, query Ascend device capabilities, and manage device/host memory buffers.
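One common use of the device queries is choosing which NPU to initialize. The sketch below is a hypothetical helper, not part of ggml: it assumes the caller has already filled an array with the free-memory values reported by ggml_backend_cann_get_device_memory() for each device, and it returns the index with the most free bytes. The name cann_pick_device_most_free is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of ggml): given per-device free-memory
 * readings, return the index with the most free bytes, or -1 when there
 * are no devices. Ties resolve to the lowest index. */
static int32_t cann_pick_device_most_free(const size_t *free_bytes, int32_t n) {
    int32_t best = -1;
    for (int32_t i = 0; i < n; i++) {
        if (best < 0 || free_bytes[i] > free_bytes[best]) {
            best = i;
        }
    }
    return best;
}
```

In an application, the returned index would typically be passed to ggml_backend_cann_init(), with a negative result treated as "no Ascend device available".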

Code Reference

Source Location

GGML repo, file: include/ggml-cann.h, 123 lines.

Signature

#define GGML_CANN_MAX_DEVICES 16

GGML_BACKEND_API ggml_backend_reg_t ggml_backend_cann_reg(void);
GGML_BACKEND_API ggml_backend_t ggml_backend_cann_init(int32_t device);
GGML_BACKEND_API bool ggml_backend_is_cann(ggml_backend_t backend);
GGML_BACKEND_API ggml_backend_buffer_type_t ggml_backend_cann_buffer_type(int32_t device);
GGML_BACKEND_API int32_t ggml_backend_cann_get_device_count(void);
GGML_BACKEND_API ggml_backend_buffer_type_t ggml_backend_cann_host_buffer_type(void);
GGML_BACKEND_API void ggml_backend_cann_get_device_description(
    int32_t device, char * description, size_t description_size);
GGML_BACKEND_API void ggml_backend_cann_get_device_memory(
    int32_t device, size_t * free, size_t * total);

Import

#include "ggml-cann.h"

Dependencies

ggml.h -- core GGML types
ggml-backend.h -- backend abstraction types

I/O Contract

Inputs

Parameter	Type	Required	Description
`device`	`int32_t`	Yes	Ascend NPU device index, from 0 to one less than the value returned by `ggml_backend_cann_get_device_count()` (never more than `GGML_CANN_MAX_DEVICES - 1`).
`backend`	`ggml_backend_t`	Yes (for is_cann)	Backend instance to verify.
`description`	`char *`	Yes (for get_description)	Buffer to receive the device SoC name string.
`description_size`	`size_t`	Yes (for get_description)	Size of the description buffer.
`free`	`size_t *`	Yes (for get_memory)	Pointer to store free HBM memory in bytes.
`total`	`size_t *`	Yes (for get_memory)	Pointer to store total HBM memory in bytes.
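Because every device-specific call takes a raw index, it can help to validate the index before passing it on. The following is a minimal sketch of such a check; cann_device_index_valid is a hypothetical helper, not a ggml function, and the device_count argument is assumed to come from ggml_backend_cann_get_device_count(). The macro is redefined locally only so the sketch compiles without the header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the value declared in ggml-cann.h; defined here only so this
 * sketch compiles on its own. */
#ifndef GGML_CANN_MAX_DEVICES
#define GGML_CANN_MAX_DEVICES 16
#endif

/* Hypothetical helper (not part of ggml): true when `device` is a usable
 * index given the device count reported at runtime. The count is capped at
 * GGML_CANN_MAX_DEVICES before the range check. */
static bool cann_device_index_valid(int32_t device, int32_t device_count) {
    if (device_count > GGML_CANN_MAX_DEVICES) {
        device_count = GGML_CANN_MAX_DEVICES;
    }
    return device >= 0 && device < device_count;
}
```

A caller might run this check first and skip ggml_backend_cann_init(), or report an error, when it returns false.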

Outputs

Output	Type	Description
Backend handle	`ggml_backend_t`	Initialized CANN backend for the specified device, or a null pointer (`NULL` in C, `nullptr` in C++) on failure.
Device count	`int32_t`	Number of available Ascend NPU devices.
Buffer type	`ggml_backend_buffer_type_t`	Device or host buffer type interface.
Is CANN	`bool`	True if the provided backend is a CANN backend.

Usage Examples

Querying and Initializing CANN Devices

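#include <stdio.h>
#include "ggml-cann.h"

int main(void) {
    // Enumerate the Ascend devices visible to the CANN backend.
    int32_t n = ggml_backend_cann_get_device_count();
    for (int32_t i = 0; i < n; i++) {
        char desc[256];
        ggml_backend_cann_get_device_description(i, desc, sizeof(desc));

        // Named free_mem rather than free to avoid shadowing free() from <stdlib.h>.
        size_t free_mem = 0, total_mem = 0;
        ggml_backend_cann_get_device_memory(i, &free_mem, &total_mem);

        printf("Device %d: %s (%.1f GB free / %.1f GB total)\n",
               (int) i, desc, free_mem / 1e9, total_mem / 1e9);
    }

    // Initialize a backend on device 0 and release it when done.
    ggml_backend_t cann = ggml_backend_cann_init(0);
    if (cann == NULL) {
        fprintf(stderr, "failed to initialize CANN backend on device 0\n");
        return 1;
    }
    ggml_backend_free(cann);
    return 0;
}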
