Implementation: ggml-org/ggml CANN ACLNN Ops API
Metadata
| Field | Value |
|---|---|
| Page Type | Implementation (API Header) |
| Knowledge Sources | GGML |
| Domains | ML_Infrastructure, Tensor_Computing, NPU_Computing |
| Last Updated | 2026-02-10 12:00 GMT |
Overview
Header declaring all ACLNN operation wrapper functions for the CANN backend, serving as the complete API surface for Ascend NPU-accelerated tensor operations.
Description
aclnn_ops.h (1,119 lines) declares the ggml_cann_* function prototypes for every tensor operation supported by the CANN backend. Each function is documented with Doxygen comments explaining the mathematical operation, parameter semantics, and constraints.
The declared operations include:
- Element-wise: abs, neg, sign, sqrt, exp, log, sin, cos, sigmoid, silu, relu, tanh, gelu, hardsigmoid, hardswish, leaky_relu, elu
- Arithmetic: add, sub, mul, div, scale, sqr, pow
- Matrix operations: mul_mat, out_prod (via ACLNN batch_matmul, mm, mv, grouped_matmul)
- Normalization: norm, rms_norm, group_norm, layer_norm
- Attention: flash_attn_ext (mapped to fused_infer_attention_score)
- Tensor manipulation: repeat, concat, permute, get_rows, dup, cpy, cont, pad, im2col
- Pooling: pool_2d (average and max)
- Other: arange, argsort, clamp, softmax, log_softmax, rope, timestep_embedding, conv_transpose_1d, upscale, sum_rows
All functions follow a uniform signature pattern: they take a ggml_backend_cann_context & reference and a ggml_tensor * destination tensor. Source tensors and operation parameters are read from the destination tensor's src[] array and op_params field.
The header also includes helper function declarations like ggml_cann_need_bcast() and the BCAST_SHAPE macro for broadcast dimension computation.
Usage
Include this header when implementing new CANN operations or when extending the CANN backend's operation dispatch table. It serves as the reference for which GGML operations are available on Ascend NPUs.
Code Reference
Source Location
GGML repo, file: src/ggml-cann/aclnn_ops.h, 1119 lines.
Signature
// Representative subset of declared functions
void ggml_cann_repeat(ggml_backend_cann_context & ctx, ggml_tensor * dst);
void ggml_cann_leaky_relu(ggml_backend_cann_context & ctx, ggml_tensor * dst);
void ggml_cann_concat(ggml_backend_cann_context & ctx, ggml_tensor * dst);
void ggml_cann_arange(ggml_backend_cann_context & ctx, ggml_tensor * dst);
void ggml_cann_clamp(ggml_backend_cann_context & ctx, ggml_tensor * dst);
void ggml_cann_scale(ggml_backend_cann_context & ctx, ggml_tensor * dst);
void ggml_cann_argsort(ggml_backend_cann_context & ctx, ggml_tensor * dst);
void ggml_cann_norm(ggml_backend_cann_context & ctx, ggml_tensor * dst);
void ggml_cann_mul_mat(ggml_backend_cann_context & ctx, ggml_tensor * dst);
void ggml_cann_softmax(ggml_backend_cann_context & ctx, ggml_tensor * dst);
void ggml_cann_rope(ggml_backend_cann_context & ctx, ggml_tensor * dst);
void ggml_cann_flash_attn_ext(ggml_backend_cann_context & ctx, ggml_tensor * dst);
Import
#include "aclnn_ops.h"
Dependencies
- acl_tensor.h -- ACL tensor smart pointers and creation utilities
- common.h -- ggml_backend_cann_context and CANN infrastructure
- ~25 aclnnop/*.h headers for element-wise and tensor manipulation operators
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| ctx | ggml_backend_cann_context & | Yes | CANN backend context with device state, stream, and memory pool. |
| dst | ggml_tensor * | Yes | Destination tensor carrying source references in dst->src[] and operation parameters in dst->op_params. |
Outputs
| Output | Type | Description |
|---|---|---|
| dst->data | device memory | Operation result written to the destination tensor's device buffer. |
Usage Examples
Including the Header in a CANN Source File
#include "aclnn_ops.h"
#include "common.h"
// In the backend dispatch:
void dispatch_op(ggml_backend_cann_context & ctx, ggml_tensor * node) {
switch (node->op) {
case GGML_OP_REPEAT:
ggml_cann_repeat(ctx, node);
break;
case GGML_OP_LEAKY_RELU:
ggml_cann_leaky_relu(ctx, node);
break;
case GGML_OP_CONCAT:
ggml_cann_concat(ctx, node);
break;
// ...
}
}