Heuristic:Alibaba MNN Backend Selection Guide

Knowledge Sources	Alibaba MNN MNNForwardType.h llm.cpp
Domains	Optimization, Inference, Deployment
Last Updated	2026-02-10 12:00 GMT

Overview

Decision framework for selecting the optimal compute backend (CPU, OpenCL, Metal, CUDA, Vulkan, NPU) based on target platform and workload.

Description

MNN supports multiple hardware backends. The right choice depends on the target platform (Android, iOS, Linux, Windows), the available hardware (GPU vendor, NPU), and model characteristics. MNN_FORWARD_AUTO selects the first available non-CPU backend and falls back to CPU if none is found.
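The auto-selection behavior described above can be sketched as follows. This is a minimal illustration, not MNN's implementation: the `Backend` enum is a local stand-in for a few `MNNForwardType` values, `resolve_auto` is a hypothetical helper, and the probing order is simply whatever order the caller passes in (this page does not state the order MNN uses).

```cpp
#include <vector>

// Local stand-in for a few MNNForwardType values, so the sketch compiles without MNN.
enum class Backend { Cpu, Metal, Cuda, OpenCL };

// Hypothetical helper: `available` lists the backends usable on this device,
// in the order they would be probed (the order is an assumption of this sketch).
inline Backend resolve_auto(const std::vector<Backend>& available) {
    for (Backend b : available) {
        if (b != Backend::Cpu) {
            return b;  // the first non-CPU backend wins
        }
    }
    return Backend::Cpu;  // nothing else is usable: fall back to CPU
}
```

With `{Cpu, OpenCL, Cuda}` as input the helper returns `OpenCL`; with only `Cpu`, or with an empty list, it returns `Cpu`.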

Usage

Apply this heuristic when deploying a model to a new platform or when tuning inference latency.

The Insight (Rule of Thumb)

Android: Use MNN_FORWARD_OPENCL for GPU (Qualcomm/Mali). Use MNN_FORWARD_NN for NNAPI/QNN NPU on Snapdragon.
iOS/macOS: Use MNN_FORWARD_METAL for GPU. Use MNN_FORWARD_NN for CoreML/ANE.
Linux desktop: Use MNN_FORWARD_CUDA for NVIDIA GPU. Use MNN_FORWARD_OPENCL for AMD/Intel GPU.
Windows: Use MNN_FORWARD_CUDA for NVIDIA GPU. Note that the MNN_SEP_BUILD build option is forced OFF on Windows.
Auto mode: MNN_FORWARD_AUTO selects the first available non-CPU backend and falls back to CPU if none is available.
Trade-off: GPU backends pay an initialization overhead (kernel compilation, tuning) in exchange for faster sustained inference. CPU is always available and needs no kernel compilation or tuning step.
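The trade-off in the last rule can be made concrete with a break-even estimate. The sketch below assumes a simple cost model that is not taken from MNN: a GPU backend costs a one-time initialization plus a per-run time, and CPU costs only a per-run time. `break_even_runs` is a hypothetical helper, and all timings are inputs you would measure yourself.

```cpp
#include <cmath>
#include <limits>

// Assumed cost model (illustrative only):
//   GPU total = gpu_init_ms + n * gpu_run_ms
//   CPU total = n * cpu_run_ms   (CPU initialization treated as negligible)
// Returns the run count n at which both totals are equal. The GPU backend is
// faster overall for any n above this value. Returns infinity when the GPU
// is not faster per run, because it then never catches up.
inline double break_even_runs(double gpu_init_ms, double gpu_run_ms, double cpu_run_ms) {
    if (gpu_run_ms >= cpu_run_ms) {
        return std::numeric_limits<double>::infinity();
    }
    return gpu_init_ms / (cpu_run_ms - gpu_run_ms);
}
```

For example, with hypothetical inputs of 900 ms initialization, 10 ms per GPU run, and 40 ms per CPU run, the break-even point is 30 runs. A one-shot inference would favor CPU, while a long-running session would favor the GPU backend.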

Reasoning

Different platforms have different GPU APIs available. Metal is Apple-only. CUDA is NVIDIA-only. OpenCL is the most portable GPU API but may have driver quirks. NPU backends (QNN, CoreML, NNAPI) offer the best power efficiency on mobile.
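The platform rules from the rule of thumb can be written as one small decision function. This is a sketch under stated assumptions: `Platform`, `Accelerator`, and `pick_backend` are hypothetical names, the returned strings are the keys accepted by the `backend_type_convert` excerpt on this page, and the CPU fallback for cases the page does not cover (for example a non-NVIDIA GPU on Windows, or an NPU on a desktop OS) is this sketch's own choice.

```cpp
#include <string>

enum class Platform { Android, Apple, Linux, Windows };
enum class Accelerator { None, NvidiaGpu, OtherGpu, Npu };

// Returns a backend key ("cpu", "metal", "cuda", "opencl", "npu").
inline std::string pick_backend(Platform p, Accelerator a) {
    if (a == Accelerator::None) {
        return "cpu";
    }
    if (a == Accelerator::Npu) {
        // NPU rules are given only for Android and iOS/macOS.
        return (p == Platform::Android || p == Platform::Apple) ? "npu" : "cpu";
    }
    switch (p) {
        case Platform::Android: return "opencl";  // Qualcomm/Mali GPU
        case Platform::Apple:   return "metal";
        case Platform::Linux:   return a == Accelerator::NvidiaGpu ? "cuda" : "opencl";
        case Platform::Windows: return a == Accelerator::NvidiaGpu ? "cuda" : "cpu";  // non-NVIDIA case: assumption
    }
    return "cpu";  // unreachable for valid enum values; keeps compilers quiet
}
```

Returning string keys rather than enum values keeps the sketch independent of the MNN headers; the result can be handed to a converter such as `backend_type_convert`.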

Code Evidence

Backend enum definitions from `MNNForwardType.h:14-59`:

MNN_FORWARD_CPU = 0,
MNN_FORWARD_OPENCL = 3,
MNN_FORWARD_METAL = 1,
MNN_FORWARD_CUDA = 2,
MNN_FORWARD_VULKAN = 7,
MNN_FORWARD_NN = 5,
MNN_FORWARD_AUTO = 4,

Backend type conversion from `llm.cpp:49-64`:

static MNNForwardType backend_type_convert(const std::string& type_str) {
    if (type_str == "cpu") return MNN_FORWARD_CPU;
    if (type_str == "metal") return MNN_FORWARD_METAL;
    if (type_str == "cuda") return MNN_FORWARD_CUDA;
    if (type_str == "opencl") return MNN_FORWARD_OPENCL;
    if (type_str == "vulkan") return MNN_FORWARD_VULKAN;
    if (type_str == "npu") return MNN_FORWARD_NN;
    return MNN_FORWARD_AUTO;
}
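One consequence of the converter above is that any unrecognized string, including a typo or a differently cased name such as "OpenCL", silently maps to MNN_FORWARD_AUTO. If that is not the intended behavior, the key can be checked before conversion. The helper below is a hypothetical sketch, not part of MNN; its key list is copied from the excerpt above.

```cpp
#include <algorithm>
#include <array>
#include <string>

// Hypothetical helper: true only for the exact, lowercase keys that the
// converter excerpt recognizes. Anything else would fall through to AUTO.
inline bool is_known_backend_key(const std::string& key) {
    static const std::array<const char*, 6> kKeys = {
        "cpu", "metal", "cuda", "opencl", "vulkan", "npu"};
    return std::any_of(kKeys.begin(), kKeys.end(),
                       [&key](const char* k) { return key == k; });
}
```

A caller could log a warning when `is_known_backend_key` returns false, so that a misspelled backend name in a config file is noticed instead of silently selecting auto mode.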
