Heuristic:Alibaba MNN Backend Selection Guide
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Inference, Deployment |
| Last Updated | 2026-02-10 12:00 GMT |
Overview
Decision framework for selecting the optimal compute backend (CPU, OpenCL, Metal, CUDA, Vulkan, NPU) based on target platform and workload.
Description
MNN supports multiple hardware backends. The right choice depends on the target platform (Android/iOS/Linux/Windows), the available hardware (GPU type, NPU), and model characteristics. MNN_FORWARD_AUTO selects the first available non-CPU backend and falls back to CPU when none is available.
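The fallback behavior of MNN_FORWARD_AUTO can be modeled in a few lines. This is only an illustrative sketch, not MNN's implementation: the enum is a local copy of a few `MNNForwardType` values so the code compiles without MNN headers, and `resolve_auto` and its `available` parameter are hypothetical names. The order of `available` stands in for whatever probing order the runtime uses, which this page does not specify.

```cpp
#include <vector>

// Local copy of a subset of MNNForwardType (assumption: values as listed
// in the Code Evidence section of this page).
enum ForwardType {
    MNN_FORWARD_CPU = 0,
    MNN_FORWARD_METAL = 1,
    MNN_FORWARD_CUDA = 2,
    MNN_FORWARD_OPENCL = 3,
    MNN_FORWARD_AUTO = 4,
};

// Hypothetical helper: return the first available non-CPU backend,
// or CPU when the list contains nothing else.
ForwardType resolve_auto(const std::vector<ForwardType>& available) {
    for (ForwardType type : available) {
        if (type != MNN_FORWARD_CPU && type != MNN_FORWARD_AUTO) {
            return type;
        }
    }
    return MNN_FORWARD_CPU;
}
```

Under this model, a device that reports only CPU support resolves to MNN_FORWARD_CPU, while one that reports CPU and OpenCL resolves to MNN_FORWARD_OPENCL.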
Usage
Use this heuristic when deploying a model to a new platform or when tuning inference latency.
The Insight (Rule of Thumb)
- Android: Use MNN_FORWARD_OPENCL for GPU (Qualcomm/Mali). Use MNN_FORWARD_NN for NNAPI/QNN NPU on Snapdragon.
- iOS/macOS: Use MNN_FORWARD_METAL for GPU. Use MNN_FORWARD_NN for CoreML/ANE.
- Linux desktop: Use MNN_FORWARD_CUDA for NVIDIA GPU. Use MNN_FORWARD_OPENCL for AMD/Intel GPU.
- Windows: Use MNN_FORWARD_CUDA for NVIDIA GPU. The MNN_SEP_BUILD build option is forced OFF on Windows, so backends are compiled into the main MNN library rather than built as separate libraries.
- Auto mode: MNN_FORWARD_AUTO selects the first available non-CPU backend and falls back to CPU if none is found.
- Trade-off: GPU backends incur initialization overhead (kernel compilation, tuning) but deliver faster sustained inference. CPU is always available and has negligible initialization cost.
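The trade-off in the last bullet can be made concrete with a break-even estimate. If a GPU backend costs a one-off initialization time plus a per-run time, and the CPU costs only a (larger) per-run time, the GPU wins once the number of runs exceeds init / (cpu_run - gpu_run). The helper below is a hypothetical sketch; its name and parameters are assumptions, and it treats CPU initialization as negligible.

```cpp
#include <cmath>

// Hypothetical helper: smallest number of inference runs after which a GPU
// backend's total time (one-off init + per-run cost) drops below the CPU's.
// Returns -1 when the GPU is not faster per run and so never catches up.
// CPU initialization is treated as negligible, which is a simplification.
long break_even_runs(double gpu_init_ms, double gpu_run_ms, double cpu_run_ms) {
    if (gpu_run_ms >= cpu_run_ms) {
        return -1;
    }
    const double runs = gpu_init_ms / (cpu_run_ms - gpu_run_ms);
    // Strictly more than `runs` inferences are needed for the GPU to win.
    return static_cast<long>(std::floor(runs)) + 1;
}
```

With made-up figures of 2000 ms initialization, 10 ms per GPU run, and 30 ms per CPU run, the GPU backend pays off from the 101st inference onward. For a workload that runs only a handful of inferences per process, the CPU may be the faster choice overall; real timings should come from profiling on the target device.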
Reasoning
Different platforms expose different GPU APIs. Metal is Apple-only. CUDA is NVIDIA-only. OpenCL is the most portable GPU API, but its behavior varies with vendor drivers. NPU backends (QNN, CoreML, NNAPI) typically offer the best power efficiency on mobile.
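The platform rules can be encoded as an ordered candidate list per platform, ending with the CPU fallback. The sketch below is one possible encoding, not code from MNN: `Platform` and `preferred_backends` are hypothetical names, and the returned strings are the keys accepted by `backend_type_convert` in the `llm.cpp` excerpt on this page. The order within each list simply follows the rule-of-thumb bullets and is not a measured performance ranking.

```cpp
#include <string>
#include <vector>

// Hypothetical platform tag; a real build would more likely rely on
// preprocessor checks such as __ANDROID__ or __APPLE__.
enum class Platform { Android, Apple, Linux, Windows };

// Candidate backend names to try in order. Every list ends with "cpu",
// which is always available.
std::vector<std::string> preferred_backends(Platform platform) {
    switch (platform) {
        case Platform::Android:
            return {"opencl", "npu", "cpu"};  // GPU via OpenCL, then NPU
        case Platform::Apple:
            return {"metal", "npu", "cpu"};   // GPU via Metal, then CoreML/ANE
        case Platform::Linux:
            return {"cuda", "opencl", "cpu"}; // NVIDIA first, then AMD/Intel
        case Platform::Windows:
            return {"cuda", "cpu"};           // NVIDIA only, per the rules
    }
    return {"cpu"};
}
```

A caller could walk the list, attempt to create a session with each backend, and keep the first one that initializes successfully.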
Code Evidence
Backend enum definitions from `MNNForwardType.h:14-59`:
    MNN_FORWARD_CPU = 0,
    MNN_FORWARD_OPENCL = 3,
    MNN_FORWARD_METAL = 1,
    MNN_FORWARD_CUDA = 2,
    MNN_FORWARD_VULKAN = 7,
    MNN_FORWARD_NN = 5,
    MNN_FORWARD_AUTO = 4,
Backend type conversion from `llm.cpp:49-64`:
    static MNNForwardType backend_type_convert(const std::string& type_str) {
        if (type_str == "cpu") return MNN_FORWARD_CPU;
        if (type_str == "metal") return MNN_FORWARD_METAL;
        if (type_str == "cuda") return MNN_FORWARD_CUDA;
        if (type_str == "opencl") return MNN_FORWARD_OPENCL;
        if (type_str == "vulkan") return MNN_FORWARD_VULKAN;
        if (type_str == "npu") return MNN_FORWARD_NN;
        return MNN_FORWARD_AUTO;
    }
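The comparisons in `backend_type_convert` are exact string matches, so a value such as "CUDA" or " opencl" would fall through to MNN_FORWARD_AUTO without any error. A caller that reads the backend name from user configuration may want to normalize it first. The helper below is a hypothetical sketch and is not part of MNN; the name `normalize_backend_name` is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not part of MNN): trim surrounding whitespace and
// lowercase the name so it matches the exact comparisons shown above.
std::string normalize_backend_name(const std::string& raw) {
    const char* whitespace = " \t\r\n";
    const std::size_t first = raw.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return "";  // empty or whitespace-only input
    }
    const std::size_t last = raw.find_last_not_of(whitespace);
    std::string name = raw.substr(first, last - first + 1);
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return name;
}
```

Passing the normalized string to `backend_type_convert` keeps the silent fallback to MNN_FORWARD_AUTO for genuinely unknown names while avoiding accidental fallbacks caused by casing or stray whitespace.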