Environment:Alibaba MNN GPU OpenCL Environment
| Field | Value |
|---|---|
| environment_name | GPU_OpenCL_Environment |
| environment_type | GPU Acceleration |
| repository | Alibaba_MNN |
| platform | Android, Linux |
| source_file | CMakeLists.txt (L39, L256), include/MNN/MNNForwardType.h (L61-79) |
| last_updated | 2026-02-10 14:00 GMT |
Overview
OpenCL GPU acceleration environment for MNN on Android and Linux devices. This environment enables GPU-accelerated inference using the OpenCL API, supporting a wide range of mobile and desktop GPUs including Qualcomm Adreno, ARM Mali, Intel, and AMD. It provides fine-grained control over GPU tuning levels, memory modes (buffer vs. image), and kernel recording strategies.
Description
The OpenCL backend (MNN_FORWARD_OPENCL) is MNN's most portable GPU acceleration path. It supports two memory modes (buffer and image), five tuning levels, and kernel recording modes optimized for Qualcomm GPUs. OpenCL headers are bundled in the MNN source tree at 3rd_party/OpenCLHeaders/, so no external OpenCL SDK installation is required for compilation. At runtime, OpenCL drivers must be present on the target device.
The backend can load the OpenCL library either via system linking (MNN_USE_SYSTEM_LIB=ON) or via dynamic loading (dlopen) at runtime, which is the default behavior.
Usage
Use this environment when deploying MNN models on Android devices with GPU support (Qualcomm Adreno, ARM Mali) or on Linux desktops with AMD or Intel GPUs that provide OpenCL drivers. This is the recommended GPU backend for Android deployments.
System Requirements
- Operating System: Android (primary), Linux (desktop GPU)
- GPU: OpenCL-capable GPU with OpenCL 1.2 or later support
  - Qualcomm Adreno (Android)
  - ARM Mali (Android)
  - AMD Radeon (Linux)
  - Intel HD/UHD/Iris (Linux)
- OpenCL Drivers: Must be installed on the target device (typically bundled with GPU vendor drivers)
- CMake: Version 3.6 or later
- Android NDK: Required for Android cross-compilation
Dependencies
| Dependency | Required | Notes |
|---|---|---|
| OpenCL headers | Yes | Bundled in 3rd_party/OpenCLHeaders/; no external SDK needed for compilation |
| OpenCL runtime library | Yes | Must be present on the target device at runtime; loaded via dlopen by default |
| CMake | Yes | Build system; must be 3.6 or later |
| Android NDK | Conditional | Required only for Android cross-compilation targets |
Credentials
No credentials, API keys, or tokens are required for this environment. All software is locally installed or bundled.
Quick Install
# 1. Clone MNN repository
git clone https://github.com/alibaba/MNN.git
cd MNN
# The three builds below are alternatives; each one starts from the repository root.
# --- Linux Desktop Build ---
mkdir build && cd build
cmake .. \
  -DMNN_OPENCL=ON \
  -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
cd ..  # return to the repository root before starting another build
# --- Android arm64-v8a Cross-Compilation ---
mkdir build_android && cd build_android
cmake .. \
  -DCMAKE_TOOLCHAIN_FILE="$ANDROID_NDK/build/cmake/android.toolchain.cmake" \
  -DCMAKE_BUILD_TYPE=Release \
  -DANDROID_ABI="arm64-v8a" \
  -DANDROID_STL=c++_static \
  -DANDROID_NATIVE_API_LEVEL=android-21 \
  -DANDROID_TOOLCHAIN=clang \
  -DMNN_OPENCL=ON \
  -DMNN_BUILD_LLM=ON \
  -DMNN_LOW_MEMORY=ON \
  -DMNN_SEP_BUILD=OFF
make -j$(nproc)
cd ..  # return to the repository root before starting another build
# --- Linux Desktop Build with System Lib ---
mkdir build_syslib && cd build_syslib
cmake .. \
  -DMNN_OPENCL=ON \
  -DMNN_USE_SYSTEM_LIB=ON \
  -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
Code Evidence
CMakeLists.txt (Line 256): OpenCL option
option(MNN_OPENCL "Enable OpenCL" OFF)
OpenCL is disabled by default and must be explicitly enabled with -DMNN_OPENCL=ON.
CMakeLists.txt (Line 39): System library option
option(MNN_USE_SYSTEM_LIB "For opencl and vulkan, use system lib or use dlopen" OFF)
By default, MNN dynamically loads the OpenCL library at runtime via dlopen. Setting MNN_USE_SYSTEM_LIB=ON links against the system-installed OpenCL library instead.
CMakeLists.txt (Lines 448, 461): Bundled OpenCL headers
${CMAKE_CURRENT_LIST_DIR}/3rd_party/OpenCLHeaders/
OpenCL headers are included in the MNN source tree, eliminating the need for an external OpenCL SDK at build time.
CMakeLists.txt (Lines 715-720): OpenCL backend activation
IF(MNN_OPENCL)
...
add_definitions(-DMNN_OPENCL_ENABLED=1)
When OpenCL is enabled, the MNN_OPENCL_ENABLED preprocessor definition is set.
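As a rough illustration of what such a definition allows, code compiled with the flag can branch at compile time. The following is a minimal C++ sketch, not MNN code: the helper name compiledGpuBackend is hypothetical, and it assumes the macro is visible to the translation unit. add_definitions applies to targets in MNN's own build, so application code linking against MNN may not see it.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: reports which code path was compiled in.
// Assumes MNN_OPENCL_ENABLED reaches this translation unit, for example
// through the add_definitions call quoted above.
inline const char* compiledGpuBackend() {
#ifdef MNN_OPENCL_ENABLED
    return "opencl";
#else
    return "none";
#endif
}

// Usage: exactly one of the two branches is compiled in.
inline void exampleUsage() {
    const char* backend = compiledGpuBackend();
    assert(std::strcmp(backend, "opencl") == 0 ||
           std::strcmp(backend, "none") == 0);
}
```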
MNNForwardType.h (Lines 61-78): GPU tuning and memory modes
typedef enum {
MNN_GPU_TUNING_NONE = 1 << 0, /* Forbidden tuning, performance not good.(OpenCL/Vulkan) */
MNN_GPU_TUNING_HEAVY = 1 << 1, /* Heavily tuning, usually not suggested.(OpenCL/Vulkan) */
MNN_GPU_TUNING_WIDE = 1 << 2, /* Widely tuning, performance good. Default.(OpenCL/Vulkan) */
MNN_GPU_TUNING_NORMAL = 1 << 3, /* Normal tuning, performance may be ok.(OpenCL) */
MNN_GPU_TUNING_FAST = 1 << 4, /* Fast tuning, performance may not good.(OpenCL) */
MNN_GPU_MEMORY_BUFFER = 1 << 6, /* OpenCL_MEMORY_BUFFER */
MNN_GPU_MEMORY_IMAGE = 1 << 7, /* OpenCL_MEMORY_IMAGE */
MNN_GPU_RECORD_OP = 1 << 8, /* The kernels in one op execution record into one recording.(OpenCL) */
MNN_GPU_RECORD_BATCH = 1 << 9, /* 10 kernels record into one recording.(OpenCL) */
} MNNGpuMode;
This enum defines the following configuration axes for the OpenCL backend:
GPU Tuning Levels (five levels):
- MNN_GPU_TUNING_NONE(1) -- No tuning; fastest init, worst runtime performance
- MNN_GPU_TUNING_HEAVY(2) -- Exhaustive tuning; long init, best runtime performance (the header comment marks it as usually not suggested)
- MNN_GPU_TUNING_WIDE(4) -- Wide tuning; good balance (default)
- MNN_GPU_TUNING_NORMAL(8) -- Normal tuning; moderate init time (OpenCL only)
- MNN_GPU_TUNING_FAST(16) -- Fast tuning; quick init, variable runtime (OpenCL only)
Memory Modes (two modes):
- MNN_GPU_MEMORY_BUFFER(64) -- Use OpenCL buffer objects for tensor storage
- MNN_GPU_MEMORY_IMAGE(128) -- Use OpenCL image objects for tensor storage (may be faster on some GPUs)
Record Modes (two modes, Qualcomm-optimized):
- MNN_GPU_RECORD_OP(256) -- Record kernels per-op into one recording
- MNN_GPU_RECORD_BATCH(512) -- Record 10 kernels into one recording
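Because each enumerator occupies its own bit, one value from each axis can be combined with a bitwise OR into a single integer mask. The sketch below shows that composition and a decoder for it. The enumerator values are copied from the enum quoted above into a local sketch namespace so the example compiles without MNN, and describeGpuMode is a hypothetical helper, not part of MNN. In a real application the mask is, to the best of my understanding, passed through the mode field of MNN's ScheduleConfig; that step is not shown because it requires the MNN headers.

```cpp
#include <cassert>
#include <string>

// Values mirrored from the MNNGpuMode enum quoted above. A real application
// would include MNN/MNNForwardType.h instead of redefining them.
namespace sketch {
enum GpuMode : int {
    TUNING_NONE   = 1 << 0,
    TUNING_HEAVY  = 1 << 1,
    TUNING_WIDE   = 1 << 2,
    TUNING_NORMAL = 1 << 3,
    TUNING_FAST   = 1 << 4,
    MEMORY_BUFFER = 1 << 6,
    MEMORY_IMAGE  = 1 << 7,
    RECORD_OP     = 1 << 8,
    RECORD_BATCH  = 1 << 9,
};

// Hypothetical helper: render a combined mode mask as readable text.
// If several flags of one axis are set, the first match below wins.
inline std::string describeGpuMode(int mode) {
    const char* tuning = "unset";
    if (mode & TUNING_NONE)        tuning = "NONE";
    else if (mode & TUNING_HEAVY)  tuning = "HEAVY";
    else if (mode & TUNING_WIDE)   tuning = "WIDE";
    else if (mode & TUNING_NORMAL) tuning = "NORMAL";
    else if (mode & TUNING_FAST)   tuning = "FAST";

    const char* memory = "unset";
    if (mode & MEMORY_BUFFER)     memory = "BUFFER";
    else if (mode & MEMORY_IMAGE) memory = "IMAGE";

    const char* record = "none";
    if (mode & RECORD_OP)         record = "OP";
    else if (mode & RECORD_BATCH) record = "BATCH";

    return std::string("tuning=") + tuning + " memory=" + memory +
           " record=" + record;
}

// Usage: wide tuning with buffer memory is 4 | 64 == 68.
inline void exampleUsage() {
    const int mode = TUNING_WIDE | MEMORY_BUFFER;
    assert(mode == 68);
    assert(describeGpuMode(mode) == "tuning=WIDE memory=BUFFER record=none");
}
}  // namespace sketch
```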
Common Errors
| Error | Cause | Resolution |
|---|---|---|
| OpenCL library not found, or dlopen failed for libOpenCL.so | OpenCL runtime not installed on the target device | Install GPU vendor drivers that include OpenCL support; on Android, this is typically bundled with the system image |
| CL_DEVICE_NOT_FOUND | No OpenCL-capable GPU detected by the driver | Verify that the GPU hardware supports OpenCL; check the driver installation with clinfo (Linux) |
| Slow first inference run | OpenCL kernel compilation and tuning occur on first execution | This is expected behavior. Later runs are faster, and tuning results can be reused across launches when a cache file is configured (MNN exposes Interpreter::setCacheFile for this). MNN_GPU_TUNING_WIDE (default) balances tuning time against runtime performance; MNN_GPU_TUNING_FAST or MNN_GPU_TUNING_NONE shorten initialization at the cost of runtime speed |
| CL_OUT_OF_RESOURCES | GPU memory exhausted or kernel too large for the device | Reduce model size, lower precision, or try MNN_GPU_MEMORY_BUFFER mode instead of image mode |
| Poor performance with record mode | Using MNN_GPU_RECORD_OP or MNN_GPU_RECORD_BATCH on a non-Qualcomm GPU | Record modes are effective only on Qualcomm GPUs; disable them on other hardware |
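The two OpenCL status codes in the table can be mapped to their suggested resolutions in application code. The sketch below is one hedged way to do that. The numeric values are those defined in the Khronos CL/cl.h header (CL_DEVICE_NOT_FOUND is -1, CL_OUT_OF_RESOURCES is -5), redeclared locally so the example builds without OpenCL headers; hintForClError is a hypothetical helper, and how an application obtains the status code from MNN is not covered here.

```cpp
#include <cassert>
#include <string>

// Numeric values as defined in the Khronos CL/cl.h header, redeclared here
// so the sketch does not depend on OpenCL headers being installed.
constexpr int kClDeviceNotFound = -1;  // CL_DEVICE_NOT_FOUND
constexpr int kClOutOfResources = -5;  // CL_OUT_OF_RESOURCES

// Hypothetical helper: translate a status code into the resolution text
// listed in the Common Errors table.
inline std::string hintForClError(int code) {
    switch (code) {
        case kClDeviceNotFound:
            return "No OpenCL-capable GPU detected; verify hardware support "
                   "and check the driver installation with clinfo (Linux).";
        case kClOutOfResources:
            return "GPU resources exhausted; reduce model size, lower "
                   "precision, or try MNN_GPU_MEMORY_BUFFER.";
        default:
            return "Unrecognized status code; consult the OpenCL "
                   "specification for its meaning.";
    }
}

// Usage: the hint for a missing device points at clinfo.
inline void exampleUsage() {
    const std::string hint = hintForClError(kClDeviceNotFound);
    assert(hint.find("clinfo") != std::string::npos);
}
```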
Compatibility Notes
- Qualcomm GPU optimizations: The MNN_GPU_RECORD_OP and MNN_GPU_RECORD_BATCH modes are effective only on Qualcomm Adreno GPUs. On other GPU vendors, these modes should not be used as they may degrade performance.
- Memory mode selection: Users should benchmark both MNN_GPU_MEMORY_BUFFER and MNN_GPU_MEMORY_IMAGE on their target device and select the better-performing mode. Performance varies by GPU vendor and model architecture.
- Tuning levels: MNN_GPU_TUNING_NORMAL and MNN_GPU_TUNING_FAST are OpenCL-specific and are not available for the Vulkan backend.
- Library loading: By default, MNN uses dlopen to load the OpenCL library at runtime. Set MNN_USE_SYSTEM_LIB=ON to link against the system OpenCL library at build time instead.
- Time profiling: GPU time profiling can be enabled with MNN_GPU_TIME_PROFILE=ON (CMakeLists.txt:274) for both OpenCL and Vulkan backends.
- Android: This is the recommended GPU backend for Android devices. Ensure MNN_SEP_BUILD=OFF for Android builds.
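The vendor and memory-mode notes above can be folded into a small selection routine. The sketch below is illustrative only: chooseGpuMode and isQualcomm are hypothetical helpers, the flag values are copied from the MNNGpuMode enum, and the vendor string is assumed to come from an OpenCL device query such as CL_DEVICE_VENDOR (Adreno drivers are assumed to report a string containing "QUALCOMM"). The preferImage argument stands in for the outcome of the user's own buffer-versus-image benchmark.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Flag values copied from the MNNGpuMode enum in MNNForwardType.h.
constexpr int kTuningWide   = 1 << 2;
constexpr int kMemoryBuffer = 1 << 6;
constexpr int kMemoryImage  = 1 << 7;
constexpr int kRecordOp     = 1 << 8;

// Hypothetical helper: case-insensitive check for a Qualcomm vendor string.
inline bool isQualcomm(std::string vendor) {
    std::transform(vendor.begin(), vendor.end(), vendor.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::toupper(c));
                   });
    return vendor.find("QUALCOMM") != std::string::npos;
}

// Hypothetical helper: build a mode mask that enables per-op recording only
// on Qualcomm GPUs and uses whichever memory mode benchmarked faster.
inline int chooseGpuMode(const std::string& vendor, bool preferImage) {
    int mode = kTuningWide | (preferImage ? kMemoryImage : kMemoryBuffer);
    if (isQualcomm(vendor)) {
        mode |= kRecordOp;
    }
    return mode;
}

// Usage: a non-Qualcomm vendor never receives a record flag.
inline void exampleUsage() {
    assert((chooseGpuMode("ARM", true) & kRecordOp) == 0);
}
```

Keeping the vendor check separate from the mask construction makes it easy to swap in a different detection rule if the vendor string assumption does not hold on a given driver.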