Implementation:Tencent Ncnn Ruapu ISA Detection
| Knowledge Sources | |
|---|---|
| Domains | CPU Architecture, ISA Detection, Cross Platform |
| Last Updated | 2026-02-09 19:00 GMT |
Overview
Single-file, header-only C library for runtime CPU instruction set architecture (ISA) feature detection by directly probing whether specific machine-code instructions execute successfully.
Description
ruapu is a header-only library that determines which SIMD and specialized instruction sets are supported by the current processor at runtime. Unlike traditional approaches that rely on OS-provided feature flags (e.g., /proc/cpuinfo, getauxval), ruapu works by directly executing raw instruction opcodes and catching the resulting illegal-instruction signal or exception if the CPU does not support them.
The detection mechanism is platform-specific:
- On Windows (MSVC), it uses structured exception handling (__try/__except) to catch EXCEPTION_ILLEGAL_INSTRUCTION
- On Windows (other compilers), it uses AddVectoredExceptionHandler with setjmp/longjmp
- On Linux, Android, macOS, BSD, and Solaris, it installs a SIGILL/SIGSEGV signal handler via sigaction and uses sigsetjmp/siglongjmp to recover
- On bare-metal systems (e.g., SyterKit), it patches the undefined instruction handler
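The POSIX variant of this mechanism can be sketched roughly as follows. This is a minimal illustration, not ruapu's actual code: all sigdemo_ names are hypothetical, only SIGILL is handled, and the faulting probe calls raise(SIGILL) as a portable stand-in for executing an unsupported opcode.

```c
/* Request POSIX declarations under strict -std= modes. */
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Jump target used to unwind out of the signal handler. */
static sigjmp_buf sigdemo_jmpbuf;

static void sigdemo_on_sigill(int signo)
{
    (void)signo;
    /* Resume at the sigsetjmp() call in sigdemo_probe() with value 1. */
    siglongjmp(sigdemo_jmpbuf, 1);
}

/* Runs probe() with a temporary SIGILL handler installed.
 * Returns 1 if probe() returned normally, 0 if SIGILL was delivered. */
int sigdemo_probe(void (*probe)(void))
{
    struct sigaction sa, old_sa;
    int ok;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigdemo_on_sigill;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGILL, &sa, &old_sa);

    /* savemask = 1, so the signal mask is restored after siglongjmp(). */
    if (sigsetjmp(sigdemo_jmpbuf, 1) == 0)
    {
        probe();
        ok = 1;
    }
    else
    {
        ok = 0;
    }

    /* Put the previous handler back before returning. */
    sigaction(SIGILL, &old_sa, NULL);
    return ok;
}

/* Stand-ins for real instruction stubs (assumption, for portability). */
void sigdemo_probe_ok(void) {}
void sigdemo_probe_fault(void) { raise(SIGILL); }
```

Calling sigdemo_probe(sigdemo_probe_ok) yields 1 and sigdemo_probe(sigdemo_probe_fault) yields 0. Saving and restoring the previous handler keeps the probe from permanently replacing a SIGILL handler the host application may have installed.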
The RUAPU_INSTCODE macro places raw instruction bytes directly into the .text section (on Windows) or uses inline assembly (on Unix), creating callable function stubs for each ISA feature. Each stub contains the minimum instruction(s) needed to test a feature followed by a return instruction.
A static table (g_ruapu_isa_map) maps ISA name strings to their corresponding probe functions. During ruapu_init(), the library iterates over all entries, calls each probe function, and records in the g_ruapu_isa_supported array those that complete without triggering an exception.
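The table-driven pattern described above might look roughly like the sketch below. The tbl_ names, the feature names, and the probes that simply return a constant are all assumptions made for illustration; in ruapu the probes execute real instructions under a fault handler, and the actual data layout may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical probe type: returns nonzero when the feature is usable. */
typedef int (*tbl_probe_fn)(void);

struct tbl_isa_entry
{
    const char* name;
    tbl_probe_fn probe;
};

/* Stand-in probes; a real one would run an instruction and catch faults. */
static int tbl_probe_present(void) { return 1; }
static int tbl_probe_absent(void) { return 0; }

/* Static map from feature name to probe function. */
static const struct tbl_isa_entry tbl_isa_map[] = {
    { "feat_a", tbl_probe_present },
    { "feat_b", tbl_probe_absent },
    { "feat_c", tbl_probe_present },
};

#define TBL_ISA_COUNT (sizeof(tbl_isa_map) / sizeof(tbl_isa_map[0]))

/* Names of the features whose probe succeeded, null-terminated. */
static const char* tbl_isa_supported[TBL_ISA_COUNT + 1];

/* Run every probe once and collect the names that succeeded. */
void tbl_init(void)
{
    size_t i, n = 0;
    for (i = 0; i < TBL_ISA_COUNT; i++)
    {
        if (tbl_isa_map[i].probe())
            tbl_isa_supported[n++] = tbl_isa_map[i].name;
    }
    tbl_isa_supported[n] = NULL;
}

/* Linear search over the collected names; 1 if found, 0 otherwise. */
int tbl_supports(const char* name)
{
    const char* const* p;
    for (p = tbl_isa_supported; *p; p++)
    {
        if (strcmp(*p, name) == 0)
            return 1;
    }
    return 0;
}
```

Because every probe runs once inside tbl_init(), later queries are plain string comparisons and never execute a potentially faulting instruction.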
Supported architectures and instruction sets:
- x86/x86_64: MMX, SSE through SSE4.2, SSE4a, XOP, AVX, F16C, FMA, FMA4, AVX2, AVX-512 (F, BW, CD, DQ, VL, VNNI, BF16, IFMA, VBMI, VBMI2, FP16, ER), AVX-VNNI, AMX, AES-NI, SHA, and more
- AArch64: NEON, SVE, SVE2, SME, dot product, half precision, BF16, I8MM, and more
- ARM 32-bit: NEON, VFPv4, EDSP
- RISC-V: V extension, various Z extensions, T-Head vendor extensions
- MIPS: MSA
- LoongArch: LSX, LASX
- PowerPC: VSX
- s390x: zvector
- OpenRISC: Feature register probing
Usage
Use ruapu when you need reliable runtime detection of CPU ISA features, especially on platforms where OS-provided feature flags may be incomplete or unavailable (e.g., Windows ARM, bare-metal systems, older Linux kernels). Call ruapu_init() once at startup, then query any feature by name with ruapu_supports(). The ncnn CPU detection module (cpu.cpp) uses ruapu as one of its detection backends.
Code Reference
Source Location
- Repository: Tencent_Ncnn
- File: src/ruapu.h (674 lines)
- Upstream project: github.com/nihui/ruapu
Signature
#ifdef __cplusplus
extern "C" {
#endif
/* Initialize ruapu: probes all ISA features on the current CPU.
 * Must be called once before ruapu_supports() or ruapu_rua(). */
void ruapu_init();
/* Check if a specific ISA is supported.
 * Returns 1 if supported, 0 if not.
 * The isa parameter is a string like "avx2", "neon", "sve", etc. */
int ruapu_supports(const char* isa);
/* Get a null-terminated array of all supported ISA name strings. */
const char* const* ruapu_rua();
#ifdef __cplusplus
}
#endif
Import
/* Define RUAPU_IMPLEMENTATION in exactly one translation unit */
#define RUAPU_IMPLEMENTATION
#include "ruapu.h"
/* In other files, just include the header */
#include "ruapu.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| isa | const char* | Yes (for ruapu_supports) | ISA feature name string, e.g., "avx2", "neon", "sve", "sse42" |
Outputs
| Name | Type | Description |
|---|---|---|
| supported | int | 1 if the ISA is supported on the current CPU, 0 otherwise (from ruapu_supports) |
| isa_list | const char* const* | Null-terminated array of supported ISA name strings (from ruapu_rua) |
Usage Examples
Basic ISA Detection
#define RUAPU_IMPLEMENTATION
#include "ruapu.h"
#include <stdio.h>
int main()
{
    ruapu_init();
    if (ruapu_supports("avx2"))
        printf("AVX2 is supported\n");
    if (ruapu_supports("neon"))
        printf("NEON is supported\n");
    if (ruapu_supports("sse42"))
        printf("SSE4.2 is supported\n");
    return 0;
}
Enumerating All Supported ISAs
#define RUAPU_IMPLEMENTATION
#include "ruapu.h"
#include <stdio.h>
int main()
{
    ruapu_init();
    const char* const* supported = ruapu_rua();
    printf("Supported ISA features:\n");
    while (*supported)
    {
        printf("  %s\n", *supported);
        supported++;
    }
    return 0;
}
Selecting Optimized Code Path
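#include "ruapu.h"

/* Kernels implemented elsewhere, each built with the matching ISA flags. */
void process_data_avx512(float* data, int n);
void process_data_avx2(float* data, int n);
void process_data_sse42(float* data, int n);
void process_data_generic(float* data, int n);

/* Requires ruapu_init() to have been called once beforehand. */
void process_data(float* data, int n)
{
    /* Check the widest ISA first, then fall back step by step. */
    if (ruapu_supports("avx512f"))
    {
        process_data_avx512(data, n);
    }
    else if (ruapu_supports("avx2"))
    {
        process_data_avx2(data, n);
    }
    else if (ruapu_supports("sse42"))
    {
        process_data_sse42(data, n);
    }
    else
    {
        process_data_generic(data, n);
    }
}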
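The example above queries the feature by name on every call. One possible refinement, sketched here under assumptions, is to resolve the kernel once at startup and keep a function pointer. The disp_ names and kernels are hypothetical, and the query function is passed in as a parameter so the sketch compiles without ruapu.h; with ruapu, one would presumably pass ruapu_supports after calling ruapu_init().

```c
#include <string.h>

/* Hypothetical kernel and query signatures for this sketch. */
typedef void (*disp_kernel_fn)(float* data, int n);
typedef int (*disp_query_fn)(const char* isa);

/* Portable fallback: doubles every element. */
static void disp_scale_generic(float* data, int n)
{
    int i;
    for (i = 0; i < n; i++)
        data[i] *= 2.0f;
}

/* Stand-in: a real build would implement this with AVX2 intrinsics. */
static void disp_scale_avx2(float* data, int n)
{
    disp_scale_generic(data, n);
}

/* Pick a kernel once; callers keep and reuse the returned pointer. */
disp_kernel_fn disp_select_scale(disp_query_fn supports)
{
    if (supports("avx2"))
        return disp_scale_avx2;
    return disp_scale_generic;
}

/* Fake query functions standing in for a real detection backend. */
int disp_fake_has_avx2(const char* isa) { return strcmp(isa, "avx2") == 0; }
int disp_fake_has_nothing(const char* isa) { (void)isa; return 0; }
```

Storing the result of disp_select_scale() in a static variable during initialization removes the per-call string lookups from the hot path.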