Principle:Ggml org Ggml Backend Plugin Loading
| Principle Name | Backend Plugin Loading |
| Domain Tags | ML_Infrastructure, Hardware_Abstraction |
| Status | Active |
| Last Updated | 2025-05-15 12:00 GMT |
| Knowledge Sources | ggml-org/ggml repository |
Overview
Backend Plugin Loading is the architectural principle by which GGML achieves hardware abstraction through dynamic discovery and registration of compute backend shared libraries at runtime. Rather than requiring all hardware backends to be statically linked into a single monolithic binary, the system scans for and loads backend plugins as shared objects (.so on Linux/macOS, .dll on Windows), enabling flexible support for a wide array of hardware targets without recompilation.
Description
The principle addresses a fundamental challenge in ML infrastructure: supporting a large and growing set of hardware targets (CUDA, Metal, Vulkan, SYCL, OpenCL, CANN, HIP, RPC, Hexagon, MUSA, BLAS, ZenDNN, VirtGPU, CPU, and others) without forcing users to build or ship code for every possible backend. By treating backends as dynamically loadable plugins, GGML decouples the core tensor computation library from the specifics of any single hardware platform.
Each backend plugin is a shared library that exports a standard initialization entry point (ggml_backend_init) and an optional scoring function (ggml_backend_score). At load time, the system uses the operating system's dynamic linker (dlopen/dlsym on POSIX, LoadLibrary/GetProcAddress on Windows) to resolve these symbols and invoke them, obtaining a backend registration handle (ggml_backend_reg_t) that is then placed into a global registry.
The scoring mechanism enables a best-candidate selection pattern: when multiple variants of the same backend exist (e.g., different CUDA builds optimized for different GPU architectures), the loader evaluates each candidate's self-reported score and selects the highest-scoring one. A score of zero signals that the backend is not supported on the current system, allowing graceful fallback.
Usage
This principle is applied whenever an application wants to support heterogeneous hardware without compile-time coupling. Typical usage involves:
- Calling a single initialization function that triggers auto-discovery of all available backends.
- Allowing the runtime environment to determine which backends are present based on which shared libraries are installed in the search path.
- Supporting user-specified search directories and out-of-tree backends via environment variables (e.g., GGML_BACKEND_PATH).
- Supporting compile-time configuration via GGML_BACKEND_DIR to set a default search directory.
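The way these search locations combine can be sketched as a small pure function. This is an illustration only: the helper name resolve_search_dirs, its parameters, and the exact ordering (a user-supplied directory replacing the defaults, a compile-time default tried before the executable directory and the working directory) are assumptions made for the example, not a statement of GGML's actual behavior.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: decide which directories to scan for backend plugins.
// user_dir and compiled_default may be nullptr when not provided.
std::vector<std::string> resolve_search_dirs(const char * user_dir,
                                             const char * compiled_default,
                                             const char * exe_dir,
                                             const char * cwd) {
    assert(exe_dir != nullptr && cwd != nullptr);

    std::vector<std::string> dirs;
    if (user_dir != nullptr) {
        // Assumption: an explicit user directory replaces all defaults.
        dirs.push_back(user_dir);
        return dirs;
    }
    if (compiled_default != nullptr) {
        // Assumption: the compile-time default is tried first.
        dirs.push_back(compiled_default);
    }
    dirs.push_back(exe_dir);
    dirs.push_back(cwd);
    return dirs;
}
```

Passing the directories in as parameters keeps the sketch free of platform-specific code for locating the executable.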
Theoretical Basis
Plugin Registry Pattern
The backend plugin loading mechanism is an instance of the Plugin Registry design pattern. In this pattern, a central registry object maintains a collection of registered plugins, each conforming to a common interface. Plugins are discovered and loaded at runtime rather than being hard-coded at compile time. The registry provides enumeration, lookup-by-name, and lookup-by-type operations over all registered backends and their associated devices.
In GGML, the registry is implemented as a singleton ggml_backend_registry struct containing vectors of backend entries and device handles. The singleton is lazily initialized on first access, which also triggers static registration of any compile-time-linked backends.
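A lazily initialized singleton registry of this kind might look like the following minimal sketch. The type and member names (backend_reg, backend_registry, register_backend, find) are hypothetical and simplified; the sketch tracks only backend entries, not devices, and the "CPU" entry stands in for whatever compile-time-linked backends a real build would register.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a backend registration handle.
struct backend_reg {
    std::string name;
};

// Illustrative registry: one process-wide instance, created on first access.
class backend_registry {
public:
    static backend_registry & instance() {
        static backend_registry reg; // constructed on first call (thread-safe since C++11)
        return reg;
    }

    void register_backend(const backend_reg & reg) {
        assert(!reg.name.empty());
        backends_.push_back(reg);
    }

    std::size_t count() const { return backends_.size(); }

    // Lookup by name; returns nullptr when no backend matches.
    // The pointer is only valid until the next registration.
    const backend_reg * find(const std::string & name) const {
        for (const auto & b : backends_) {
            if (b.name == name) {
                return &b;
            }
        }
        return nullptr;
    }

private:
    backend_registry() {
        // Static registration of compile-time-linked backends happens here
        // (assumed for this sketch: a single CPU backend).
        backends_.push_back({"CPU"});
    }

    std::vector<backend_reg> backends_;
};
```

Because the constructor is private, the only way to reach the registry is through instance(), so static registration runs exactly once, on first access.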
Dynamic Linking and Symbol Resolution
The principle relies on the operating system's dynamic linker infrastructure:
- POSIX systems (Linux, macOS, FreeBSD): The dlopen function loads a shared object into the process address space with RTLD_NOW | RTLD_LOCAL flags, ensuring immediate symbol resolution within a local scope. dlsym then retrieves function pointers by name.
- Windows: LoadLibraryW loads a DLL, and GetProcAddress resolves exported symbols. Error dialogs for missing DLLs are suppressed via SetErrorMode(SEM_FAILCRITICALERRORS).
This two-step process -- load library, then resolve well-known symbol names -- forms the contract between the core framework and any backend plugin.
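On POSIX systems, the load-then-resolve sequence can be sketched as follows. The helper load_entry_point and the init_fn_t signature are assumptions for illustration (the real entry point returns a backend registration handle rather than void *), and the Windows path is omitted.

```cpp
#include <cassert>
#include <cstdio>
#include <dlfcn.h>

// Assumed signature for illustration: an entry point returning an opaque handle.
using init_fn_t = void * (*)();

// Load a shared object and resolve one well-known symbol from it.
// Returns nullptr if the library cannot be loaded or lacks the symbol.
init_fn_t load_entry_point(const char * path, const char * symbol) {
    assert(path != nullptr && symbol != nullptr);

    // Step 1: load the library with immediate, local symbol resolution.
    void * handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char * err = dlerror();
        std::fprintf(stderr, "failed to load %s: %s\n", path, err ? err : "unknown error");
        return nullptr;
    }

    // Step 2: resolve the well-known symbol name.
    void * sym = dlsym(handle, symbol);
    if (sym == nullptr) {
        std::fprintf(stderr, "symbol %s not found in %s\n", symbol, path);
        dlclose(handle);
        return nullptr;
    }

    // The handle is deliberately left open: the returned function lives in the library.
    return reinterpret_cast<init_fn_t>(sym);
}
```

A caller would invoke the returned pointer to obtain the registration handle. On older glibc versions the program may need to be linked with -ldl.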
Auto-Discovery and Best-Candidate Selection
The auto-discovery mechanism scans configurable search paths (the executable directory, the current working directory, or a user-supplied path) for files matching a naming convention:
- Linux/macOS: libggml-{name}-*.so or libggml-{name}.so
- Windows: ggml-{name}-*.dll or ggml-{name}.dll
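The naming convention can be checked with plain string operations. The sketch below covers only the Linux/macOS form; the function name matches_backend_file is hypothetical, and a Windows variant would drop the lib prefix and use the .dll extension.

```cpp
#include <cassert>
#include <string>

// Returns true if `filename` is "libggml-<name>.so" or "libggml-<name>-<variant>.so".
bool matches_backend_file(const std::string & filename, const std::string & name) {
    assert(!name.empty());

    const std::string prefix = "libggml-" + name;
    const std::string ext    = ".so";

    if (filename.size() < prefix.size() + ext.size()) {
        return false;
    }
    if (filename.compare(0, prefix.size(), prefix) != 0) {
        return false;
    }
    if (filename.compare(filename.size() - ext.size(), ext.size(), ext) != 0) {
        return false;
    }

    // What remains between prefix and extension must be empty (the base library)
    // or begin with '-' (a variant suffix).
    const std::string middle =
        filename.substr(prefix.size(), filename.size() - prefix.size() - ext.size());
    return middle.empty() || middle[0] == '-';
}
```

The check on the middle part is what keeps a name such as libggml-cudax.so from being treated as a variant of the cuda backend.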
When multiple files match a given backend name, each is temporarily loaded and its ggml_backend_score function is called. The file yielding the highest score is selected and fully registered. If no scored variant is found, the base library (without a suffix) is attempted as a fallback.
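The selection step can be sketched without any real dynamic loading by letting a callable stand in for each candidate's exported score function. The candidate struct and select_best are hypothetical names, and the file names in the usage note are made up for the example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical candidate: a file path plus a callable standing in for
// the score function exported by that file.
struct candidate {
    std::string path;
    std::function<int()> score;
};

// Return the path of the highest-scoring candidate. A score of zero marks a
// variant that is unsupported on this system. If nothing scores above zero,
// fall back to `base_path` (the library without a variant suffix).
std::string select_best(const std::vector<candidate> & candidates,
                        const std::string & base_path) {
    int best_score = 0;
    std::string best_path;

    for (const auto & c : candidates) {
        assert(c.score && "candidate must provide a score function");
        const int s = c.score();
        if (s > best_score) {
            best_score = s;
            best_path  = c.path;
        }
    }
    return best_path.empty() ? base_path : best_path;
}
```

Given three variants scoring 0, 20, and 10, the function returns the path of the variant that scored 20; if every variant scores 0, it returns the base library path.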
API Version Compatibility
Each backend plugin carries an api_version field in its registration structure. The loader verifies this against GGML_BACKEND_API_VERSION (currently version 2) to ensure binary compatibility. Plugins compiled against an incompatible API version are rejected with an error, preventing undefined behavior from ABI mismatches.
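The version check amounts to a single comparison, as the following sketch shows. The plugin_reg struct is a hypothetical, trimmed-down stand-in for the real registration structure, and the constant simply reuses the version number quoted above.

```cpp
#include <cassert>
#include <cstdio>

// Version number taken from the text above; the constant name is illustrative.
constexpr int BACKEND_API_VERSION = 2;

// Hypothetical, trimmed-down registration structure.
struct plugin_reg {
    int          api_version;
    const char * name;
};

// Accept a plugin only if it was built against exactly the expected API version.
bool is_compatible(const plugin_reg * reg) {
    if (reg == nullptr) {
        return false;
    }
    assert(reg->name != nullptr);
    if (reg->api_version != BACKEND_API_VERSION) {
        std::fprintf(stderr, "backend %s: API version %d, expected %d\n",
                     reg->name, reg->api_version, BACKEND_API_VERSION);
        return false;
    }
    return true;
}
```

The sketch uses an exact-match comparison rather than a minimum-version check, so it rejects both older and newer plugins; the exact policy is an assumption of this example.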