Implementation:Ggml org Ggml Metal backend
Metadata
| Field | Value |
|---|---|
| Page Type | Implementation (API Doc) |
| Knowledge Sources | GGML |
| Domains | ML_Infrastructure, Tensor_Computing, GPU_Computing |
| Last Updated | 2025-05-15 12:00 GMT |
Overview
Implements the GGML backend interface for Apple Metal, bridging the generic backend API to Metal-specific buffer management, device enumeration, and compute dispatch.
Description
ggml-metal.cpp is the main entry point for Metal backend integration. It implements the complete backend lifecycle:
- Buffer types: Three buffer memory models are supported:
  - Shared (CPU-accessible via ggml_backend_metal_buffer_shared_i) -- memory visible to both CPU and GPU
  - Private (GPU-only via ggml_backend_metal_buffer_private_i) -- faster GPU memory not directly accessible from CPU
  - Mapped -- shared memory with memory-mapped semantics
  - Each buffer type provides a full set of interface callbacks: free_buffer, get_base, memset_tensor, set_tensor, get_tensor, cpy_tensor, and clear. All delegate to the underlying ggml_metal_buffer_* functions with appropriate assertions.
- Device management: Supports up to 16 Metal devices (GGML_METAL_MAX_DEVICES). The GGML_METAL_DEVICES environment variable can override the device count to simulate virtual devices.
- Backend registration: The ggml_backend_metal_reg function returns the backend registration handle, wiring up buffer type management and device enumeration.
- Graph computation: Delegates to ggml-metal-ops.cpp for the actual kernel dispatch on the GPU.
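The device-count override described above can be illustrated with a small standalone sketch. It is not taken from ggml-metal.cpp: the helper name resolve_device_count, the constant kMaxDevices, and the parsing rules (ignore malformed or non-positive values, clamp to the maximum) are assumptions chosen for illustration, and the real backend may treat edge cases differently.

```cpp
#include <cstdlib>  // std::strtol, std::getenv

// Mirrors the documented limit of 16 devices (GGML_METAL_MAX_DEVICES).
// The constant name is hypothetical.
constexpr int kMaxDevices = 16;

// Hypothetical helper: choose a device count from the raw text of an override
// variable (nullptr when the variable is unset) and a default count.
int resolve_device_count(const char * override_value, int default_count) {
    int count = default_count;
    if (override_value != nullptr) {
        char * end = nullptr;
        const long parsed = std::strtol(override_value, &end, 10);
        // Accept only a fully parsed, positive integer; otherwise keep the default.
        if (end != override_value && *end == '\0' && parsed > 0) {
            count = parsed > kMaxDevices ? kMaxDevices : static_cast<int>(parsed);
        }
    }
    // Keep the result inside [1, kMaxDevices] whatever the default was.
    if (count < 1)           count = 1;
    if (count > kMaxDevices) count = kMaxDevices;
    return count;
}
```

Under these assumptions, a caller would write resolve_device_count(std::getenv("GGML_METAL_DEVICES"), 1) to obtain the number of devices to expose.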
Usage
Users initialize the Metal backend by calling ggml_backend_metal_init() (declared in include/ggml-metal.h). The backend can then be passed to the GGML scheduler for graph computation. Typically, backend discovery is handled automatically by ggml_backend_load_all().
Code Reference
Source Location
GGML repo, file: src/ggml-metal/ggml-metal.cpp (937 lines).
Signatures
ggml_backend_t ggml_backend_metal_init(void);
ggml_backend_reg_t ggml_backend_metal_reg(void);
// Buffer type accessors:
static ggml_backend_buffer_type_t ggml_backend_metal_buffer_type_shared(int device);
static ggml_backend_buffer_type_t ggml_backend_metal_buffer_type_private(int device);
static ggml_backend_buffer_type_t ggml_backend_metal_buffer_type_mapped(int device);
// Buffer interface callbacks (shared):
static void ggml_backend_metal_buffer_shared_free_buffer(ggml_backend_buffer_t buffer);
static void * ggml_backend_metal_buffer_shared_get_base(ggml_backend_buffer_t buffer);
static void ggml_backend_metal_buffer_shared_set_tensor(ggml_backend_buffer_t buffer, ggml_tensor * tensor, const void * data, size_t offset, size_t size);
static void ggml_backend_metal_buffer_shared_get_tensor(ggml_backend_buffer_t buffer, const ggml_tensor * tensor, void * data, size_t offset, size_t size);
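The shared-buffer callbacks listed above follow a common pattern: each one checks a precondition and then forwards to a lower-level function, and the callbacks are collected in an interface table. The sketch below models that pattern with toy types only. Every name in it (toy_buffer, toy_buffer_i, toy_shared_*) is hypothetical and none of it is the ggml API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a backend buffer; not a ggml type.
struct toy_buffer {
    bool                 is_shared;  // true when the CPU may touch the memory directly
    std::vector<uint8_t> storage;    // backing bytes
};

// Lower-level functions that do the actual work.
void * toy_buffer_get_base(toy_buffer & buf) { return buf.storage.data(); }
void   toy_buffer_clear(toy_buffer & buf, uint8_t value) {
    std::memset(buf.storage.data(), value, buf.storage.size());
}

// Interface table: one function pointer per callback.
struct toy_buffer_i {
    void * (*get_base)(toy_buffer & buf);
    void   (*clear)(toy_buffer & buf, uint8_t value);
};

// Shared-buffer callbacks: assert the precondition, then delegate.
void * toy_shared_get_base(toy_buffer & buf) {
    assert(buf.is_shared);
    return toy_buffer_get_base(buf);
}
void toy_shared_clear(toy_buffer & buf, uint8_t value) {
    assert(buf.is_shared);
    toy_buffer_clear(buf, value);
}

// The table that generic code would call through.
const toy_buffer_i toy_shared_i = { toy_shared_get_base, toy_shared_clear };
```

Generic code then calls toy_shared_i.clear(buf, 0) without knowing which memory model sits behind the table, which is the role the per-type interface structures play in the description above.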
Import
#include "ggml-metal.h"
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| (none for init) | -- | -- | ggml_backend_metal_init takes no parameters; it initializes the default Metal device. |
| buffer | ggml_backend_buffer_t | Yes | Backend buffer handle for buffer operations (set_tensor, get_tensor, etc.). |
| tensor | ggml_tensor * | Yes | Target tensor for data transfer operations. |
| data | const void * | Yes | Source or destination data pointer for host-device transfers. |
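To make the roles of data, offset, and size in the set_tensor and get_tensor signatures concrete, here is a minimal sketch for CPU-visible memory, where a transfer reduces to a bounds-checked byte copy. The toy_tensor type and both functions are hypothetical. The sketch assumes offset and size are byte counts relative to the start of the tensor's data, and it does not show how the Metal backend handles private (GPU-only) buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a tensor whose bytes live in CPU-visible memory.
struct toy_tensor {
    void * data;    // start of the tensor's bytes
    size_t nbytes;  // total size of the tensor in bytes
};

// Host -> tensor: copy `size` bytes from `data` into the tensor at byte `offset`.
void toy_set_tensor(toy_tensor & t, const void * data, size_t offset, size_t size) {
    assert(offset <= t.nbytes && size <= t.nbytes - offset);  // stay inside the tensor
    std::memcpy(static_cast<char *>(t.data) + offset, data, size);
}

// Tensor -> host: copy `size` bytes starting at byte `offset` into `data`.
void toy_get_tensor(const toy_tensor & t, void * data, size_t offset, size_t size) {
    assert(offset <= t.nbytes && size <= t.nbytes - offset);  // stay inside the tensor
    std::memcpy(data, static_cast<const char *>(t.data) + offset, size);
}
```

Writing two floats at byte offset 2 * sizeof(float) into a four-float tensor leaves the first two elements untouched, and reading back with the same offset and size returns the values just written.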
Outputs
| Output | Type | Description |
|---|---|---|
| Backend handle | ggml_backend_t | Opaque backend handle for use with the GGML scheduler and graph computation. |
| Registration handle | ggml_backend_reg_t | Backend registration for the auto-discovery system. |
Usage Examples
#include "ggml-metal.h"
#include "ggml-backend.h"
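// Initialize the Metal backend; check the handle before using it
ggml_backend_t metal_backend = ggml_backend_metal_init();
if (metal_backend == NULL) {
    // Initialization failed (for example, no usable Metal device); handle the error here
}
// Use with scheduler
ggml_backend_sched_t sched = ggml_backend_sched_new(
    &metal_backend, NULL, 1, GGML_DEFAULT_GRAPH_SIZE, false);
// Compute a graph (graph is assumed to be a struct ggml_cgraph * built elsewhere)
ggml_backend_sched_graph_compute(sched, graph);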
// Cleanup
ggml_backend_sched_free(sched);
ggml_backend_free(metal_backend);