Implementation:Ggml org Ggml Metal backend

From Leeroopedia


Metadata

Field | Value
Page Type | Implementation (API Doc)
Knowledge Sources | GGML
Domains | ML_Infrastructure, Tensor_Computing, GPU_Computing
Last Updated | 2025-05-15 12:00 GMT

Overview

Implements the GGML backend interface for Apple Metal, bridging the generic backend API to Metal-specific buffer management, device enumeration, and compute dispatch.

Description

ggml-metal.cpp is the main entry point for Metal backend integration. It implements the complete backend lifecycle:

  1. Buffer types: Three buffer memory models are supported:
    • Shared (CPU-accessible via ggml_backend_metal_buffer_shared_i) -- memory visible to both CPU and GPU
    • Private (GPU-only via ggml_backend_metal_buffer_private_i) -- faster GPU memory not directly accessible from CPU
    • Mapped -- shared memory with memory-mapped semantics, typically wrapping an existing host allocation rather than allocating new storage
    Each buffer type provides a full set of interface callbacks: free_buffer, get_base, memset_tensor, set_tensor, get_tensor, cpy_tensor, and clear. All delegate to the underlying ggml_metal_buffer_* functions with appropriate assertions.
  2. Device management: Supports up to 16 Metal devices (GGML_METAL_MAX_DEVICES). The GGML_METAL_DEVICES environment variable can override the device count to simulate virtual devices.
  3. Backend registration: The ggml_backend_metal_reg function returns the backend registration handle, wiring up buffer type management and device enumeration.
  4. Graph computation: Delegates to ggml-metal-ops.cpp for the actual kernel dispatch on the GPU.
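
The device-count override in item 2 can be pictured with a small sketch. This is not the backend's actual parsing code, which is not shown on this page. It assumes the variable holds a plain decimal integer, that invalid values fall back to a default, and that valid values are clamped to the 16-device limit. The names kMaxDevices, parse_device_count, and device_count_from_env are hypothetical and exist only in this sketch.

```cpp
#include <algorithm>
#include <cstdlib>

// Assumption: mirrors the 16-device limit described above. The constant name
// is local to this sketch and is not the real GGML_METAL_MAX_DEVICES macro.
constexpr int kMaxDevices = 16;

// Parse an override string such as the value of GGML_METAL_DEVICES.
// Returns `fallback` when the value is missing or is not a positive decimal
// integer; otherwise returns the value clamped to [1, kMaxDevices].
int parse_device_count(const char * value, int fallback) {
    if (value == nullptr || *value == '\0') {
        return fallback;
    }
    char * end = nullptr;
    const long n = std::strtol(value, &end, 10);
    if (end == value || *end != '\0' || n < 1) {
        return fallback; // trailing garbage, non-numeric input, or n <= 0
    }
    return static_cast<int>(std::min<long>(n, kMaxDevices));
}

// Read the override from the environment (hypothetical helper);
// defaults to a single device when the variable is unset or invalid.
int device_count_from_env() {
    return parse_device_count(std::getenv("GGML_METAL_DEVICES"), 1);
}
```

Keeping the parsing separate from the std::getenv call makes the clamping rule easy to check without touching the process environment.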

Usage

Users initialize the Metal backend by calling ggml_backend_metal_init() (declared in include/ggml-metal.h). The backend can then be passed to the GGML scheduler for graph computation. Typically, backend discovery is handled automatically by ggml_backend_load_all().

Code Reference

Source Location

GGML repo, file: src/ggml-metal/ggml-metal.cpp (937 lines).

Signatures

ggml_backend_t ggml_backend_metal_init(void);
ggml_backend_reg_t ggml_backend_metal_reg(void);

// Buffer type accessors:
static ggml_backend_buffer_type_t ggml_backend_metal_buffer_type_shared(int device);
static ggml_backend_buffer_type_t ggml_backend_metal_buffer_type_private(int device);
static ggml_backend_buffer_type_t ggml_backend_metal_buffer_type_mapped(int device);

// Buffer interface callbacks (shared):
static void ggml_backend_metal_buffer_shared_free_buffer(ggml_backend_buffer_t buffer);
static void * ggml_backend_metal_buffer_shared_get_base(ggml_backend_buffer_t buffer);
static void ggml_backend_metal_buffer_shared_set_tensor(ggml_backend_buffer_t buffer, ggml_tensor * tensor, const void * data, size_t offset, size_t size);
static void ggml_backend_metal_buffer_shared_get_tensor(ggml_backend_buffer_t buffer, const ggml_tensor * tensor, void * data, size_t offset, size_t size);
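
The callbacks listed above are grouped into a per-buffer-type interface table, and generic code calls through that table without knowing which memory model backs the buffer. The following self-contained sketch illustrates that dispatch pattern only. Every type and function in it (sketch_buffer, sketch_buffer_iface, make_shared_buffer, and so on) is a hypothetical stand-in, not a ggml type. The behavior of the private variant (no host pointer, a counter standing in for GPU work) is an assumption made for illustration and may differ from the real backend.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct sketch_buffer; // hypothetical stand-in for a backend buffer

// Table of callbacks, one instance per buffer type.
struct sketch_buffer_iface {
    void   (*free_buffer)(sketch_buffer * buf);
    void * (*get_base)   (sketch_buffer * buf);
    void   (*clear)      (sketch_buffer * buf, std::uint8_t value);
};

struct sketch_buffer {
    sketch_buffer_iface       iface;
    std::vector<std::uint8_t> storage; // stands in for the Metal allocation
    int                       gpu_ops; // illustrative counter of simulated GPU commands
};

static void common_free(sketch_buffer * buf) { delete buf; }

// Shared variant: memory is host-visible, so the CPU touches it directly.
static void * shared_get_base(sketch_buffer * buf) { return buf->storage.data(); }
static void shared_clear(sketch_buffer * buf, std::uint8_t value) {
    std::fill(buf->storage.begin(), buf->storage.end(), value);
}

// Private variant (assumed behavior): no host pointer is exposed, and a clear
// is counted as one GPU command instead of a direct CPU write.
static void * private_get_base(sketch_buffer *) { return nullptr; }
static void private_clear(sketch_buffer * buf, std::uint8_t value) {
    std::fill(buf->storage.begin(), buf->storage.end(), value);
    buf->gpu_ops++;
}

sketch_buffer * make_shared_buffer(std::size_t size) {
    return new sketch_buffer{ {common_free, shared_get_base, shared_clear},
                              std::vector<std::uint8_t>(size), 0 };
}

sketch_buffer * make_private_buffer(std::size_t size) {
    return new sketch_buffer{ {common_free, private_get_base, private_clear},
                              std::vector<std::uint8_t>(size), 0 };
}

// Generic entry points: callers never branch on the buffer type.
void * buffer_get_base(sketch_buffer * buf) { return buf->iface.get_base(buf); }
void   buffer_clear(sketch_buffer * buf, std::uint8_t v) { buf->iface.clear(buf, v); }
void   buffer_free(sketch_buffer * buf) { buf->iface.free_buffer(buf); }
```

A table of function pointers keeps the generic layer free of per-type branches, so a new memory model only needs a new table.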

Import

#include "ggml-metal.h"

I/O Contract

Inputs

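Parameter | Type | Required | Description
(none for init) | -- | -- | ggml_backend_metal_init takes no parameters; it initializes the default Metal device.
buffer | ggml_backend_buffer_t | Yes | Backend buffer handle for buffer operations (set_tensor, get_tensor, etc.).
tensor | ggml_tensor * | Yes | Target tensor for data transfer operations (const ggml_tensor * for get_tensor).
data | const void * (set_tensor), void * (get_tensor) | Yes | Source pointer for set_tensor or destination pointer for get_tensor.
offset | size_t | Yes | Byte offset into the tensor's data at which the transfer starts.
size | size_t | Yes | Number of bytes to transfer.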
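
The set_tensor and get_tensor parameters describe a byte range inside a tensor's data. The sketch below shows that contract for host-visible memory, assuming a plain memcpy is sufficient. The sketch_tensor type and the functions are hypothetical, and unlike the real callbacks (which return void) they return a bool so an out-of-range request is reported instead of asserting.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical minimal tensor: only the fields this sketch needs.
struct sketch_tensor {
    void *      data;   // start of the tensor's bytes inside the buffer
    std::size_t nbytes; // total size of the tensor in bytes
};

// True when [offset, offset + size) lies inside the tensor.
// Written so that offset + size cannot overflow.
bool range_ok(const sketch_tensor * t, std::size_t offset, std::size_t size) {
    return offset <= t->nbytes && size <= t->nbytes - offset;
}

// Host -> tensor: copy `size` bytes from `data`, starting `offset` bytes in.
bool sketch_set_tensor(sketch_tensor * t, const void * data,
                       std::size_t offset, std::size_t size) {
    if (!range_ok(t, offset, size)) {
        return false;
    }
    std::memcpy(static_cast<char *>(t->data) + offset, data, size);
    return true;
}

// Tensor -> host: copy `size` bytes into `data`, starting `offset` bytes in.
bool sketch_get_tensor(const sketch_tensor * t, void * data,
                       std::size_t offset, std::size_t size) {
    if (!range_ok(t, offset, size)) {
        return false;
    }
    std::memcpy(data, static_cast<const char *>(t->data) + offset, size);
    return true;
}
```

Both offset and size are counted in bytes, not elements, so writing the last two floats of a four-float tensor uses an offset of 2 * sizeof(float).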

Outputs

Output | Type | Description
Backend handle | ggml_backend_t | Opaque backend handle for use with the GGML scheduler and graph computation.
Registration handle | ggml_backend_reg_t | Backend registration for the auto-discovery system.

Usage Examples

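#include <stdio.h>

#include "ggml-metal.h"
#include "ggml-backend.h"

// Fragment: assumed to run inside a function such as main(), with
// `graph` (a struct ggml_cgraph *) already built elsewhere.

// Initialize the Metal backend; NULL indicates that initialization failed
ggml_backend_t metal_backend = ggml_backend_metal_init();
if (metal_backend == NULL) {
    fprintf(stderr, "failed to initialize the Metal backend\n");
    return 1;
}

// Use with scheduler: backends, buffer types (NULL selects the defaults),
// backend count, graph size, parallel flag.
// Note: the parameter list of ggml_backend_sched_new differs between ggml
// versions, and some versions expect a CPU backend as the last entry;
// check ggml-backend.h for the version in use.
ggml_backend_sched_t sched = ggml_backend_sched_new(
    &metal_backend, NULL, 1, GGML_DEFAULT_GRAPH_SIZE, false);

// Compute a graph
ggml_backend_sched_graph_compute(sched, graph);

// Cleanup: free the scheduler before the backend it references
ggml_backend_sched_free(sched);
ggml_backend_free(metal_backend);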
