Implementation:Ggml org Ggml Ggml gallocr alloc graph
Summary
ggml_gallocr_alloc_graph allocates backend buffer memory for all tensors in a computation graph, applying memory reuse optimization so that intermediate tensors with non-overlapping lifetimes share the same memory regions.
API
bool ggml_gallocr_alloc_graph(ggml_gallocr_t galloc, struct ggml_cgraph * graph);
Parameters
| Parameter | Type | Description |
|---|---|---|
| galloc | ggml_gallocr_t | Graph allocator handle (single-buffer or multi-buffer). |
| graph | struct ggml_cgraph * | The computation graph whose tensors need memory allocation. |
Return Value
Returns bool: true on success, false on failure (for example, when the automatic reservation of the backing buffer does not succeed).
Source
- Repository: https://github.com/ggml-org/ggml
- File: src/ggml-alloc.c, lines 1055–1101
Behavior
- Accepts a graph allocator and a computation graph.
- If the allocator is configured for a single buffer and has not yet been reserved, it auto-reserves by calling the internal reservation routine to compute the memory plan and allocate the backing buffer.
- Walks the computation graph and assigns each tensor to its pre-planned offset within the allocated buffer(s).
- Tensors whose lifetimes do not overlap are assigned to overlapping memory regions, enabling memory reuse and reducing peak memory consumption.
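The memory-reuse idea in the last bullet can be illustrated with a small, self-contained sketch. It is a toy model only and is not ggml's actual planning algorithm: the struct toy_tensor type, its fields, and the toy_plan function are hypothetical names introduced here, and the greedy first-fit strategy is an assumption chosen for brevity. Each tensor carries a size and a lifetime (the graph step that produces it and the last step that reads it); two tensors may share bytes only if their lifetimes do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy descriptor for illustration; not a ggml type. */
struct toy_tensor {
    size_t size;      /* bytes required by the tensor */
    int    first_use; /* index of the graph step that produces it */
    int    last_use;  /* index of the last graph step that reads it */
    size_t offset;    /* output: offset assigned inside the shared buffer */
};

/* Two lifetimes overlap if neither one ends before the other begins. */
static int lifetimes_overlap(const struct toy_tensor *a,
                             const struct toy_tensor *b) {
    return a->first_use <= b->last_use && b->first_use <= a->last_use;
}

/* Two byte ranges [off, off + size) overlap if each starts before the other ends. */
static int ranges_overlap(size_t a_off, size_t a_size,
                          size_t b_off, size_t b_size) {
    return a_off < b_off + b_size && b_off < a_off + a_size;
}

/*
 * Greedy first-fit planning (an assumed strategy, not ggml's):
 * place each tensor at offset 0, then push it past every already-placed
 * tensor that is alive at the same time and occupies the same bytes.
 * Returns the peak buffer size needed for the whole plan.
 */
static size_t toy_plan(struct toy_tensor *t, int n) {
    assert(n >= 0 && (t != NULL || n == 0));
    size_t peak = 0;
    for (int i = 0; i < n; i++) {
        size_t off = 0;
        int moved = 1;
        while (moved) {
            moved = 0;
            for (int j = 0; j < i; j++) {
                if (lifetimes_overlap(&t[i], &t[j]) &&
                    ranges_overlap(off, t[i].size, t[j].offset, t[j].size)) {
                    off = t[j].offset + t[j].size; /* skip past the conflict */
                    moved = 1;
                }
            }
        }
        t[i].offset = off;
        if (off + t[i].size > peak) {
            peak = off + t[i].size;
        }
    }
    return peak;
}
```

For three tensors of 100, 50, and 100 bytes with lifetimes [0,1], [1,2], and [2,3], this sketch places the third tensor at offset 0, on top of the first one, so the plan needs 150 bytes rather than the 250 bytes required without reuse.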
Dependencies
- Header: ggml-alloc.h
- Import: #include "ggml-alloc.h"
Related Functions
ggml_backend_sched_alloc_graph
For multi-backend scenarios (e.g., splitting work across CPU and GPU), a higher-level scheduler entry point is provided:
bool ggml_backend_sched_alloc_graph(ggml_backend_sched_t sched, struct ggml_cgraph * graph);
- File: src/ggml-backend.cpp, lines 1768–1785
- This function coordinates allocation across multiple backends, delegating per-backend allocation to ggml_gallocr_alloc_graph internally.