Implementation: ggml_init (GGML)
Metadata
| Field | Value |
|---|---|
| Page Type | Implementation (API Doc) |
| Knowledge Sources | GGML |
| Domains | ML_Infrastructure, Tensor_Computing |
| Last Updated | 2025-05-15 12:00 GMT |
Overview
Concrete tool for initializing a tensor metadata memory context (arena) provided by the GGML library.
Description
ggml_init is the primary entry point for creating a ggml_context, the arena-based memory pool that holds tensor metadata in GGML. It allocates (or accepts) a contiguous memory block and returns an opaque context pointer through which all subsequent tensor creation operations are performed.
The initialization process works as follows:
- Parameter validation: The function receives a `ggml_init_params` struct specifying the desired memory pool size, an optional external buffer, and allocation behavior flags.
- Memory acquisition: If `mem_buffer` is `NULL`, the function internally allocates `mem_size` bytes via the system allocator. If `mem_buffer` is provided, the function uses that externally managed buffer as the arena, avoiding any internal allocation.
- Context initialization: Internal bookkeeping structures are set up, including the arena offset pointer (initially at the beginning of the pool), the total capacity, and the `no_alloc` flag.
- Return: A pointer to the initialized `ggml_context` is returned. All subsequent calls to `ggml_new_tensor` and related functions use this context to bump-allocate tensor metadata from the arena.
When the context is no longer needed, it is destroyed with ggml_free(), which releases the entire arena in a single operation -- freeing all tensor metadata it contains regardless of the number of tensors created.
The no_alloc flag is critical for backend-managed workflows: when set to true, the context arena stores only tensor metadata (shape, dtype, strides, name, graph linkage), and the actual tensor data buffers are allocated separately by a backend allocator (e.g., for GPU memory). This allows a small CPU-resident metadata arena to manage tensors whose data resides in device memory.
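For metadata-only contexts, the arena only needs room for the fixed per-tensor bookkeeping. A common sizing pattern uses `ggml_tensor_overhead()`, which GGML exposes for this purpose (the tensor count of 128 here is an arbitrary illustration):

```c
#include "ggml.h"

// Size the pool for a known number of tensors; each tensor's metadata
// occupies a fixed overhead reported by ggml_tensor_overhead().
size_t n_tensors = 128;
struct ggml_init_params params = {
    .mem_size   = n_tensors * ggml_tensor_overhead(),
    .mem_buffer = NULL,
    .no_alloc   = true, // data buffers come from a backend allocator
};
```

This keeps the CPU-resident arena small even when the tensors themselves are large and live in device memory.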
Code Reference
Source Location
GGML repo, file: src/ggml.c, lines L1522-1562.
Signature
```c
struct ggml_context * ggml_init(struct ggml_init_params params);
```
Import
```c
#include "ggml.h"
```
Dependencies
`ggml.h` -- public header declaring `ggml_init`, `ggml_init_params`, and `ggml_context`.
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| `params.mem_size` | `size_t` | Yes | Total size in bytes of the context memory pool. Determines the maximum amount of tensor metadata (and optionally tensor data) that can be allocated within this context. |
| `params.mem_buffer` | `void *` | No | Optional pointer to an externally managed memory buffer. If `NULL`, `ggml_init` allocates the buffer internally. If non-`NULL`, the provided buffer is used as the arena (the caller retains ownership of the underlying memory). |
| `params.no_alloc` | `bool` | No | When `true`, tensor data storage is not allocated within the context arena; only tensor metadata structures are placed in the pool, and tensor data must be allocated separately via a backend allocator. When `false`, tensor data is co-located with metadata in the arena. |
Outputs
| Output | Type | Description |
|---|---|---|
| Context pointer | `struct ggml_context *` | An opaque pointer to the initialized memory context, used as the first argument to all tensor creation functions (`ggml_new_tensor`, `ggml_new_tensor_1d`, etc.). Returns `NULL` on failure (e.g., if internal allocation fails or the maximum number of contexts is exceeded). |
Usage Examples
Basic Context for Tensor Metadata and Data
```c
#include "ggml.h"

// Allocate a context with 64 MB for both metadata and tensor data
struct ggml_init_params params = {
    .mem_size   = 64 * 1024 * 1024, // 64 MB
    .mem_buffer = NULL,             // let ggml allocate internally
    .no_alloc   = false,            // tensor data stored in the arena
};
struct ggml_context * ctx = ggml_init(params);

// Create tensors -- metadata and data are both in the arena
struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 768, 768);
struct ggml_tensor * b = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 768, 768);

// ... use tensors for computation ...

// Free the entire context (all tensor metadata and data) in one call
ggml_free(ctx);
```
Metadata-Only Context with Backend Allocator
```c
#include "ggml.h"

// Small context for metadata only (tensor data managed by backend)
struct ggml_init_params params = {
    .mem_size   = 1 * 1024 * 1024, // 1 MB for metadata
    .mem_buffer = NULL,            // let ggml allocate internally
    .no_alloc   = true,            // skip tensor data allocation
};
struct ggml_context * ctx = ggml_init(params);

// Create tensor metadata -- no data is allocated in the arena
struct ggml_tensor * weight = ggml_new_tensor_2d(ctx, GGML_TYPE_F16, 4096, 4096);

// Tensor data is allocated separately by the backend:
// ggml_backend_alloc_ctx_tensors(ctx, backend);

// ... perform computation using backend ...

ggml_free(ctx);
```
Using an External Memory Buffer
```c
#include "ggml.h"
#include <stdlib.h>

// Provide a pre-allocated buffer as the arena
size_t buf_size = 16 * 1024 * 1024; // 16 MB
void * buf = malloc(buf_size);

struct ggml_init_params params = {
    .mem_size   = buf_size,
    .mem_buffer = buf,   // use external buffer
    .no_alloc   = false,
};
struct ggml_context * ctx = ggml_init(params);

// Create tensors within the externally provided buffer
struct ggml_tensor * t = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1024);

// ... use tensor ...

// Free the context (does not free the external buffer)
ggml_free(ctx);

// Caller is responsible for freeing the external buffer
free(buf);
```