
Implementation: ggml_init (ggml-org/ggml)

Metadata

Field              Value
Page Type          Implementation (API Doc)
Knowledge Sources  GGML
Domains            ML_Infrastructure, Tensor_Computing
Last Updated       2025-05-15 12:00 GMT

Overview

The GGML library's concrete entry point for initializing a tensor-metadata memory context (arena).

Description

ggml_init is the primary entry point for creating a ggml_context, the arena-based memory pool that holds tensor metadata in GGML. It allocates (or accepts) a contiguous memory block and returns an opaque context pointer through which all subsequent tensor creation operations are performed.

The initialization process works as follows:

  1. Parameter validation: The function receives a ggml_init_params struct specifying the desired memory pool size, an optional external buffer, and allocation behavior flags.
  2. Memory acquisition: If mem_buffer is NULL, the function internally allocates mem_size bytes via the system allocator. If mem_buffer is provided, the function uses that externally managed buffer as the arena, avoiding any internal allocation.
  3. Context initialization: Internal bookkeeping structures are set up, including the arena offset pointer (initially at the beginning of the pool), the total capacity, and the no_alloc flag.
  4. Return: A pointer to the initialized ggml_context is returned. All subsequent calls to ggml_new_tensor and related functions use this context to bump-allocate tensor metadata from the arena.

When the context is no longer needed, it is destroyed with ggml_free(), which releases the entire arena in a single operation -- freeing all tensor metadata it contains regardless of the number of tensors created.

The no_alloc flag is critical for backend-managed workflows: when set to true, the context arena stores only tensor metadata (shape, dtype, strides, name, graph linkage), and the actual tensor data buffers are allocated separately by a backend allocator (e.g., for GPU memory). This allows a small CPU-resident metadata arena to manage tensors whose data resides in device memory.

Code Reference

Source Location

GGML repository, file src/ggml.c, lines 1522-1562.

Signature

struct ggml_context * ggml_init(struct ggml_init_params params);

Import

#include "ggml.h"

Dependencies

  • ggml.h -- public header declaring ggml_init, ggml_init_params, and ggml_context.

I/O Contract

Inputs

params.mem_size (size_t, required)
    Total size in bytes of the context memory pool. Determines the maximum amount of tensor metadata (and, optionally, tensor data) that can be allocated within this context.
params.mem_buffer (void *, optional)
    Pointer to an externally managed memory buffer. If NULL, ggml_init allocates the buffer internally. If non-NULL, the provided buffer is used as the arena and the caller retains ownership of the underlying memory.
params.no_alloc (bool, optional)
    When true, tensor data storage is not allocated within the context arena: only tensor metadata structures are placed in the pool, and tensor data must be allocated separately via a backend allocator. When false, tensor data is co-located with metadata in the arena.

Outputs

Return value (struct ggml_context *)
    An opaque pointer to the initialized memory context, used as the first argument to all tensor creation functions (ggml_new_tensor, ggml_new_tensor_1d, etc.). Returns NULL on failure (e.g., if internal allocation fails or the maximum number of contexts is exceeded).

Usage Examples

Basic Context for Tensor Metadata and Data

#include "ggml.h"

// Allocate a context with 64 MB for both metadata and tensor data
struct ggml_init_params params = {
    .mem_size   = 64 * 1024 * 1024,  // 64 MB
    .mem_buffer = NULL,               // let ggml allocate internally
    .no_alloc   = false,              // tensor data stored in the arena
};

struct ggml_context * ctx = ggml_init(params);
if (ctx == NULL) {
    // initialization failed (e.g., internal allocation failure)
    return 1;
}

// Create tensors -- metadata and data are both in the arena
struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 768, 768);
struct ggml_tensor * b = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 768, 768);

// ... use tensors for computation ...

// Free the entire context (all tensor metadata and data) in one call
ggml_free(ctx);

Metadata-Only Context with Backend Allocator

#include "ggml.h"

// Small context for metadata only (tensor data managed by backend)
struct ggml_init_params params = {
    .mem_size   = 1 * 1024 * 1024,   // 1 MB for metadata
    .mem_buffer = NULL,               // let ggml allocate internally
    .no_alloc   = true,               // skip tensor data allocation
};

struct ggml_context * ctx = ggml_init(params);

// Create tensor metadata -- no data is allocated in the arena
struct ggml_tensor * weight = ggml_new_tensor_2d(ctx, GGML_TYPE_F16, 4096, 4096);

// Tensor data is allocated separately by the backend:
// ggml_backend_alloc_ctx_tensors(ctx, backend);

// ... perform computation using backend ...

ggml_free(ctx);

Using an External Memory Buffer

#include "ggml.h"
#include <stdlib.h>

// Provide a pre-allocated buffer as the arena
size_t buf_size = 16 * 1024 * 1024;  // 16 MB
void * buf = malloc(buf_size);

struct ggml_init_params params = {
    .mem_size   = buf_size,
    .mem_buffer = buf,     // use external buffer
    .no_alloc   = false,
};

struct ggml_context * ctx = ggml_init(params);
if (ctx == NULL) {
    free(buf);  // ggml_init failed; release the external buffer ourselves
    return 1;
}

// Create tensors within the externally provided buffer
struct ggml_tensor * t = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 1024);

// ... use tensor ...

// Free the context (does not free the external buffer)
ggml_free(ctx);

// Caller is responsible for freeing the external buffer
free(buf);
