Implementation:Ggml org Ggml Python utils

From Leeroopedia


Implementation Metadata
File Name: examples/python/ggml/utils.py
Repository: ggml-org/ggml
Lines: 182
Language: Python
Domain Tags: Python_Bindings, Tensor_Interop, Quantization
Status: Active
Last Updated: 2025-05-15 12:00 GMT
Knowledge Sources: ggml-org/ggml repository

Overview

examples/python/ggml/utils.py is a utility module that provides high-level helpers for moving data between GGML tensors and numpy arrays, including automatic (de/re)quantization. It is the main usability layer of the Python bindings: quantized tensors can be read and written like ordinary numpy arrays, while the module handles GGML's quantization formats transparently.

Description

The module provides three main functions:

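  • init(mem_size) -- Creates a GGML context that is freed automatically when it is garbage-collected, by wrapping the context in ffi.gc(lib.ggml_init(params), lib.ggml_free)
  • copy(from_tensor, to_tensor) -- Copies between numpy arrays and GGML tensors (including quantized ones) by detecting each side's type and choosing the appropriate dequantize/requantize path. Both sides must have the same shape. The allow_requantize flag (default True) controls whether copying between two different quantization types is permitted
  • numpy(tensor) -- Returns a numpy view over the GGML tensor data (zero-copy for unquantized types such as F32 and I32) or a dequantized copy for quantized types. A copy must be requested explicitly through allow_copy, which accepts either True or a pre-allocated numpy array that receives the data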
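The view-versus-copy distinction described for numpy() can be pictured with plain NumPy. The sketch below is not the module's code: a bytearray stands in for a tensor's data buffer (an assumption for illustration), and np.frombuffer plays the role of the zero-copy path.

```python
import numpy as np

# Assumption: `raw` stands in for the memory behind a 4-element F32 tensor.
raw = bytearray(np.arange(4, dtype=np.float32).tobytes())

# Zero-copy path: the array aliases `raw`, so writes reach the buffer.
view = np.frombuffer(raw, dtype=np.float32)
view[0] = 42.0
reread = np.frombuffer(raw, dtype=np.float32)

# Copy path: the array owns its data, so writes stay local to it.
detached = view.copy()
detached[1] = -1.0

print(reread[0], view[1], np.shares_memory(view, detached))
```

A view is cheap but ties the array's lifetime and contents to the underlying buffer; a copy is independent, which is the only option when the stored bytes are not in a dtype NumPy can interpret directly.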

The TensorLike type alias (Union[ffi.CData, np.ndarray]) enables unified handling of both numpy and GGML tensors. Internal helpers manage type detection, shape validation, quantization block size alignment, and data pointer access.
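To see why quantized data needs a dequantized copy and why block-size alignment matters, here is a minimal NumPy sketch of block-wise 8-bit quantization. It assumes a simplified layout of 32 values per block with one float32 scale per block; GGML's real formats pack scales and values differently, and the helper names are hypothetical.

```python
import numpy as np

BLOCK_SIZE = 32  # assumption: elements per quantization block


def quantize_blocks(x: np.ndarray):
    """Quantize a 1-D float array to int8 values plus one scale per block."""
    if x.size % BLOCK_SIZE != 0:
        raise ValueError(f"length {x.size} is not a multiple of {BLOCK_SIZE}")
    blocks = x.astype(np.float32).reshape(-1, BLOCK_SIZE)
    # One scale per block, chosen so the largest magnitude maps to 127.
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 127.0
    # Avoid dividing by zero for all-zero blocks.
    safe = np.where(scales == 0.0, 1.0, scales)
    q = np.round(blocks / safe).astype(np.int8)
    return q, scales.astype(np.float32)


def dequantize_blocks(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Rebuild approximate float32 values from int8 values and block scales."""
    return (q.astype(np.float32) * scales).reshape(-1)


x = np.linspace(-1.0, 1.0, 64, dtype=np.float32)
q, scales = quantize_blocks(x)
x_hat = dequantize_blocks(q, scales)
max_err = float(np.abs(x - x_hat).max())
print(q.shape, scales.shape, max_err)
```

The stored int8 values are meaningless without their block scales, so no NumPy dtype can present them as floats in place. The round trip is also lossy, which is why requantizing between formats is something a caller may want to opt out of.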

Usage

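from ggml.utils import init, copy, numpy

# Create a GGML context with 16 MB of context memory
ctx = init(16 * 1024 * 1024)

# quantized_tensor, numpy_array and ggml_tensor below are placeholders
# for tensors and arrays created elsewhere.

# Dequantize a quantized GGML tensor into a new numpy array
arr = numpy(quantized_tensor, allow_copy=True)

# Copy from a numpy array into a GGML tensor of the same shape
copy(numpy_array, ggml_tensor)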
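The automatic cleanup that init provides through ffi.gc can be pictured with a standard-library analogue. The sketch below uses weakref.finalize rather than cffi, and the Context class and init_sketch function are hypothetical stand-ins, not part of the bindings.

```python
import weakref

freed = []  # records which "contexts" have been released


class Context:
    """Hypothetical stand-in for a native context handle."""

    def __init__(self, mem_size: int):
        self.mem_size = mem_size


def init_sketch(mem_size: int) -> Context:
    """Create a context whose cleanup runs when it is garbage-collected."""
    ctx = Context(mem_size)
    # The callback must not reference ctx itself, or ctx would never be freed.
    weakref.finalize(ctx, freed.append, mem_size)
    return ctx


ctx = init_sketch(1024)
del ctx  # dropping the last reference triggers the finalizer in CPython
print(freed)
```

The practical consequence is the same as with the real context: keep a reference to it for as long as any tensor allocated from it is in use.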

Code Reference

Source Location

Repository: ggml-org/ggml
File: examples/python/ggml/utils.py
Lines: 182

Key Signatures

def init(mem_size: int, mem_buffer: ffi.CData = ffi.NULL, no_alloc: bool = False) -> ffi.CData:
    """Initialize a ggml context with automatic cleanup."""

TensorLike = Union[ffi.CData, np.ndarray]

def copy(from_tensor: TensorLike, to_tensor: TensorLike, allow_requantize: bool = True):
    """Copy between ggml and numpy tensors with transparent (de/re)quantization."""

def numpy(tensor: ffi.CData, allow_copy: Union[bool, np.ndarray] = False,
          allow_requantize=False) -> np.ndarray:
    """Convert a ggml tensor to a numpy array (view for unquantized, copy for quantized)."""

I/O Contract

Inputs

  • GGML tensors -- ffi.CData pointers to GGML tensor structs (possibly quantized)
  • numpy arrays -- Standard numpy ndarrays (float32, int32, etc.)
  • Configuration flags -- allow_copy, allow_requantize

Outputs

  • numpy arrays -- Views or copies of tensor data as numpy arrays
  • Errors -- ValueError when numpy() is called on a quantized tensor without allow_copy; AssertionError when copy() is given tensors whose shapes differ
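The contract above can be mirrored in a small mock. This is a sketch under stated assumptions, not the module's implementation: to_numpy_sketch and its quantized flag are hypothetical, and a plain element-wise assignment stands in for real dequantization.

```python
from typing import Union

import numpy as np


def to_numpy_sketch(data: np.ndarray, quantized: bool,
                    allow_copy: Union[bool, np.ndarray] = False) -> np.ndarray:
    """Hypothetical stand-in mirroring the documented view/copy contract."""
    if not quantized:
        return data  # unquantized: hand back the data itself, no copy
    if allow_copy is False:
        raise ValueError("quantized data needs a copy; pass allow_copy=True")
    if isinstance(allow_copy, np.ndarray):
        out = allow_copy  # caller-provided destination array
    else:
        out = np.empty(data.shape, dtype=np.float32)
    assert out.shape == data.shape, "destination shape must match"
    out[...] = data  # placeholder for real dequantization
    return out


packed = np.arange(6, dtype=np.int8).reshape(2, 3)
fresh = to_numpy_sketch(packed, quantized=True, allow_copy=True)
dest = np.empty((2, 3), dtype=np.float32)
reused = to_numpy_sketch(packed, quantized=True, allow_copy=dest)
print(fresh.dtype, reused is dest)
```

Accepting either a bool or an array in one parameter lets callers opt in to a copy cheaply, or reuse a buffer across many tensors to avoid repeated allocation.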

Usage Examples

Working with quantized tensors:

import numpy as np
from ggml.utils import numpy, copy

# f32_tensor, q4_0_tensor, q8_0_tensor and shape are placeholders
# for GGML tensors (and their shape) created elsewhere.

# Get a numpy view of an F32 tensor (zero-copy)
arr = numpy(f32_tensor)  # Returns a view; changes affect the tensor

# Dequantize a Q4_0 tensor to float32
arr = numpy(q4_0_tensor, allow_copy=True)  # Returns a copy

# Dequantize into a pre-allocated array instead of allocating a new one
output = np.empty(shape, dtype=np.float32)
arr = numpy(q4_0_tensor, allow_copy=output)

# Copy between different quantization types
copy(q4_0_tensor, q8_0_tensor, allow_requantize=True)
