
Principle:Ggml_org_Ggml_Tensor_Creation

From Leeroopedia



Summary

Creating typed, multi-dimensional tensor objects for numerical computation. Tensors serve as the fundamental data abstraction for machine learning workloads in GGML: typed arrays carrying shape, stride, and metadata through every stage of a computation graph.

Theory

Tensors generalize scalars, vectors, and matrices to n-dimensional arrays. Two key layout decisions govern how elements map to contiguous memory:

  • Row-major order (C-style) -- the last index varies fastest in memory.
  • Column-major order (Fortran-style) -- the first index varies fastest in memory.

GGML orders dimensions so that ne0 varies fastest: for a 2-D tensor with dimensions ne0 x ne1, elements along ne0 are contiguous in memory, a layout often described as a column-major convention. Strides (nb0, nb1, nb2, nb3) record the byte distance between successive elements along each axis, so views and transposes can be expressed by adjusting strides rather than copying data.

Type System for Quantized Representations

A distinguishing feature of GGML is its type system designed for quantized inference. Beyond the standard IEEE types, GGML defines block-quantized formats that pack weights into compact representations with per-block scale factors:

Category                 Example Types
IEEE floating point      GGML_TYPE_F32, GGML_TYPE_F16, GGML_TYPE_BF16
Integer                  GGML_TYPE_I8, GGML_TYPE_I16, GGML_TYPE_I32
Block-quantized (4-bit)  GGML_TYPE_Q4_0, GGML_TYPE_Q4_1, GGML_TYPE_Q4_K
Block-quantized (5-bit)  GGML_TYPE_Q5_0, GGML_TYPE_Q5_1, GGML_TYPE_Q5_K
Block-quantized (8-bit)  GGML_TYPE_Q8_0, GGML_TYPE_Q8_1, GGML_TYPE_Q8_K
K-quant mixed            GGML_TYPE_Q2_K, GGML_TYPE_Q3_K, GGML_TYPE_Q6_K

GGML supports 30+ element types in total. Each type entry in the internal type-traits table records the block size, the byte size per block, and conversion routines to/from float, so that tensor operations can be dispatched generically regardless of the underlying representation.

Core Concepts

  1. Shape -- an ordered tuple of up to 4 dimension sizes (ne[0] .. ne[3]). Unused trailing dimensions default to 1.
  2. Stride -- byte offsets (nb[0] .. nb[3]) that describe memory layout. Non-contiguous strides enable zero-copy views and transposes.
  3. Element type -- one of the enum ggml_type values. Determines byte width, quantization block size, and available kernels.
  4. Context allocation -- every tensor is allocated from a ggml_context, a bump-pointer arena that owns the tensor metadata (and optionally the data buffer).
  5. Backend data -- tensor storage can live on CPU, GPU, or other accelerator memory managed by ggml_backend.
