Principle: Ollama Tensor Reading
| Knowledge Sources | |
|---|---|
| Domains | Format_Conversion, Data_Processing |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
A format-agnostic tensor reading mechanism that parses model weight tensors from SafeTensors or PyTorch formats and maps tensor names from HuggingFace conventions to GGUF conventions.
Description
Tensor Reading is the process of extracting weight tensors from serialized model files and remapping their names for the target format. HuggingFace models use naming conventions like `model.layers.0.self_attn.q_proj.weight`, while GGUF uses `blk.0.attn_q.weight`. Each model architecture has its own name mapping rules.
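The name remapping can be sketched as a table of ordered substring replacements. The mapping below is a hypothetical subset for a llama-style architecture, for illustration only; the real table is architecture-specific and larger.

```python
# Hypothetical subset of HuggingFace -> GGUF name replacements for a
# llama-style model; the real, architecture-specific table is larger.
HF_TO_GGUF = [
    ("model.layers.", "blk."),
    (".self_attn.q_proj", ".attn_q"),
    (".self_attn.k_proj", ".attn_k"),
    (".self_attn.v_proj", ".attn_v"),
    (".self_attn.o_proj", ".attn_output"),
    (".mlp.gate_proj", ".ffn_gate"),
    (".mlp.up_proj", ".ffn_up"),
    (".mlp.down_proj", ".ffn_down"),
    ("model.embed_tokens", "token_embd"),
]

def remap_name(hf_name: str) -> str:
    """Apply each replacement in order to convert an HF tensor name."""
    for old, new in HF_TO_GGUF:
        hf_name = hf_name.replace(old, new)
    return hf_name

# remap_name("model.layers.0.self_attn.q_proj.weight")
#   -> "blk.0.attn_q.weight"
```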
The reader supports multiple source formats: SafeTensors (JSON header + raw tensor data) and PyTorch (.bin/.pth with pickle serialization). The reading is lazy — tensor data is memory-mapped and only read when the encoder writes the output file.
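For the SafeTensors case, the header layout is an 8-byte little-endian length followed by a JSON object mapping tensor names to their dtype, shape, and byte offsets. A minimal sketch of parsing it from a buffer (in practice a memory-mapped file), assuming the standard SafeTensors layout:

```python
import json
import struct

def parse_safetensors_header(buf: bytes):
    """Parse a SafeTensors header from a buffer (in practice, an
    mmap'd file). Layout: 8-byte little-endian header length, then a
    JSON object mapping each tensor name to its dtype, shape, and
    data_offsets (byte range relative to the start of the data section).
    """
    (header_len,) = struct.unpack_from("<Q", buf, 0)
    header = json.loads(buf[8:8 + header_len].decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    data_start = 8 + header_len       # tensor bytes begin here
    return header, data_start
```

Each entry's `data_offsets` plus `data_start` gives the absolute byte range to read lazily later.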
Usage
Use this principle when implementing a model format converter that must read tensors from various serialization formats and remap names according to architecture-specific rules.
Theoretical Basis
The reading process:
- Format Detection: Check for .safetensors files (preferred) or .bin/.pth files (PyTorch fallback).
- Header Parsing: For SafeTensors, read the JSON header that contains tensor metadata (name, dtype, shape, offset).
- Name Remapping: Apply architecture-specific string replacements to convert HuggingFace names to GGUF names.
- Lazy Loading: Return tensor objects with data readers that will read from the memory-mapped file on demand.
- Type Conversion: Handle data type mappings (e.g., bfloat16 → float16 for GGUF compatibility).
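The lazy-loading and type-conversion steps above can be sketched as a tensor object that holds only metadata until its data is requested. The class and field names here are illustrative, not Ollama's actual types; the bfloat16 handling exploits the fact that bfloat16 is the high half of a float32.

```python
import numpy as np

class LazyTensor:
    """Holds tensor metadata now; reads bytes from the mapped file only
    when data() is called. Names and fields are illustrative sketches,
    not Ollama's actual types. `mm` may be an mmap.mmap or bytes."""

    def __init__(self, mm, name, dtype, shape, start, end):
        self.mm, self.name, self.dtype = mm, name, dtype
        self.shape, self.start, self.end = shape, start, end

    def data(self) -> np.ndarray:
        raw = self.mm[self.start:self.end]  # bytes are read only here
        if self.dtype == "BF16":
            # bfloat16 -> float16: widen each element to float32 by
            # shifting its 16 bits into the high half of a uint32
            # (bfloat16 is the top half of a float32), then narrow to
            # float16 for GGUF compatibility.
            u16 = np.frombuffer(raw, dtype=np.uint16)
            u32 = u16.astype(np.uint32) << 16
            f16 = u32.view(np.float32).astype(np.float16)
            return f16.reshape(self.shape)
        return np.frombuffer(raw, dtype=np.float16).reshape(self.shape)
```

Note the round trip through float32 can lose precision or overflow for values outside float16's range; a production converter would decide how to clamp or upcast such tensors.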