
Principle:Ollama Tensor Reading

From Leeroopedia
Knowledge Sources
Domains Format_Conversion, Data_Processing
Last Updated 2026-02-14 00:00 GMT

Overview

A format-agnostic tensor reading mechanism that parses model weight tensors from SafeTensors or PyTorch formats and maps tensor names from HuggingFace conventions to GGUF conventions.

Description

Tensor Reading is the process of extracting weight tensors from serialized model files and remapping their names for the target format. HuggingFace models use naming conventions like model.layers.0.self_attn.q_proj.weight, while GGUF uses blk.0.attn_q.weight. Each model architecture has its own name mapping rules.
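The remapping above can be sketched as a table of ordered substring replacements. The rules below are illustrative, not an exhaustive table; real converters keep per-architecture mappings, and the exact rule set here is an assumption.

```python
# Hypothetical sketch of HuggingFace -> GGUF tensor-name remapping.
# Ordered (old, new) substring rules; more specific patterns first.
NAME_RULES = [
    ("model.layers.", "blk."),
    (".self_attn.q_proj", ".attn_q"),
    (".self_attn.k_proj", ".attn_k"),
    (".self_attn.v_proj", ".attn_v"),
    (".self_attn.o_proj", ".attn_output"),
    (".mlp.gate_proj", ".ffn_gate"),
    (".mlp.up_proj", ".ffn_up"),
    (".mlp.down_proj", ".ffn_down"),
    (".input_layernorm", ".attn_norm"),
    (".post_attention_layernorm", ".ffn_norm"),
]

def remap_name(hf_name: str) -> str:
    """Apply each substring replacement in order."""
    out = hf_name
    for old, new in NAME_RULES:
        out = out.replace(old, new)
    return out

print(remap_name("model.layers.0.self_attn.q_proj.weight"))
# -> blk.0.attn_q.weight
```

Ordering matters: the layer prefix is stripped first so the remaining rules only have to match the per-layer suffixes.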

The reader supports multiple source formats: SafeTensors (JSON header + raw tensor data) and PyTorch (.bin/.pth with pickle serialization). The reading is lazy — tensor data is memory-mapped and only read when the encoder writes the output file.
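For the SafeTensors path, the lazy, memory-mapped reading can be sketched as follows. This is a minimal illustration of the file layout (8-byte little-endian header length, JSON header, then raw tensor bytes), not Ollama's actual reader; the function and field names are assumptions.

```python
import json
import mmap
import struct

def open_safetensors(path):
    """Parse a SafeTensors header and return lazy per-tensor readers.

    File layout: 8-byte little-endian u64 header length, the JSON
    header itself, then the raw tensor data. The header's
    "data_offsets" are relative to the start of the data section.
    """
    f = open(path, "rb")
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    (header_len,) = struct.unpack("<Q", mm[:8])
    header = json.loads(mm[8:8 + header_len])
    data_start = 8 + header_len

    tensors = {}
    for name, meta in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        begin, end = meta["data_offsets"]
        # Capture the offsets now; the mapped bytes are only touched
        # when the returned reader is actually called.
        def reader(b=begin, e=end):
            return mm[data_start + b:data_start + e]
        tensors[name] = {"dtype": meta["dtype"],
                         "shape": meta["shape"],
                         "read": reader}
    return tensors
```

Because the file is memory-mapped, constructing the tensor table is cheap even for multi-gigabyte checkpoints; pages are faulted in only when a tensor's reader is invoked by the encoder.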

Usage

Use this principle when implementing a model format converter that must read tensors from various serialization formats and remap names according to architecture-specific rules.

Theoretical Basis

The reading process:

  1. Format Detection: Check for .safetensors files (preferred) or .bin/.pth files (PyTorch fallback).
  2. Header Parsing: For SafeTensors, read the JSON header that contains tensor metadata (name, dtype, shape, offset).
  3. Name Remapping: Apply architecture-specific string replacements to convert HuggingFace names to GGUF names.
  4. Lazy Loading: Return tensor objects with data readers that will read from the memory-mapped file on demand.
  5. Type Conversion: Handle data type mappings (e.g., bfloat16 → float16 for GGUF compatibility).
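Step 5 can be sketched with the standard bit-level trick for bfloat16: since bfloat16 is the top 16 bits of an IEEE float32, widening is a 16-bit left shift, after which NumPy narrows float32 to float16. This is a generic sketch, not the converter's actual code.

```python
import numpy as np

def bf16_to_f16(raw: bytes) -> np.ndarray:
    """Convert raw bfloat16 bytes to a float16 array.

    bfloat16 shares float32's sign and exponent bits, so shifting each
    16-bit value into the high half of a uint32 yields a valid float32
    bit pattern. Values beyond float16 range overflow to inf, which is
    rarely an issue for trained weights.
    """
    as_u16 = np.frombuffer(raw, dtype=np.uint16)
    as_f32 = (as_u16.astype(np.uint32) << 16).view(np.float32)
    return as_f32.astype(np.float16)
```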

Related Pages

Implemented By
