Principle:LaurentMazare Tch rs Tensor Format Conversion

From Leeroopedia


Knowledge Sources
Domains: Data Serialization, Deep Learning
Last Updated: 2026-02-08 00:00 GMT

Overview

Tensor format conversion enables interoperability between deep learning frameworks by translating tensor data between different serialization formats with varying tradeoffs in safety, performance, and compatibility.

Description

Neural network models and their associated tensor data are stored in various serialization formats, each with distinct characteristics. Conversion between these formats is essential for transferring models and data across frameworks, languages, and deployment environments. The major formats include:

  • NumPy formats (.npy, .npz): The npy format stores a single array with a small header describing shape, dtype, and byte order. The npz format is a ZIP archive of multiple npy files, enabling storage of multiple named arrays. These formats are simple, widely supported, and serve as a lingua franca for numerical data exchange. However, only a standalone .npy file can be memory-mapped (for example through the mmap_mode argument of NumPy's np.load); arrays inside a .npz archive are read fully into memory, and the npy format itself carries no integrity checks.
  • Safetensors (.safetensors): A format designed for security and speed. Unlike pickle-based formats, safetensors does not allow arbitrary code execution during loading, eliminating a class of deserialization attacks. It supports memory-mapped loading, allowing lazy access to individual tensors without loading the entire file. The format stores a JSON header with tensor metadata followed by raw tensor data.
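  • libtorch archive (.ot): The extension tch-rs conventionally uses for weights written by its own save routines. The files use the same serialization as the PyTorch C++ API (libtorch), so they can be exchanged between Rust and C++ programs without a Python runtime. Despite the similar-sounding name, this is not the ONNX model format (.onnx).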
  • PyTorch format (.pt): Uses Python's pickle protocol to serialize tensor data along with metadata. Supports complex nested structures (dictionaries, lists of tensors) but carries security risks because pickle can execute arbitrary code during deserialization. Loading untrusted .pt files is dangerous.
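To make the safetensors layout described above concrete, here is a minimal sketch that builds and parses such a file by hand using only the Python standard library. It assumes the published layout: an 8-byte little-endian header length, a JSON header whose entries hold `dtype`, `shape`, and `data_offsets` (relative to the start of the data section), then the raw bytes. The helper names `build_safetensors` and `read_header` are hypothetical, and the real library also validates and may pad the header, which this sketch omits.

```python
import json
import struct


def build_safetensors(tensors):
    """tensors: dict of name -> (dtype_str, shape, raw_bytes). Returns file bytes."""
    header = {}
    offset = 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": list(shape),
            # Offsets are relative to the first byte after the JSON header.
            "data_offsets": [offset, offset + len(raw)],
        }
        offset += len(raw)
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    payload = b"".join(raw for _, _, raw in tensors.values())
    # 8-byte little-endian header length, then the JSON header, then raw data.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def read_header(blob):
    """Return (parsed JSON header, byte position where tensor data starts)."""
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    return json.loads(blob[8:8 + header_len]), 8 + header_len


weights = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)  # a 2x2 float32 tensor
blob = build_safetensors({"layer.weight": ("F32", (2, 2), weights)})
header, data_start = read_header(blob)
```

Because the header is plain JSON and the payload is raw bytes, loading never needs to evaluate code, which is the property the safetensors bullet contrasts with pickle.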

Usage

Format conversion is needed when migrating models between training frameworks, deploying models to production inference systems, sharing pre-trained weights across research groups, converting between safe and legacy formats, or enabling cross-language model access.

Theoretical Basis

Serialization Components:

A tensor serialization format must encode:

  1. Shape: The dimensions $(d_1, d_2, \ldots, d_n)$
  2. Data type: Element type (float32, float16, int64, etc.)
  3. Data: The raw element values in a defined byte order
  4. Name/key: An identifier for the tensor within a collection
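The four components listed above can be modelled as a small record type. The sketch below is illustrative only: `TensorRecord` and the `ITEMSIZE` table are hypothetical names, not part of any library. It shows the consistency check that ties the components together, namely that the byte length of the data must equal the product of the dimensions times the element size.

```python
from dataclasses import dataclass
from math import prod

# Bytes per element for a few common dtypes (illustrative subset).
ITEMSIZE = {"float16": 2, "float32": 4, "float64": 8, "int64": 8}


@dataclass(frozen=True)
class TensorRecord:
    name: str     # identifier within a collection
    dtype: str    # element type
    shape: tuple  # dimensions (d_1, ..., d_n); () denotes a scalar
    data: bytes   # raw element values in a defined byte order

    def __post_init__(self):
        expected = prod(self.shape) * ITEMSIZE[self.dtype]
        if len(self.data) != expected:
            raise ValueError(
                f"{self.name}: expected {expected} bytes, got {len(self.data)}"
            )


record = TensorRecord("bias", "float32", (3,), bytes(12))
```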

Format Comparison:

Property         .npy/.npz    .safetensors   .pt                .ot
Safety           Safe         Safe           Unsafe (pickle)    Safe
Memory mapping   .npy only    Yes            No                 Depends
Multi-tensor     .npz only    Yes            Yes                Yes
Metadata         Minimal      JSON header    Python objects     Protocol Buffers
Cross-language   Good         Excellent      Python-centric     Good
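The "Unsafe (pickle)" entry in the Safety row follows from how pickle works: a pickled object can name any callable to be invoked at load time. The harmless sketch below demonstrates the mechanism with the built-in `eval`. The class name `Innocent` is hypothetical, and the payload is a trivial arithmetic expression standing in for arbitrary code.

```python
import pickle


class Innocent:
    """Looks like data, but tells pickle to call eval() when it is loaded."""

    def __reduce__(self):
        # (callable, args): pickle stores this pair and calls it during loads().
        return (eval, ("6 * 7",))


payload = pickle.dumps(Innocent())

# Loading does not rebuild an Innocent instance; it runs eval("6 * 7").
loaded = pickle.loads(payload)
```

Here `loaded` is the result of the evaluated expression rather than an object of the pickled class. Formats that store only a declarative header plus raw bytes have no equivalent hook.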

Memory-Mapped Loading:

Memory-mapped formats allow accessing tensor data without loading the entire file:

LOAD_TENSOR(file, tensor_name):
    header := read_header(file)
    offset, size := header.get_tensor_location(tensor_name)
    return memory_map(file, offset, size)

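Because only the header is parsed and only the requested byte range is mapped, the cost of locating a tensor depends on the header size rather than on the total size of the tensor data. The operating system then pages the bytes in on first access.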
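The LOAD_TENSOR pseudocode can be sketched with Python's standard `mmap` module. This is a minimal illustration, assuming a safetensors-style layout (8-byte little-endian header length, JSON header, raw data); `write_demo_file` and `load_tensor` are hypothetical helpers written for this example.

```python
import json
import mmap
import os
import struct
import tempfile


def write_demo_file(path):
    """Write two small tensors in a safetensors-style layout."""
    a = struct.pack("<3f", 1.0, 2.0, 3.0)
    b = struct.pack("<2q", 7, 9)
    header = {
        "a": {"dtype": "F32", "shape": [3], "data_offsets": [0, len(a)]},
        "b": {"dtype": "I64", "shape": [2],
              "data_offsets": [len(a), len(a) + len(b)]},
    }
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)) + header_bytes + a + b)


def load_tensor(path, name):
    """Map the file and return (metadata, zero-copy view, mmap object)."""
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    (header_len,) = struct.unpack_from("<Q", mm, 0)
    header = json.loads(mm[8:8 + header_len])
    start, end = header[name]["data_offsets"]
    data_start = 8 + header_len
    # Slicing a memoryview does not copy; pages are read on first access.
    view = memoryview(mm)[data_start + start:data_start + end]
    return header[name], view, mm


path = os.path.join(tempfile.mkdtemp(), "demo.safetensors")
write_demo_file(path)
meta, view, mm = load_tensor(path, "b")
values = struct.unpack("<2q", view)
view.release()  # views must be released before the mapping can be closed
mm.close()
```

Only tensor "b" is touched here; the bytes of tensor "a" are never read by the program.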

Conversion Pipeline:

Conversion between formats follows a common pattern:

CONVERT(source_file, source_format, target_file, target_format):
    tensors := {}
    for name, metadata in source_format.read_header(source_file):
        data := source_format.read_tensor(source_file, name)
        tensors[name] := (metadata, data)
    target_format.write(target_file, tensors)
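As one concrete instance of this pipeline, the sketch below converts in-memory .npy blobs into a safetensors-style blob using only the standard library. It assumes version 1.0 or 2.0 npy headers, little-endian C-ordered arrays, and a small illustrative dtype table. The names `write_npy`, `read_npy`, `npy_to_safetensors`, and `NPY_TO_ST` are hypothetical. In practice NumPy and the safetensors library would do this work.

```python
import ast
import json
import struct

# npy dtype descriptor -> safetensors dtype name (illustrative subset).
NPY_TO_ST = {"<f2": "F16", "<f4": "F32", "<f8": "F64", "<i8": "I64"}


def write_npy(shape, descr, raw):
    """Build a version 1.0 npy blob (used here only to create test input)."""
    header = repr({"descr": descr, "fortran_order": False, "shape": tuple(shape)})
    pad = (-(10 + len(header) + 1)) % 64  # align the data start to 64 bytes
    header = header + " " * pad + "\n"
    return (b"\x93NUMPY\x01\x00" + struct.pack("<H", len(header))
            + header.encode("latin1") + raw)


def read_npy(blob):
    """Return (descr, shape, raw_bytes) from an npy blob."""
    if blob[:6] != b"\x93NUMPY":
        raise ValueError("not an npy file")
    if blob[6] == 1:  # major version 1: 2-byte header length
        (header_len,) = struct.unpack_from("<H", blob, 8)
        start = 10
    else:             # major version 2 and later: 4-byte header length
        (header_len,) = struct.unpack_from("<I", blob, 8)
        start = 12
    meta = ast.literal_eval(blob[start:start + header_len].decode("latin1"))
    if meta["fortran_order"]:
        raise ValueError("Fortran-ordered arrays need reordering first")
    return meta["descr"], meta["shape"], blob[start + header_len:]


def npy_to_safetensors(named_npy):
    """named_npy: dict of name -> npy blob. Returns a safetensors-style blob."""
    header, chunks, offset = {}, [], 0
    for name, blob in named_npy.items():
        descr, shape, raw = read_npy(blob)
        header[name] = {"dtype": NPY_TO_ST[descr], "shape": list(shape),
                        "data_offsets": [offset, offset + len(raw)]}
        chunks.append(raw)
        offset += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + b"".join(chunks)


raw = struct.pack("<6f", *[float(i) for i in range(6)])
converted = npy_to_safetensors({"embedding": write_npy((2, 3), "<f4", raw)})
(n,) = struct.unpack_from("<Q", converted, 0)
converted_header = json.loads(converted[8:8 + n])
```

The raw bytes pass through unchanged; only the metadata is re-encoded, which is why conversions between formats with matching dtypes are lossless.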

Data Type Mapping:

Different formats may support different sets of data types. Conversion may require type promotion or casting:

$\mathrm{float16} \rightarrow \mathrm{float32} \rightarrow \mathrm{float64}$

When the target format does not support the source dtype, lossless promotion (e.g., float16 to float32) is preferred over lossy truncation.
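The difference between lossless promotion and lossy truncation can be checked directly with the standard `struct` module, which supports IEEE half precision through the `e` format code. The helper names below are hypothetical, and the sketch assumes little-endian data.

```python
import struct


def promote_f16_to_f32(raw_f16):
    """Widen little-endian float16 bytes to float32 bytes (always exact)."""
    count = len(raw_f16) // 2
    values = struct.unpack(f"<{count}e", raw_f16)
    return struct.pack(f"<{count}f", *values)


def demote_f32_to_f16(raw_f32):
    """Narrow little-endian float32 bytes to float16 bytes (may round)."""
    count = len(raw_f32) // 4
    values = struct.unpack(f"<{count}f", raw_f32)
    return struct.pack(f"<{count}e", *values)


# Promotion followed by demotion returns the original float16 bytes.
original = struct.pack("<3e", 0.5, 1.5, -2.25)
promoted = promote_f16_to_f32(original)
round_trip = demote_f32_to_f16(promoted)

# Demotion followed by promotion does not restore a float32 value
# that float16 cannot represent exactly, such as 0.1.
precise = struct.pack("<f", 0.1)
lossy = promote_f16_to_f32(demote_f32_to_f16(precise))
```

Every float16 value is exactly representable in float32, so the first round trip is bit-identical, while the second loses mantissa bits.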
