Implementation:LMCache LMCache Fast Serde

From Leeroopedia


Knowledge Sources
Domains: Serialization, KV Cache
Last Updated: 2026-02-09 00:00 GMT

Overview

Provides a fast, raw-bytes serialization and deserialization implementation for PyTorch tensors.

Description

FastSerializer converts a PyTorch tensor to raw bytes in five steps: it makes the tensor contiguous, moves it to the CPU, views it as uint8, converts it to a NumPy array, and calls tobytes(). Skipping metadata keeps serialization fast, but the output preserves neither shape nor dtype. FastDeserializer reconstructs a tensor from raw bytes using torch.frombuffer with the dtype specified at construction time; the result is a flat 1-D tensor that the caller must reshape. This serde pair is therefore suitable only when the consumer knows the expected shape and dtype a priori.
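The steps described above can be sketched with plain PyTorch calls. This is a minimal illustration, not the LMCache source: the helper names tensor_to_raw_bytes and raw_bytes_to_tensor are hypothetical, and copying the payload into a bytearray is an assumption made here so that torch.frombuffer receives a writable buffer.

```python
import torch


def tensor_to_raw_bytes(t: torch.Tensor) -> bytes:
    # Hypothetical helper: contiguous -> CPU -> uint8 view -> NumPy -> bytes.
    # No shape or dtype information is written into the result.
    return t.contiguous().cpu().view(torch.uint8).numpy().tobytes()


def raw_bytes_to_tensor(b: bytes, dtype: torch.dtype) -> torch.Tensor:
    # Hypothetical helper: reinterpret the bytes as a flat 1-D tensor.
    # bytearray(b) makes a writable copy (an assumption of this sketch).
    return torch.frombuffer(bytearray(b), dtype=dtype)


original = torch.arange(12, dtype=torch.float32).to(torch.float16).reshape(3, 4)
raw = tensor_to_raw_bytes(original)           # 12 elements * 2 bytes each
flat = raw_bytes_to_tensor(raw, torch.float16)
restored = flat.reshape(3, 4)                 # shape must be supplied by the caller
```

The byte string holds only element data, so its length is the element count times the element size, and the reshape at the end relies entirely on knowledge held outside the payload.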

Usage

Use FastSerializer and FastDeserializer when maximum serialization throughput is needed and the tensor shape and dtype are known by both the producer and consumer, such as in local cache storage where metadata is tracked separately.
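One way to picture "metadata is tracked separately" is a store that keeps the raw payload next to a small record of shape and dtype. The sketch below is purely illustrative: RawEntry, put, get, and the in-memory dict are hypothetical names, not part of LMCache, and the raw-bytes conversion is written out with plain PyTorch calls instead of the serde classes.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

import torch


@dataclass
class RawEntry:
    # Hypothetical record: raw payload plus the metadata it does not carry.
    payload: bytes
    shape: Tuple[int, ...]
    dtype: torch.dtype


store: Dict[str, RawEntry] = {}


def put(key: str, t: torch.Tensor) -> None:
    # Serialize to raw bytes and remember shape and dtype on the side.
    payload = t.contiguous().cpu().view(torch.uint8).numpy().tobytes()
    store[key] = RawEntry(payload, tuple(t.shape), t.dtype)


def get(key: str) -> torch.Tensor:
    # Rebuild the flat tensor, then restore the recorded shape.
    entry = store[key]
    flat = torch.frombuffer(bytearray(entry.payload), dtype=entry.dtype)
    return flat.reshape(entry.shape)


kv_chunk = torch.arange(64, dtype=torch.float32).to(torch.float16).reshape(2, 4, 8)
put("chunk-0", kv_chunk)
roundtrip = get("chunk-0")
```

Keeping the metadata outside the payload means the serialized bytes stay as small as possible, at the cost of making the store responsible for keeping the two in sync.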

Code Reference

Source Location

Signature

class FastSerializer(Serializer):
    def __init__(self): ...
    def to_bytes(self, t: torch.Tensor) -> bytes: ...

class FastDeserializer(Deserializer):
    def __init__(self, dtype): ...
    def from_bytes_normal(self, b: bytes) -> torch.Tensor: ...
    def from_bytes(self, b: bytes) -> torch.Tensor: ...

Import

from lmcache.storage_backend.serde.fast_serde import FastSerializer, FastDeserializer

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| t | torch.Tensor | Yes (to_bytes) | Tensor to serialize; can be on any device with any shape |
| dtype | torch.dtype | Yes (FastDeserializer constructor) | Expected dtype of the deserialized tensor |
| b | bytes | Yes (from_bytes) | Raw bytes to deserialize |

Outputs

| Name | Type | Description |
|---|---|---|
| bytes | bytes | Raw byte representation of the tensor data (from to_bytes) |
| tensor | torch.Tensor | Flat 1-D tensor reconstructed from raw bytes (from from_bytes) |

Usage Examples

import torch
from lmcache.storage_backend.serde.fast_serde import FastSerializer, FastDeserializer

serializer = FastSerializer()
deserializer = FastDeserializer(dtype=torch.float16)

# Serialize
tensor = torch.randn(32, 128, dtype=torch.float16)
raw = serializer.to_bytes(tensor)

# Deserialize (shape must be known externally)
flat_tensor = deserializer.from_bytes(raw)
restored = flat_tensor.reshape(32, 128)
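
Because the payload carries no shape or dtype, a consumer may want to check the byte length before reshaping. The following sketch assumes a hypothetical helper, expected_nbytes, and uses plain PyTorch calls in place of the serde classes; it also shows what happens when the bytes are read back with the wrong dtype.

```python
import math

import torch


def expected_nbytes(shape, dtype: torch.dtype) -> int:
    # Hypothetical helper: element count times bytes per element.
    return math.prod(shape) * torch.empty((), dtype=dtype).element_size()


shape = (32, 128)
tensor = torch.zeros(shape, dtype=torch.float16)
raw = tensor.contiguous().cpu().view(torch.uint8).numpy().tobytes()

# The payload length matches the externally known shape and dtype.
length_ok = len(raw) == expected_nbytes(shape, torch.float16)

# Reading the same bytes as float32 silently yields half as many elements,
# so a later reshape(32, 128) would fail.
wrong = torch.frombuffer(bytearray(raw), dtype=torch.float32)
wrong_numel = wrong.numel()
```

A length check of this kind catches a mismatched shape or element size early, although it cannot detect two dtypes of the same width (for example float16 versus bfloat16).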
