Implementation:LMCache LMCache Fast Serde

Knowledge Sources	LMCache
Domains	Serialization, KV Cache
Last Updated	2026-02-09 00:00 GMT

Overview

Provides a fast, raw-bytes serialization and deserialization implementation for PyTorch tensors.

Description

FastSerializer converts a PyTorch tensor to raw bytes in five steps: it makes the tensor contiguous, moves it to the CPU, views it as uint8, converts it to a NumPy array, and calls tobytes(). The output carries no header or metadata, which keeps serialization fast, but it also means that shape and dtype are not preserved. FastDeserializer reconstructs a tensor from raw bytes using torch.frombuffer with the dtype given at construction time; the result is a flat 1-D tensor that the caller must reshape. This serde pair is therefore suitable only when the consumer knows the expected shape and dtype in advance.
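The round trip described above can be sketched with plain PyTorch calls. This is a minimal standalone re-creation based on the description, not the LMCache source; the helper names `tensor_to_raw_bytes` and `raw_bytes_to_flat_tensor` are hypothetical and exist only for illustration.

```python
import torch


def tensor_to_raw_bytes(t: torch.Tensor) -> bytes:
    """Hypothetical helper mirroring the steps described for FastSerializer."""
    # contiguous(): lay the data out densely in memory
    # cpu(): bring the data to host memory (no-op if already there)
    # view(torch.uint8): reinterpret the same memory as bytes, without copying
    # numpy().tobytes(): copy those bytes into an immutable bytes object
    return t.contiguous().cpu().view(torch.uint8).numpy().tobytes()


def raw_bytes_to_flat_tensor(b: bytes, dtype: torch.dtype) -> torch.Tensor:
    """Hypothetical helper mirroring the steps described for FastDeserializer."""
    # Wrapping in bytearray gives torch.frombuffer a writable buffer, which
    # avoids the warning PyTorch emits for read-only buffers such as bytes.
    return torch.frombuffer(bytearray(b), dtype=dtype)


original = torch.arange(12, dtype=torch.float32).reshape(3, 4)
raw = tensor_to_raw_bytes(original)
flat = raw_bytes_to_flat_tensor(raw, torch.float32)

print(len(raw))    # 48 bytes: 12 elements * 4 bytes each
print(flat.shape)  # torch.Size([12]): the original shape is not recorded

# The caller restores the shape from information it already has.
restored = flat.reshape(3, 4)
```

The byte string holds element data only, so the reshape at the end relies entirely on knowledge the consumer holds outside the payload.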

Usage

Use FastSerializer and FastDeserializer when maximum serialization throughput is needed and the tensor shape and dtype are known by both the producer and consumer, such as in local cache storage where metadata is tracked separately.
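One way to picture "metadata tracked separately" is a local store that keeps the raw payload and the shape/dtype record in two different maps. The sketch below is an assumption-laden illustration: the `payloads` and `metadata` dictionaries and the `put`/`get` functions are hypothetical, not part of LMCache, and the byte conversion is written out with plain PyTorch calls rather than through the serde classes.

```python
from typing import Dict, Tuple

import torch

# Hypothetical in-memory stores: payload bytes and metadata are kept apart.
payloads: Dict[str, bytes] = {}
metadata: Dict[str, Tuple[torch.Size, torch.dtype]] = {}


def put(key: str, t: torch.Tensor) -> None:
    """Store raw bytes under `key` and record shape/dtype on the side."""
    payloads[key] = t.contiguous().cpu().view(torch.uint8).numpy().tobytes()
    metadata[key] = (t.shape, t.dtype)


def get(key: str) -> torch.Tensor:
    """Rebuild the tensor using the separately tracked shape and dtype."""
    shape, dtype = metadata[key]
    raw = payloads[key]
    # Guard against a payload that does not match the recorded metadata.
    element_size = torch.empty((), dtype=dtype).element_size()
    expected = shape.numel() * element_size
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    return torch.frombuffer(bytearray(raw), dtype=dtype).reshape(shape)


kv = torch.arange(8).reshape(2, 4).to(torch.float16)
put("layer0", kv)
out = get("layer0")
```

The size check is cheap and catches the most common mismatch (wrong dtype or shape for a given payload) before the reshape would fail or silently misinterpret the data.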

Code Reference

Source Location

Repository: LMCache
File: lmcache/storage_backend/serde/fast_serde.py
Lines: 1-31

Signature

class FastSerializer(Serializer):
    def __init__(self): ...
    def to_bytes(self, t: torch.Tensor) -> bytes: ...

class FastDeserializer(Deserializer):
    def __init__(self, dtype): ...
    def from_bytes_normal(self, b: bytes) -> torch.Tensor: ...
    def from_bytes(self, b: bytes) -> torch.Tensor: ...

Import

from lmcache.storage_backend.serde.fast_serde import FastSerializer, FastDeserializer

I/O Contract

Inputs

Name	Type	Required	Description
t	torch.Tensor	Yes (to_bytes)	Tensor to serialize; can be on any device with any shape
dtype	torch.dtype	Yes (FastDeserializer constructor)	Expected dtype of the deserialized tensor
b	bytes	Yes (from_bytes)	Raw bytes to deserialize

Outputs

Name	Type	Description
bytes	bytes	Raw byte representation of the tensor data (from to_bytes)
tensor	torch.Tensor	Flat 1-D tensor reconstructed from raw bytes (from from_bytes)
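Because the bytes carry no dtype, a mismatch between the producer's dtype and the one given to the deserializer is not detected; the bytes are simply reinterpreted. The following sketch illustrates this with plain PyTorch calls standing in for the serde classes, which is an assumption made to keep the example self-contained.

```python
import torch

t = torch.ones(4, dtype=torch.float32)

# Same conversion chain as described for to_bytes: 4 elements * 4 bytes = 16 bytes.
raw = t.contiguous().cpu().view(torch.uint8).numpy().tobytes()

# Reading with the producer's dtype recovers the data.
right = torch.frombuffer(bytearray(raw), dtype=torch.float32)

# Reading with a different dtype succeeds without error but reinterprets
# the same 16 bytes as 8 two-byte elements with unrelated values.
wrong = torch.frombuffer(bytearray(raw), dtype=torch.float16)

print(right.numel())  # 4
print(wrong.numel())  # 8
```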

Usage Examples

import torch
from lmcache.storage_backend.serde.fast_serde import FastSerializer, FastDeserializer

serializer = FastSerializer()
deserializer = FastDeserializer(dtype=torch.float16)

# Serialize
tensor = torch.randn(32, 128, dtype=torch.float16)
raw = serializer.to_bytes(tensor)

# Deserialize: the result is a flat 1-D tensor, so reshape it using the externally known shape
flat_tensor = deserializer.from_bytes(raw)
restored = flat_tensor.reshape(32, 128)
