Implementation:LMCache LMCache Fast Serde
| Knowledge Sources | |
|---|---|
| Domains | Serialization, KV Cache |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Provides a fast, raw-bytes serialization and deserialization implementation for PyTorch tensors.
Description
FastSerializer converts a PyTorch tensor to raw bytes by making it contiguous, moving it to the CPU, reinterpreting its storage as uint8, converting the result to a NumPy array, and calling tobytes(). The output carries no header or metadata, which keeps serialization cheap but means that neither shape nor dtype is preserved. FastDeserializer reconstructs a flat 1-D tensor from the raw bytes using torch.frombuffer with the dtype supplied at construction time. The pair is therefore suitable only when the consumer already knows the expected shape and dtype.
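The steps described above can be sketched without importing LMCache. This is a minimal illustration, not the library's code: the helper names tensor_to_raw_bytes and raw_bytes_to_tensor are hypothetical, and the sketch assumes only the documented behaviour of torch.Tensor.view and torch.frombuffer.

```python
import torch


def tensor_to_raw_bytes(t: torch.Tensor) -> bytes:
    """Hypothetical helper mirroring the serialization steps described above."""
    # contiguous -> CPU -> reinterpret as uint8 -> NumPy array -> bytes
    return t.contiguous().cpu().view(torch.uint8).numpy().tobytes()


def raw_bytes_to_tensor(b: bytes, dtype: torch.dtype) -> torch.Tensor:
    """Hypothetical helper mirroring the deserialization step described above."""
    # The result is always 1-D; shape information is not stored in the bytes.
    return torch.frombuffer(b, dtype=dtype)


original = torch.arange(6, dtype=torch.float32).reshape(2, 3)
raw = tensor_to_raw_bytes(original)              # 6 elements * 4 bytes each
flat = raw_bytes_to_tensor(raw, torch.float32)   # shape information is gone
roundtrip = flat.reshape(2, 3)                   # caller supplies the shape
```

Because the byte stream holds only element data, deserializing with a different dtype does not raise an error as long as the length divides evenly; it just reinterprets the bits, so the dtype passed to the deserializer has to match the producer's.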
Usage
Use FastSerializer and FastDeserializer when maximum serialization throughput is needed and the tensor shape and dtype are known by both the producer and consumer, such as in local cache storage where metadata is tracked separately.
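One way to picture "metadata tracked separately" is a store that keeps the raw payload and a small metadata record side by side under the same key. The sketch below is an assumption for illustration only: TensorMeta, put, and get are hypothetical names, not part of LMCache, and the serialization line simply repeats the steps from the description.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

import torch


@dataclass(frozen=True)
class TensorMeta:
    """Hypothetical record holding what the raw bytes do not carry."""
    shape: Tuple[int, ...]
    dtype: torch.dtype


payloads: Dict[str, bytes] = {}        # raw tensor bytes, no header
metadata: Dict[str, TensorMeta] = {}   # shape and dtype, stored separately


def put(key: str, t: torch.Tensor) -> None:
    # Same steps as in the description: contiguous, CPU, uint8 view, tobytes.
    payloads[key] = t.contiguous().cpu().view(torch.uint8).numpy().tobytes()
    metadata[key] = TensorMeta(tuple(t.shape), t.dtype)


def get(key: str) -> torch.Tensor:
    meta = metadata[key]
    # bytearray makes a writable copy, so the returned tensor can be modified.
    flat = torch.frombuffer(bytearray(payloads[key]), dtype=meta.dtype)
    return flat.reshape(meta.shape)


kv_chunk = torch.arange(24, dtype=torch.int16).reshape(2, 3, 4)
put("chunk-0", kv_chunk)
restored_chunk = get("chunk-0")
```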
Code Reference
Source Location
- Repository: LMCache
- File: lmcache/storage_backend/serde/fast_serde.py
- Lines: 1-31
Signature
class FastSerializer(Serializer):
    def __init__(self): ...
    def to_bytes(self, t: torch.Tensor) -> bytes: ...

class FastDeserializer(Deserializer):
    def __init__(self, dtype): ...
    def from_bytes_normal(self, b: bytes) -> torch.Tensor: ...
    def from_bytes(self, b: bytes) -> torch.Tensor: ...
Import
from lmcache.storage_backend.serde.fast_serde import FastSerializer, FastDeserializer
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| t | torch.Tensor | Yes (to_bytes) | Tensor to serialize; can be on any device with any shape |
| dtype | torch.dtype | Yes (FastDeserializer constructor) | Expected dtype of the deserialized tensor |
| b | bytes | Yes (from_bytes) | Raw bytes to deserialize |
Outputs
| Name | Type | Description |
|---|---|---|
| bytes | bytes | Raw byte representation of the tensor data (from to_bytes) |
| tensor | torch.Tensor | Flat 1-D tensor reconstructed from raw bytes (from from_bytes); the caller must reshape it using a shape tracked elsewhere |
Usage Examples
import torch
from lmcache.storage_backend.serde.fast_serde import FastSerializer, FastDeserializer
serializer = FastSerializer()
deserializer = FastDeserializer(dtype=torch.float16)
# Serialize
tensor = torch.randn(32, 128, dtype=torch.float16)
raw = serializer.to_bytes(tensor)
# Deserialize (shape must be known externally)
flat_tensor = deserializer.from_bytes(raw)
restored = flat_tensor.reshape(32, 128)
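Since the byte stream has no header, a consumer may want to check the payload length against the expected shape and dtype before reshaping. The following is a sketch under that assumption; checked_from_bytes is a hypothetical helper and not part of LMCache. It also copies the input into a bytearray, because torch.frombuffer shares memory with its buffer and warns when given an immutable bytes object.

```python
import math
from typing import Tuple

import torch


def checked_from_bytes(b: bytes, dtype: torch.dtype, shape: Tuple[int, ...]) -> torch.Tensor:
    """Hypothetical helper: validate the payload size, then rebuild the tensor."""
    element_size = torch.empty(0, dtype=dtype).element_size()
    expected = math.prod(shape) * element_size
    if len(b) != expected:
        raise ValueError(
            f"expected {expected} bytes for shape {shape} and {dtype}, got {len(b)}"
        )
    # bytearray(b) is a writable copy, so the tensor does not alias immutable memory.
    return torch.frombuffer(bytearray(b), dtype=dtype).reshape(shape)


payload = bytes(range(16))                                  # 16 bytes = 8 float16 values
ok = checked_from_bytes(payload, torch.float16, (2, 4))     # sizes agree

try:
    checked_from_bytes(payload, torch.float16, (3, 4))      # would need 24 bytes
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

The extra copy costs one pass over the data; when the tensor is only read, passing the bytes object directly avoids the copy at the price of the non-writable-buffer warning.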