Implementation:LMCache LMCache V1 Protocol
| Knowledge Sources | |
|---|---|
| Domains | Network Protocol, Serialization |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
This module defines the wire protocol for TCP-based communication between LMCache clients and the standalone remote server, including command types, dtype mappings, and serializable message structures.
Description
The protocol is built around two message types: ClientMetaMessage (requests from workers) and ServerMetaMessage (responses from the server). Both are serialized with struct.pack and parsed with struct.unpack using fixed-size formats, so every header has a known byte length. ClientCommand enumerates the supported operations (PUT, GET, EXIST, LIST, HEALTH), and ServerReturnCode reports success (200) or failure (400). The module also defines RemoteMetadata, which serializes KV cache metadata for multiple layer groups; its format string is configured from the number of layer groups. Bidirectional mappings between PyTorch dtypes and integers (DTYPE_TO_INT/INT_TO_DTYPE) and between storage locations and integers (LOCATION_TO_INT/INT_TO_LOCATION) enable compact binary encoding. Key strings are padded to a fixed length of MAX_KEY_LENGTH (150 bytes).
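The fixed-size encoding and key padding described above can be illustrated with a minimal sketch. The format string `HEADER_FMT`, the field order, the null-byte padding, and the helper names `pack_header` and `unpack_header` are assumptions made for illustration; the actual layout in lmcache/v1/protocol.py may differ.

```python
import struct
from enum import IntEnum, auto

MAX_KEY_LENGTH = 150  # key field width in bytes, as stated in the description


class ClientCommand(IntEnum):
    PUT = auto()
    GET = auto()
    EXIST = auto()
    LIST = auto()
    HEALTH = auto()


# Hypothetical layout: 4-byte command, 8-byte payload length, fixed-width key.
# "!" selects network byte order with no alignment padding.
HEADER_FMT = f"!iq{MAX_KEY_LENGTH}s"


def pack_header(command: ClientCommand, length: int, key: str) -> bytes:
    """Encode a request header into a fixed number of bytes."""
    raw_key = key.encode("utf-8")
    if len(raw_key) > MAX_KEY_LENGTH:
        raise ValueError(f"key exceeds {MAX_KEY_LENGTH} bytes")
    # struct pads a short "s" field with null bytes up to its declared width.
    return struct.pack(HEADER_FMT, int(command), length, raw_key)


def unpack_header(buf: bytes) -> tuple[ClientCommand, int, str]:
    """Decode a header produced by pack_header, stripping the key padding."""
    command, length, raw_key = struct.unpack(HEADER_FMT, buf)
    return ClientCommand(command), length, raw_key.rstrip(b"\x00").decode("utf-8")


header = pack_header(ClientCommand.PUT, 4096, "model@layer0@chunk42")
print(len(header), unpack_header(header))
```

Because the key field always occupies MAX_KEY_LENGTH bytes, the header size does not depend on the key, which is what lets a receiver read a known number of bytes before parsing.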
Usage
Use this module when implementing or extending the TCP-based remote storage backend for LMCache. Both the standalone server and the remote client use these message types for all communication.
Code Reference
Source Location
- Repository: LMCache
- File: lmcache/v1/protocol.py
- Lines: 1-257
Signature
class ClientCommand(IntEnum):
    PUT = auto()
    GET = auto()
    EXIST = auto()
    LIST = auto()
    HEALTH = auto()

class ServerReturnCode(IntEnum):
    SUCCESS = 200
    FAIL = 400

@dataclass
class RemoteMetadata:
    length: int
    shapes: list[torch.Size]
    dtypes: list[torch.dtype]
    fmt: MemoryFormat

    def serialize(self) -> bytes: ...
    def serialize_into(self, buffer): ...
    @staticmethod
    def deserialize(s: bytes) -> "RemoteMetadata": ...

@dataclass
class ClientMetaMessage:
    command: ClientCommand
    key: Union[CacheEngineKey, LayerCacheEngineKey]
    length: int
    fmt: MemoryFormat
    dtype: Optional[torch.dtype]
    shape: torch.Size
    location: Optional[str] = None

    def serialize(self) -> bytes: ...
    @staticmethod
    def deserialize(s: bytes) -> "ClientMetaMessage": ...
    @staticmethod
    def packlength() -> int: ...

@dataclass
class ServerMetaMessage:
    code: ServerReturnCode
    length: int
    fmt: MemoryFormat
    dtype: Optional[torch.dtype]
    shape: torch.Size
    location: Optional[str] = None

    def serialize(self) -> bytes: ...
    @staticmethod
    def deserialize(s: bytes) -> "ServerMetaMessage": ...
    @staticmethod
    def packlength() -> int: ...

def init_remote_metadata_info(num_groups: int): ...
def get_remote_metadata_bytes() -> int: ...
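The pairing of init_remote_metadata_info and get_remote_metadata_bytes suggests a format string that is built once from the group count and then queried for its size. The sketch below shows one way that could work. The per-group layout (four shape dimensions plus one dtype code), the two leading fields, and all function names here are hypothetical and are not taken from the LMCache source.

```python
import struct

_metadata_fmt = None  # set by init_metadata_format (hypothetical module state)


def init_metadata_format(num_groups: int) -> None:
    """Build the struct format once the number of layer groups is known."""
    global _metadata_fmt
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    # Assumed layout: payload length, memory-format code,
    # then 4 shape dims + 1 dtype code for each group.
    _metadata_fmt = "!" + "i" * (2 + num_groups * 5)


def metadata_bytes() -> int:
    """Return the fixed serialized size implied by the current format."""
    if _metadata_fmt is None:
        raise RuntimeError("call init_metadata_format first")
    return struct.calcsize(_metadata_fmt)


def pack_metadata(length, fmt_code, shapes, dtype_codes) -> bytes:
    """Flatten the per-group fields and pack them with the configured format."""
    fields = [length, fmt_code]
    for shape, dtype_code in zip(shapes, dtype_codes):
        fields.extend(shape)
        fields.append(dtype_code)
    return struct.pack(_metadata_fmt, *fields)


init_metadata_format(num_groups=2)
blob = pack_metadata(
    length=8192,
    fmt_code=1,
    shapes=[(2, 16, 256, 128), (2, 16, 256, 64)],
    dtype_codes=[1, 1],
)
print(metadata_bytes(), len(blob))
```

In this sketch the serialized size grows linearly with the number of groups but stays fixed once configured, so both endpoints have to be initialized with the same group count.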
Import
from lmcache.v1.protocol import (
ClientCommand,
ServerReturnCode,
ClientMetaMessage,
ServerMetaMessage,
RemoteMetadata,
init_remote_metadata_info,
get_remote_metadata_bytes,
DTYPE_TO_INT,
INT_TO_DTYPE,
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| command | ClientCommand | Yes | The operation to perform (PUT, GET, EXIST, LIST, HEALTH) |
| key | Union[CacheEngineKey, LayerCacheEngineKey] | Yes | Cache key identifying the target entry |
| length | int | Yes | Length of the data payload in bytes |
| fmt | MemoryFormat | Yes | Memory format of the cached data |
| dtype | Optional[torch.dtype] | Yes | Data type of the tensor (mapped to int for serialization) |
| shape | torch.Size | Yes | 4-dimensional shape of the cached tensor |
| location | Optional[str] | No | Storage backend location identifier |
| num_groups | int | Yes | Number of KV layer groups (for init_remote_metadata_info) |
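Since dtype is Optional and is mapped to an integer for serialization, a bidirectional table needs a code for the "no dtype" case as well. The following sketch uses plain strings as stand-ins for torch dtypes so that it runs without PyTorch. The specific integer codes and the choice of 0 for None are assumptions of this sketch, not the values used by LMCache.

```python
# Stand-ins for torch.float16, torch.bfloat16, and so on (illustrative only).
DTYPE_NAMES = ["float16", "bfloat16", "float32", "int8"]

# Forward table: dtype -> compact integer code. Code 0 is reserved for None here.
DTYPE_TO_INT = {None: 0}
DTYPE_TO_INT.update({name: code for code, name in enumerate(DTYPE_NAMES, start=1)})

# Reverse table derived from the forward one, so the two cannot drift apart.
INT_TO_DTYPE = {code: dtype for dtype, code in DTYPE_TO_INT.items()}


def encode_dtype(dtype):
    """Return the integer code for a dtype, failing loudly on unknown dtypes."""
    try:
        return DTYPE_TO_INT[dtype]
    except KeyError:
        raise ValueError(f"unsupported dtype: {dtype!r}") from None


def decode_dtype(code):
    """Return the dtype for an integer code received from the wire."""
    try:
        return INT_TO_DTYPE[code]
    except KeyError:
        raise ValueError(f"unknown dtype code: {code}") from None


print(encode_dtype("float16"), decode_dtype(encode_dtype(None)))
```

Deriving the reverse table from the forward one keeps the mapping a true bijection, which is the property the wire format relies on.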
Outputs
| Name | Type | Description |
|---|---|---|
| serialize | bytes | Binary-encoded message ready for TCP transmission |
| deserialize | ClientMetaMessage or ServerMetaMessage | Deserialized message from raw bytes |
| packlength | int | Fixed byte size of the serialized message header |
| code | ServerReturnCode | SUCCESS (200) or FAIL (400) status |
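A fixed header size matters on the receiving side because a TCP recv call may return fewer bytes than requested. A reader would typically loop until it has exactly the header size, parse it, and then read the payload length it announces. The sketch below demonstrates that pattern with a hypothetical two-field header (`HEADER_FMT`) and hypothetical helpers `recv_exact` and `read_message`; it does not use the real message classes.

```python
import socket
import struct

HEADER_FMT = "!iq"  # hypothetical header: status code, payload length
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # plays the role of packlength()


def recv_exact(sock: socket.socket, num_bytes: int) -> bytes:
    """Read exactly num_bytes, since recv() may return fewer than requested."""
    chunks = []
    remaining = num_bytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_message(sock: socket.socket) -> tuple[int, bytes]:
    """Read one fixed-size header, then the payload whose length it announces."""
    code, length = struct.unpack(HEADER_FMT, recv_exact(sock, HEADER_SIZE))
    return code, recv_exact(sock, length)


# Demo over a local socket pair standing in for a client/server TCP connection.
server_side, client_side = socket.socketpair()
payload = b"kv-cache-bytes"
server_side.sendall(struct.pack(HEADER_FMT, 200, len(payload)) + payload)
code, received = read_message(client_side)
server_side.close()
client_side.close()
print(code, received)
```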
Usage Examples
import torch

from lmcache.v1.memory_management import MemoryFormat
from lmcache.v1.protocol import (
    ClientCommand,
    ClientMetaMessage,
    ServerMetaMessage,
    ServerReturnCode,
)

# Create and serialize a client PUT request.
# cache_key is a placeholder for an existing CacheEngineKey or LayerCacheEngineKey.
msg = ClientMetaMessage(
    command=ClientCommand.PUT,
    key=cache_key,
    length=4096,
    fmt=MemoryFormat.KV_BLOB,
    dtype=torch.float16,
    shape=torch.Size([2, 32, 256, 128]),
)
data = msg.serialize()

# Deserialize a server response.
# raw_bytes is a placeholder for the header bytes read from the socket.
response = ServerMetaMessage.deserialize(raw_bytes)
if response.code == ServerReturnCode.SUCCESS:
    print(f"Received {response.length} bytes")