Implementation:LMCache LMCache V1 Protocol

Knowledge Sources	LMCache
Domains	Network Protocol, Serialization
Last Updated	2026-02-09 00:00 GMT

Overview

This module defines the wire protocol for TCP-based communication between LMCache clients and the standalone remote server, including command types, dtype mappings, and serializable message structures.

Description

The protocol is built around two message types: ClientMetaMessage (requests from workers) and ServerMetaMessage (responses from the server). Both are serialized with struct.pack and struct.unpack using fixed-size formats, so every header has a known byte length. ClientCommand enumerates the supported operations (PUT, GET, EXIST, LIST, HEALTH), and ServerReturnCode reports success (200) or failure (400).

The module also defines RemoteMetadata, which serializes KV cache metadata for multiple layer groups; its format string depends on the number of layer groups. Bidirectional mappings between PyTorch dtypes and integers (DTYPE_TO_INT/INT_TO_DTYPE) and between storage locations and integers (LOCATION_TO_INT/INT_TO_LOCATION) allow compact binary encoding. Key strings are padded to a fixed length of MAX_KEY_LENGTH (150 bytes).
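The fixed-size encoding described above can be illustrated with a minimal sketch. It is not the format used in lmcache/v1/protocol.py: the layout in HEADER_FMT, the Command enum, the NUL padding byte, and the helper names pack_header and unpack_header are all assumptions made for illustration. Only MAX_KEY_LENGTH (150 bytes) comes from the description.

```python
import struct
from enum import IntEnum, auto

MAX_KEY_LENGTH = 150  # value stated in the description

# Hypothetical layout (an assumption, not the real LMCache format):
# int32 command, int64 payload length, fixed-width key field.
HEADER_FMT = f"<iq{MAX_KEY_LENGTH}s"


class Command(IntEnum):
    """Stand-in for ClientCommand, reduced to two members."""
    PUT = auto()
    GET = auto()


def pack_header(command: Command, length: int, key: str) -> bytes:
    raw = key.encode("utf-8")
    if len(raw) > MAX_KEY_LENGTH:
        raise ValueError("key exceeds MAX_KEY_LENGTH")
    # The "150s" field pads shorter keys with NUL bytes automatically.
    return struct.pack(HEADER_FMT, int(command), length, raw)


def unpack_header(buf: bytes):
    command, length, raw = struct.unpack(HEADER_FMT, buf)
    # Strip the padding before decoding the key.
    return Command(command), length, raw.rstrip(b"\x00").decode("utf-8")
```

Because the format string has a constant size, a receiver can call struct.calcsize(HEADER_FMT) to know how many bytes to read before any payload arrives, which is the role packlength() plays for the real message classes.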

Usage

Use this module when implementing or extending the TCP-based remote storage backend for LMCache. Both the standalone server and the remote client use these message types for all communication.

Code Reference

Source Location

Repository: LMCache
File: lmcache/v1/protocol.py
Lines: 1-257

Signature

class ClientCommand(IntEnum):
    PUT = auto()
    GET = auto()
    EXIST = auto()
    LIST = auto()
    HEALTH = auto()

class ServerReturnCode(IntEnum):
    SUCCESS = 200
    FAIL = 400

@dataclass
class RemoteMetadata:
    length: int
    shapes: list[torch.Size]
    dtypes: list[torch.dtype]
    fmt: MemoryFormat

    def serialize(self) -> bytes: ...
    def serialize_into(self, buffer): ...
    @staticmethod
    def deserialize(s: bytes) -> "RemoteMetadata": ...

@dataclass
class ClientMetaMessage:
    command: ClientCommand
    key: Union[CacheEngineKey, LayerCacheEngineKey]
    length: int
    fmt: MemoryFormat
    dtype: Optional[torch.dtype]
    shape: torch.Size
    location: Optional[str] = None

    def serialize(self) -> bytes: ...
    @staticmethod
    def deserialize(s: bytes) -> "ClientMetaMessage": ...
    @staticmethod
    def packlength() -> int: ...

@dataclass
class ServerMetaMessage:
    code: ServerReturnCode
    length: int
    fmt: MemoryFormat
    dtype: Optional[torch.dtype]
    shape: torch.Size
    location: Optional[str] = None

    def serialize(self) -> bytes: ...
    @staticmethod
    def deserialize(s: bytes) -> "ServerMetaMessage": ...
    @staticmethod
    def packlength() -> int: ...

def init_remote_metadata_info(num_groups: int): ...
def get_remote_metadata_bytes() -> int: ...
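
A format string that grows with the number of layer groups, as init_remote_metadata_info and get_remote_metadata_bytes suggest, could look roughly like the following sketch. The field order, the integer widths, the assumption that every shape has four dimensions, and the helper names build_metadata_format, pack_metadata, and unpack_metadata are hypothetical; the real RemoteMetadata layout may differ.

```python
import struct

SHAPE_DIMS = 4  # assumption: every group shape is 4-dimensional


def build_metadata_format(num_groups: int) -> str:
    """Return a struct format covering `num_groups` layer groups."""
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    # Hypothetical layout: int64 payload length, int32 memory-format id,
    # then per group one int32 dtype id followed by the shape dimensions.
    per_group = "i" + "i" * SHAPE_DIMS
    return "<qi" + per_group * num_groups


def pack_metadata(length: int, fmt_id: int, groups) -> bytes:
    """`groups` is a list of (dtype_id, shape) pairs, one per layer group."""
    flat = []
    for dtype_id, shape in groups:
        if len(shape) != SHAPE_DIMS:
            raise ValueError("each shape must have SHAPE_DIMS entries")
        flat.append(dtype_id)
        flat.extend(shape)
    return struct.pack(build_metadata_format(len(groups)), length, fmt_id, *flat)


def unpack_metadata(buf: bytes, num_groups: int):
    values = struct.unpack(build_metadata_format(num_groups), buf)
    length, fmt_id, rest = values[0], values[1], values[2:]
    step = 1 + SHAPE_DIMS
    groups = [
        (rest[i], tuple(rest[i + 1 : i + step]))
        for i in range(0, len(rest), step)
    ]
    return length, fmt_id, groups
```

In this sketch the byte size is struct.calcsize(build_metadata_format(num_groups)), so both peers must agree on the group count before exchanging metadata.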

Import

from lmcache.v1.protocol import (
    ClientCommand,
    ServerReturnCode,
    ClientMetaMessage,
    ServerMetaMessage,
    RemoteMetadata,
    init_remote_metadata_info,
    get_remote_metadata_bytes,
    DTYPE_TO_INT,
    INT_TO_DTYPE,
)

I/O Contract

Inputs

Name	Type	Required	Description
command	ClientCommand	Yes	The operation to perform (PUT, GET, EXIST, LIST, HEALTH)
key	Union[CacheEngineKey, LayerCacheEngineKey]	Yes	Cache key identifying the target entry
length	int	Yes	Length of the data payload in bytes
fmt	MemoryFormat	Yes	Memory format of the cached data
dtype	Optional[torch.dtype]	Yes	Data type of the tensor (mapped to int for serialization)
shape	torch.Size	Yes	4-dimensional shape of the cached tensor
location	Optional[str]	No	Storage backend location identifier
num_groups	int	Yes	Number of KV layer groups (for init_remote_metadata_info)
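
The dtype row above mentions mapping to an integer for serialization. A bidirectional table of that kind can be sketched as below. To keep the example free of a torch dependency it uses dtype names as stand-ins, and the specific names and integer codes are assumptions rather than the contents of DTYPE_TO_INT.

```python
# Stand-in for DTYPE_TO_INT: keys would be torch.dtype objects in the real
# module. The codes here are illustrative, not LMCache's actual values.
NAME_TO_INT = {
    None: 0,  # assumption: a reserved code for "no dtype"
    "float16": 1,
    "bfloat16": 2,
    "float32": 3,
}

# Build the reverse table by inverting the forward one.
INT_TO_NAME = {code: name for name, code in NAME_TO_INT.items()}

# If two names shared a code, the inverted table would be smaller.
if len(INT_TO_NAME) != len(NAME_TO_INT):
    raise RuntimeError("dtype codes must be unique")


def encode_dtype(name):
    return NAME_TO_INT[name]


def decode_dtype(code):
    return INT_TO_NAME[code]
```

Deriving the reverse table from the forward one keeps the two in sync when a new dtype is added.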

Outputs

Name	Type	Description
serialize	bytes	Binary-encoded message ready for TCP transmission
deserialize	ClientMetaMessage or ServerMetaMessage	Deserialized message from raw bytes
packlength	int	Fixed byte size of the serialized message header
code	ServerReturnCode	SUCCESS (200) or FAIL (400) status
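
Since packlength gives a fixed header size, a TCP reader can first collect exactly that many bytes and only then deserialize. TCP may deliver fewer bytes per recv call than requested, so a loop is needed. The helper below is a generic sketch with a hypothetical name, recv_exact; it is not taken from LMCache.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly `n` bytes from `sock`, looping over short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            # An empty read means the peer closed the connection.
            raise ConnectionError("peer closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)
```

A client could then read a response header with recv_exact(sock, ServerMetaMessage.packlength()), pass the result to ServerMetaMessage.deserialize, and read the payload with a second recv_exact call sized by the header's length field.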

Usage Examples

import torch

from lmcache.v1.memory_management import MemoryFormat
from lmcache.v1.protocol import (
    ClientCommand,
    ClientMetaMessage,
    ServerMetaMessage,
    ServerReturnCode,
)

# Create and serialize a client PUT request.
# cache_key is a placeholder for an existing CacheEngineKey
# (or LayerCacheEngineKey) built elsewhere.
msg = ClientMetaMessage(
    command=ClientCommand.PUT,
    key=cache_key,
    length=4096,
    fmt=MemoryFormat.KV_BLOB,
    dtype=torch.float16,
    shape=torch.Size([2, 32, 256, 128]),
)
data = msg.serialize()

# Deserialize a server response.
# raw_bytes is a placeholder for the header bytes read from the socket.
response = ServerMetaMessage.deserialize(raw_bytes)
if response.code == ServerReturnCode.SUCCESS:
    print(f"Received {response.length} bytes")
