Implementation:LMCache LMCache Offload Server Interface
| Knowledge Sources | |
|---|---|
| Domains | KV Cache, Offloading |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
OffloadServerInterface is an abstract base class defining the contract for KV cache offload server implementations.
Description
This interface declares two abstract methods that every offload server implementation must provide: offload, which triggers the offload of cache data identified by a set of chunk hashes, slot mappings, and offsets, and close, which releases the server's resources. The interface is intentionally minimal so that different transport backends (such as ZMQ or shared memory) can each implement the offload logic in their own way. Offloading moves KV cache data from GPU memory to a secondary storage tier managed by the LMCache engine.
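The sketch below illustrates what the abstract contract means in practice: a subclass that omits either method cannot be instantiated. To stay self-contained it defines a local stand-in class that mirrors the documented signature rather than importing LMCache; IncompleteServer and can_instantiate are hypothetical names used only for illustration.

```python
import abc
from typing import List


class OffloadServerInterface(metaclass=abc.ABCMeta):
    """Local stand-in mirroring the documented signature (assumption: not the real import)."""

    @abc.abstractmethod
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool:
        ...

    @abc.abstractmethod
    def close(self) -> None:
        ...


class IncompleteServer(OffloadServerInterface):
    """Hypothetical subclass that implements offload() but forgets close()."""

    def offload(self, hashes, slot_mapping, offsets) -> bool:
        return True


def can_instantiate(cls) -> bool:
    """Return True if cls() succeeds, False if abc rejects it with TypeError."""
    try:
        cls()
        return True
    except TypeError:
        return False
```

Because the check happens at instantiation time, a missing method surfaces as soon as the server object is created, not when the method is first called.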
Usage
Use this interface as the base class when implementing a new offload server transport. Consumers should program against this interface to remain decoupled from specific transport implementations.
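As a rough illustration of programming against the interface, the sketch below shows a caller that depends only on the two documented methods, so any transport can be swapped in. It uses a local stand-in for the interface to stay self-contained; RecordingOffloadServer and offload_and_close are hypothetical names, not part of LMCache.

```python
import abc
from typing import List, Tuple


class OffloadServerInterface(metaclass=abc.ABCMeta):
    """Local stand-in mirroring the documented signature (assumption: not the real import)."""

    @abc.abstractmethod
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool:
        ...

    @abc.abstractmethod
    def close(self) -> None:
        ...


class RecordingOffloadServer(OffloadServerInterface):
    """Hypothetical in-memory implementation that only records requests."""

    def __init__(self) -> None:
        self.requests: List[Tuple[List[int], List[int], List[int]]] = []
        self.closed = False

    def offload(self, hashes, slot_mapping, offsets) -> bool:
        self.requests.append((list(hashes), list(slot_mapping), list(offsets)))
        return True

    def close(self) -> None:
        self.closed = True


def offload_and_close(
    server: OffloadServerInterface,
    hashes: List[int],
    slot_mapping: List[int],
    offsets: List[int],
) -> bool:
    """Run one offload through any implementation and always release it afterwards."""
    try:
        return server.offload(hashes, slot_mapping, offsets)
    finally:
        server.close()
```

The caller never names a concrete transport, so a recording implementation like this one can stand in for a real backend in unit tests.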
Code Reference
Source Location
- Repository: LMCache
- File: lmcache/v1/offload_server/abstract_server.py
- Lines: 1-37
Signature
class OffloadServerInterface(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool: ...

    @abc.abstractmethod
    def close(self) -> None: ...
Import
from lmcache.v1.offload_server.abstract_server import OffloadServerInterface
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| hashes | List[int] | Yes | The chunk hashes identifying the data to offload |
| slot_mapping | List[int] | Yes | The slot IDs of the data in GPU memory to offload |
| offsets | List[int] | Yes | Number of tokens in each block being offloaded |
Outputs
| Name | Type | Description |
|---|---|---|
| offload result | bool | Whether the offload operation was successful |
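Since offload reports its outcome as a boolean, callers need to check the return value explicitly. The sketch below shows one possible caller-side policy, a bounded retry. The retry policy, FlakyOffloadServer, and offload_with_retry are assumptions made for illustration and are not described by the LMCache source; the interface is again a local stand-in.

```python
import abc
from typing import List


class OffloadServerInterface(metaclass=abc.ABCMeta):
    """Local stand-in mirroring the documented signature (assumption: not the real import)."""

    @abc.abstractmethod
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool:
        ...

    @abc.abstractmethod
    def close(self) -> None:
        ...


class FlakyOffloadServer(OffloadServerInterface):
    """Hypothetical server that reports failure a fixed number of times, then succeeds."""

    def __init__(self, failures_before_success: int) -> None:
        self.remaining_failures = failures_before_success
        self.calls = 0

    def offload(self, hashes, slot_mapping, offsets) -> bool:
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            return False
        return True

    def close(self) -> None:
        pass


def offload_with_retry(
    server: OffloadServerInterface,
    hashes: List[int],
    slot_mapping: List[int],
    offsets: List[int],
    max_attempts: int = 3,
) -> bool:
    """Call offload() until it returns True or max_attempts is exhausted."""
    for _ in range(max_attempts):
        if server.offload(hashes, slot_mapping, offsets):
            return True
    return False
```

Whether retrying is appropriate depends on the transport; a caller might equally log the failure and leave the data on the GPU.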
Usage Examples
from lmcache.v1.offload_server.abstract_server import OffloadServerInterface


# Implement a custom offload server
class MyOffloadServer(OffloadServerInterface):
    def offload(self, hashes, slot_mapping, offsets) -> bool:
        # Custom offload logic
        return True

    def close(self) -> None:
        # Clean up resources
        pass


server = MyOffloadServer()
success = server.offload(hashes=[123, 456], slot_mapping=[0, 1], offsets=[256, 256])
server.close()