Implementation:LMCache LMCache Offload Server Interface

Knowledge Sources	LMCache
Domains	KV Cache, Offloading
Last Updated	2026-02-09 00:00 GMT

Overview

OffloadServerInterface is an abstract base class defining the contract for KV cache offload server implementations.

Description

This interface declares two abstract methods that every offload server implementation must provide: offload, which triggers an offload of cache data identified by a list of chunk hashes, slot mappings, and offsets, and close, which releases the server's resources. The interface is intentionally minimal so that different transport backends (such as ZMQ or shared memory) can implement the offload logic in their own way. Offloading moves KV cache data from GPU memory to a secondary storage tier managed by the LMCache engine.
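To illustrate what "must provide" means in practice, here is a minimal sketch of how an abc-based contract behaves. It assumes a local stand-in class that mirrors the signature documented on this page rather than importing the real LMCache class, and IncompleteServer and can_instantiate are hypothetical names used only for illustration.

```python
import abc
from typing import List


class OffloadServerInterface(metaclass=abc.ABCMeta):
    """Local stand-in mirroring the signature documented on this page."""

    @abc.abstractmethod
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool:
        ...

    @abc.abstractmethod
    def close(self) -> None:
        ...


class IncompleteServer(OffloadServerInterface):
    """Hypothetical subclass that forgets to implement close()."""

    def offload(self, hashes, slot_mapping, offsets) -> bool:
        return True


def can_instantiate(cls) -> bool:
    """Return True if cls() succeeds, False if the ABC machinery rejects it."""
    try:
        cls()
    except TypeError:
        # Raised when at least one abstract method is still unimplemented.
        return False
    return True
```

Under these assumptions, neither the interface itself nor a subclass that omits one of the two methods can be instantiated, so a missing close surfaces at construction time rather than at shutdown.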

Usage

Use this interface as the base class when implementing a new offload server transport. Consumers should program against this interface to remain decoupled from specific transport implementations.
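The decoupling described above can be sketched as follows. This is an illustration under stated assumptions, not LMCache code: the interface is re-declared locally so the snippet runs without LMCache installed, and RecordingOffloadServer and offload_and_close are hypothetical names. The consumer function depends only on the two interface methods, so any transport could be passed in.

```python
import abc
from typing import List, Tuple


class OffloadServerInterface(metaclass=abc.ABCMeta):
    """Local stand-in mirroring the signature documented on this page."""

    @abc.abstractmethod
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool:
        ...

    @abc.abstractmethod
    def close(self) -> None:
        ...


class RecordingOffloadServer(OffloadServerInterface):
    """Hypothetical in-memory transport that only records requests."""

    def __init__(self) -> None:
        self.requests: List[Tuple[List[int], List[int], List[int]]] = []
        self.closed = False

    def offload(self, hashes, slot_mapping, offsets) -> bool:
        # Copy the inputs so later mutation by the caller is not observed.
        self.requests.append((list(hashes), list(slot_mapping), list(offsets)))
        return True

    def close(self) -> None:
        self.closed = True


def offload_and_close(
    server: OffloadServerInterface,
    hashes: List[int],
    slot_mapping: List[int],
    offsets: List[int],
) -> bool:
    """Consumer code written against the interface, not a concrete transport."""
    try:
        return server.offload(hashes, slot_mapping, offsets)
    finally:
        # Always release resources, even if offload() raises.
        server.close()
```

A stub transport like this can also be handy in unit tests, since it lets consumer logic be exercised without a GPU or a running transport backend.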

Code Reference

Source Location

Repository: LMCache
File: lmcache/v1/offload_server/abstract_server.py
Lines: 1-37

Signature

class OffloadServerInterface(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool: ...

    @abc.abstractmethod
    def close(self) -> None: ...

Import

from lmcache.v1.offload_server.abstract_server import OffloadServerInterface

I/O Contract

Inputs

Name	Type	Required	Description
hashes	List[int]	Yes	The chunk hashes identifying the data to offload
slot_mapping	List[int]	Yes	The slot IDs of the data in GPU memory to offload
offsets	List[int]	Yes	Number of tokens in each block being offloaded

Outputs

Name	Type	Description
offload result	bool	Whether the offload operation was successful

Usage Examples

from typing import List

from lmcache.v1.offload_server.abstract_server import OffloadServerInterface

# Implement a custom offload server
class MyOffloadServer(OffloadServerInterface):
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool:
        # Custom offload logic goes here; return True on success
        return True

    def close(self) -> None:
        # Clean up resources (sockets, buffers, threads) here
        pass

server = MyOffloadServer()
success = server.offload(hashes=[123, 456], slot_mapping=[0, 1], offsets=[256, 256])
server.close()
