Implementation:LMCache LMCache Offload Server Interface

From Leeroopedia


Knowledge Sources
Domains KV Cache, Offloading
Last Updated 2026-02-09 00:00 GMT

Overview

OffloadServerInterface is an abstract base class defining the contract for KV cache offload server implementations.

Description

This interface declares two abstract methods that every offload server implementation must provide: offload, which triggers the offload of cache data identified by a set of hashes, slot mappings, and offsets, and close, which releases resources. The interface is intentionally minimal so that different transport backends (such as ZMQ or shared memory) can implement the offload logic in their own way. Offloading moves KV cache data from GPU memory to a secondary storage tier managed by the LMCache engine.
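Because both methods are abstract, Python refuses to instantiate a subclass that omits either one. The sketch below illustrates that behavior with a local stand-in class that only mirrors the signature shown on this page (it is not imported from LMCache); IncompleteServer and try_instantiate are hypothetical names used for illustration.

```python
import abc
from typing import List


class OffloadServerInterface(metaclass=abc.ABCMeta):
    """Local stand-in that mirrors the signature shown on this page."""

    @abc.abstractmethod
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool: ...

    @abc.abstractmethod
    def close(self) -> None: ...


class IncompleteServer(OffloadServerInterface):
    """Hypothetical subclass that forgets to implement close()."""

    def offload(self, hashes, slot_mapping, offsets) -> bool:
        return True


def try_instantiate(cls) -> str:
    """Return 'ok' if cls() succeeds, or a short note if ABC rejects it."""
    try:
        cls()
    except TypeError as exc:
        return f"rejected: {type(exc).__name__}"
    return "ok"


# close() is still abstract, so instantiation fails with TypeError.
result = try_instantiate(IncompleteServer)
```

The failure happens at instantiation time rather than at the first call to the missing method, so an incomplete transport is caught as soon as it is constructed.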

Usage

Use this interface as the base class when implementing a new offload server transport. Consumers should program against this interface to remain decoupled from specific transport implementations.
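As a minimal sketch of what programming against the interface can look like, the helper below accepts any object that satisfies the contract and never names a concrete transport. The interface is redefined locally as a stand-in so the example runs without LMCache installed; RecordingOffloadServer and offload_then_close are hypothetical names, not part of LMCache.

```python
import abc
from typing import List, Tuple


class OffloadServerInterface(metaclass=abc.ABCMeta):
    """Local stand-in that mirrors the signature shown on this page."""

    @abc.abstractmethod
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool: ...

    @abc.abstractmethod
    def close(self) -> None: ...


class RecordingOffloadServer(OffloadServerInterface):
    """Hypothetical in-memory implementation that only records its calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[List[int], List[int], List[int]]] = []
        self.closed = False

    def offload(self, hashes, slot_mapping, offsets) -> bool:
        self.calls.append((list(hashes), list(slot_mapping), list(offsets)))
        return True

    def close(self) -> None:
        self.closed = True


def offload_then_close(
    server: OffloadServerInterface,
    hashes: List[int],
    slot_mapping: List[int],
    offsets: List[int],
) -> bool:
    """Run one offload through any implementation, then always close it."""
    try:
        return server.offload(hashes, slot_mapping, offsets)
    finally:
        # close() runs even if offload() raises.
        server.close()


server = RecordingOffloadServer()
ok = offload_then_close(server, [123, 456], [0, 1], [256, 256])
```

Swapping in a different transport only requires passing a different OffloadServerInterface subclass; the helper itself stays unchanged.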

Code Reference

Source Location

Signature

import abc
from typing import List


class OffloadServerInterface(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool: ...

    @abc.abstractmethod
    def close(self) -> None: ...

Import

from lmcache.v1.offload_server.abstract_server import OffloadServerInterface

I/O Contract

Inputs

Name          Type       Required  Description
hashes        List[int]  Yes       The chunk hashes identifying the data to offload
slot_mapping  List[int]  Yes       The slot IDs of the data in GPU memory to offload
offsets       List[int]  Yes       Number of tokens in each block being offloaded

Outputs

Name            Type  Description
offload result  bool  Whether the offload operation was successful

Usage Examples

from typing import List

from lmcache.v1.offload_server.abstract_server import OffloadServerInterface


# Implement a custom offload server
class MyOffloadServer(OffloadServerInterface):
    def offload(
        self,
        hashes: List[int],
        slot_mapping: List[int],
        offsets: List[int],
    ) -> bool:
        # Custom offload logic goes here; return True on success
        return True

    def close(self) -> None:
        # Clean up resources (nothing to release in this stub)
        pass


server = MyOffloadServer()
success = server.offload(hashes=[123, 456], slot_mapping=[0, 1], offsets=[256, 256])
server.close()
