Implementation:LMCache LMCache Infinistore Connector

From Leeroopedia


Knowledge Sources
Domains Storage Backend, RDMA Networking
Last Updated 2026-02-09 00:00 GMT

Overview

InfinistoreConnector provides RDMA-based remote KV cache storage using the Infinistore library.

Description

The InfinistoreConnector class extends RemoteConnector to store and retrieve KV cache data over RDMA via the infinistore library. It pre-allocates a configurable number of send and receive buffers (default 16, each 40 MB), registers them as memory regions with the RDMA connection, and manages buffer availability through async queues. Data serialization uses RemoteMetadata headers (28 bytes) prepended to tensor byte payloads. Retrieved tensors are deep-copied into pinned memory allocated by the LocalCPUBackend. All I/O operations (get, put, exists) support async execution through the event loop.
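The buffer handling described above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: the BufferPool class, the pack_into_buffer helper, and the seven-integer header layout are assumptions made for the example (only the 28-byte header size, the default of 16 buffers, and the 40 MB buffer size come from the description). RDMA memory-region registration is omitted entirely.

```python
import asyncio
import struct

# Assumed layout: seven little-endian int32 fields, which gives 28 bytes.
# The real RemoteMetadata field order may differ.
HEADER_FORMAT = "<7i"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

DEFAULT_NUM_BUFFERS = 16              # default count from the description
DEFAULT_BUFFER_SIZE = 40 * 1024 * 1024  # 40 MB per buffer


class BufferPool:
    """Hypothetical pool: pre-allocated buffers handed out via an asyncio.Queue."""

    def __init__(self, count: int, size: int) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        for _ in range(count):
            # A real connector would also register each buffer as an RDMA
            # memory region here; this sketch only allocates plain memory.
            self._queue.put_nowait(bytearray(size))

    async def acquire(self) -> bytearray:
        # Waits until a buffer is free, which bounds concurrent transfers.
        return await self._queue.get()

    def release(self, buf: bytearray) -> None:
        self._queue.put_nowait(buf)


def pack_into_buffer(buf: bytearray, header_fields: tuple, payload: bytes) -> int:
    """Write the header followed by the payload; return the bytes written."""
    total = HEADER_SIZE + len(payload)
    if total > len(buf):
        raise ValueError("payload does not fit in the buffer")
    struct.pack_into(HEADER_FORMAT, buf, 0, *header_fields)
    buf[HEADER_SIZE:total] = payload
    return total


async def demo() -> int:
    # Small pool for the demo; the defaults above would allocate 640 MB.
    pool = BufferPool(count=2, size=1024)
    buf = await pool.acquire()
    try:
        payload = b"\x01\x02\x03\x04"
        # Hypothetical header values: payload length plus six placeholder fields.
        fields = (len(payload), 0, 0, 1, 4, 0, 0)
        return pack_into_buffer(buf, fields, payload)
    finally:
        pool.release(buf)


written = asyncio.run(demo())
```

Handing buffers out through a queue means a caller simply waits when all buffers are in flight, so the pool size acts as a natural cap on concurrent transfers.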

Usage

Use this connector when deploying LMCache with an Infinistore server over RDMA for high-throughput, low-latency remote KV cache access. It requires RDMA-capable hardware and the infinistore Python package. Configure via a URL like infinistore://host:port?device=mlx5_0.
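As an illustration of the URL format, the sketch below splits such a URL into host, port, and device name using only the standard library. The function name parse_infinistore_url is hypothetical, and LMCache's own parsing may differ (for example in how it handles missing or additional query parameters).

```python
from typing import Tuple
from urllib.parse import parse_qs, urlparse


def parse_infinistore_url(url: str) -> Tuple[str, int, str]:
    """Hypothetical helper: extract (host, port, device) from an infinistore:// URL."""
    parsed = urlparse(url)
    if parsed.scheme != "infinistore":
        raise ValueError(f"unexpected scheme: {parsed.scheme!r}")
    if parsed.hostname is None or parsed.port is None:
        raise ValueError("URL must include both host and port")

    # parse_qs maps each query key to a list of values.
    query = parse_qs(parsed.query)
    device = query.get("device", [None])[0]
    if device is None:
        raise ValueError("URL must include a 'device' query parameter")
    return parsed.hostname, parsed.port, device


host, port, device = parse_infinistore_url(
    "infinistore://192.168.1.10:12345?device=mlx5_0"
)
```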

Code Reference

Source Location

Signature

class InfinistoreConnector(RemoteConnector):
    def __init__(
        self,
        host: str,
        port: int,
        dev_name: str,
        link_type: str,
        loop: asyncio.AbstractEventLoop,
        memory_allocator: LocalCPUBackend,
    ) -> None: ...
    async def exists(self, key: CacheEngineKey) -> bool: ...
    def exists_sync(self, key: CacheEngineKey) -> bool: ...
    async def get(self, key: CacheEngineKey) -> Optional[MemoryObj]: ...
    async def put(self, key: CacheEngineKey, memory_obj: MemoryObj) -> None: ...
    async def list(self) -> List[str]: ...
    async def close(self) -> None: ...
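The signature exposes both an async exists and a blocking exists_sync. The source does not show how the blocking variant is implemented; one common pattern for offering a synchronous entry point over a coroutine is to submit it to an event loop running in another thread, as in this sketch. SyncBridge and fake_exists are hypothetical names used only for illustration.

```python
import asyncio
import threading


class SyncBridge:
    """Hypothetical helper: runs coroutines on a background event loop."""

    def __init__(self) -> None:
        self.loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro, timeout: float = 5.0):
        # Schedule the coroutine on the background loop and block for the result.
        future = asyncio.run_coroutine_threadsafe(coro, self.loop)
        return future.result(timeout=timeout)

    def close(self) -> None:
        self.loop.call_soon_threadsafe(self.loop.stop)
        self._thread.join()
        self.loop.close()


async def fake_exists(key: str) -> bool:
    # Stand-in for an async existence check against a remote server.
    await asyncio.sleep(0)
    return key == "present"


bridge = SyncBridge()
found = bridge.run(fake_exists("present"))
missing = bridge.run(fake_exists("absent"))
bridge.close()
```

This pattern must be called from a thread other than the one running the loop, otherwise the blocking wait would deadlock.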

Import

from lmcache.v1.storage_backend.connector.infinistore_connector import InfinistoreConnector

I/O Contract

Inputs

Name | Type | Required | Description
host | str | Yes | Infinistore server hostname or IP address
port | int | Yes | Infinistore server port
dev_name | str | Yes | RDMA device name (e.g., "mlx5_0")
link_type | str | Yes | RDMA link type
loop | asyncio.AbstractEventLoop | Yes | Event loop for async operations
memory_allocator | LocalCPUBackend | Yes | Allocator for pinned CPU memory used during get operations
key | CacheEngineKey | Yes (for get/put/exists) | Cache key identifying the KV chunk
memory_obj | MemoryObj | Yes (for put) | Memory object containing the tensor data to store

Outputs

Name | Type | Description
exists return | bool | Whether the key exists in the Infinistore server
get return | Optional[MemoryObj] | The retrieved memory object with tensor data copied to pinned memory, or None on failure
put return | None | Data is written to the Infinistore server asynchronously via RDMA

Usage Examples

import asyncio

from lmcache.v1.storage_backend.connector.infinistore_connector import InfinistoreConnector


async def main(local_cpu_backend, cache_key, memory_obj):
    # local_cpu_backend (LocalCPUBackend), cache_key (CacheEngineKey), and
    # memory_obj (MemoryObj) are assumed to be created elsewhere.

    # Create connector with RDMA parameters
    connector = InfinistoreConnector(
        host="192.168.1.10",
        port=12345,
        dev_name="mlx5_0",
        link_type="Ethernet",
        loop=asyncio.get_running_loop(),
        memory_allocator=local_cpu_backend,
    )

    # Store a KV cache chunk
    await connector.put(cache_key, memory_obj)

    # Check existence
    exists = await connector.exists(cache_key)

    # Retrieve the chunk (returns None on failure)
    result = await connector.get(cache_key)

    # Close the connection
    await connector.close()
    return exists, result
