Implementation: LMCache LMCacheConnectorV1Impl Init
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Serving |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete implementation class that initializes the LMCache connector within vLLM's KV transfer framework, provided by the LMCache integration module.
Description
The LMCacheConnectorV1Impl class is the core implementation behind LMCacheConnectorV1Dynamic (the vLLM-facing entry point). Its __init__ method orchestrates the full LMCache initialization: loading config, creating the LMCacheManager (which builds the cache engine and storage backends), starting services, and initializing connector-specific state (blender for CacheBlend, layer tracking, chunk/block settings).
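The orchestration steps above can be sketched as follows. This is a minimal, hypothetical illustration of the initialization sequence, not the actual LMCache code: all names other than the step comments (the stub manager class, `chunk_size`, the config dict) are illustrative stand-ins.

```python
class _StubManager:
    """Illustrative stand-in for LMCacheManager: in the real code it
    builds the cache engine and storage backends."""

    def __init__(self, config: dict):
        self.config = config
        self.engine_ready = False

    def start(self) -> None:
        # The real manager would start storage backends and services here.
        self.engine_ready = True


class ConnectorImplSketch:
    """Hypothetical sketch of the __init__ orchestration described above."""

    def __init__(self, config: dict, role: str):
        # 1. Load configuration (here: passed in directly for simplicity).
        self.config = config
        # 2. Create the manager, which owns the cache engine and backends.
        self.manager = _StubManager(self.config)
        # 3. Start services.
        self.manager.start()
        # 4. Initialize connector-specific state: role, optional blender
        #    (CacheBlend), and chunk/block settings.
        self.role = role
        self.blender = None  # only populated when CacheBlend is enabled
        self.chunk_size = self.config.get("chunk_size", 256)


conn = ConnectorImplSketch({"chunk_size": 512}, role="WORKER")
```

The real `__init__` performs these same steps against vLLM's `VllmConfig` and the LMCache engine; the sketch only preserves the ordering.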
Usage
This class is instantiated automatically when vLLM creates the connector specified in KVTransferConfig. Users do not call this directly; instead, they launch vLLM with --kv-transfer-config specifying kv_connector="LMCacheConnectorV1".
Code Reference
Source Location
- Repository: LMCache
- File: lmcache/integration/vllm/vllm_v1_adapter.py
- Lines: L433-L562
Signature
```python
class LMCacheConnectorV1Impl:
    def __init__(
        self,
        vllm_config: VllmConfig,
        role: KVConnectorRole,
        parent: KVConnectorBase_V1,
    ):
        """
        Args:
            vllm_config: Full vLLM configuration including model, parallel, cache configs
            role: SCHEDULER or WORKER - determines which operations are available
            parent: The KVConnectorBase_V1 wrapper (LMCacheConnectorV1Dynamic)
        """
```
Import
```python
from lmcache.integration.vllm.vllm_v1_adapter import LMCacheConnectorV1Impl
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| vllm_config | VllmConfig | Yes | Full vLLM configuration object |
| role | KVConnectorRole | Yes | SCHEDULER or WORKER role |
| parent | KVConnectorBase_V1 | Yes | The Dynamic wrapper for delegation |
Outputs
| Name | Type | Description |
|---|---|---|
| self | LMCacheConnectorV1Impl | Initialized connector with LMCacheManager, cache engine, storage backends, and optional blender |
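The `parent` input and the delegation noted in the tables follow a thin-wrapper pattern: the vLLM-facing wrapper constructs the impl and passes itself in, and its public methods forward to the impl. The sketch below illustrates that pattern only; the class bodies and the placeholder method logic are assumptions, not the LMCache implementation.

```python
class ImplSketch:
    """Stand-in for LMCacheConnectorV1Impl: holds a back-reference to
    the wrapper it was constructed by."""

    def __init__(self, parent):
        self.parent = parent  # the vLLM-facing wrapper

    def lookup(self, request) -> int:
        return 0  # placeholder logic for illustration


class DynamicWrapperSketch:
    """Stand-in for LMCacheConnectorV1Dynamic: a thin vLLM-facing shell."""

    def __init__(self):
        # The wrapper builds the impl and hands itself in as `parent`,
        # mirroring the `parent` argument in the signature above.
        self._impl = ImplSketch(parent=self)

    def lookup(self, request) -> int:
        # Public API calls delegate to the impl.
        return self._impl.lookup(request)


wrapper = DynamicWrapperSketch()
```

This is why users never instantiate `LMCacheConnectorV1Impl` directly: vLLM constructs the wrapper, and the wrapper constructs the impl.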
Usage Examples
vLLM CLI Launch
```shell
# Launch vLLM with LMCache connector for local KV cache offloading
export LMCACHE_LOCAL_CPU=True
export LMCACHE_MAX_LOCAL_CPU_SIZE=5
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --kv-transfer-config '{"kv_connector": "LMCacheConnectorV1", "kv_role": "kv_both"}'
```
Programmatic Usage
```python
from vllm import LLM
from vllm.config import KVTransferConfig

ktc = KVTransferConfig(
    kv_connector="LMCacheConnectorV1",
    kv_role="kv_both",
)

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",
    kv_transfer_config=ktc,
    gpu_memory_utilization=0.8,
)
```