Implementation: LMCache LMCacheConnectorV1Impl Init
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Serving |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete implementation class that initializes the LMCache connector within vLLM's KV transfer framework, provided by the LMCache integration module.
Description
The LMCacheConnectorV1Impl class is the core implementation behind LMCacheConnectorV1Dynamic (the vLLM-facing entry point). Its __init__ method orchestrates the full LMCache initialization: loading config, creating the LMCacheManager (which builds the cache engine and storage backends), starting services, and initializing connector-specific state (blender for CacheBlend, layer tracking, chunk/block settings).
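The orchestration steps above can be sketched as follows. This is a minimal, hypothetical illustration of the initialization sequence, not the actual LMCache code: all names other than the step comments (the stub manager class, `chunk_size`, the config dict) are illustrative stand-ins.

```python
class _StubManager:
    """Illustrative stand-in for LMCacheManager: in the real code it
    builds the cache engine and storage backends."""

    def __init__(self, config: dict):
        self.config = config
        self.engine_ready = False

    def start(self) -> None:
        # The real manager would start storage backends and services here.
        self.engine_ready = True


class ConnectorImplSketch:
    """Hypothetical sketch of the __init__ orchestration described above."""

    def __init__(self, config: dict, role: str):
        # 1. Load configuration (here: passed in directly for simplicity).
        self.config = config
        # 2. Create the manager, which owns the cache engine and backends.
        self.manager = _StubManager(self.config)
        # 3. Start services.
        self.manager.start()
        # 4. Initialize connector-specific state: role, optional blender
        #    (CacheBlend), and chunk/block settings.
        self.role = role
        self.blender = None  # only populated when CacheBlend is enabled
        self.chunk_size = self.config.get("chunk_size", 256)


conn = ConnectorImplSketch({"chunk_size": 512}, role="WORKER")
```

The real `__init__` performs these same steps against vLLM's `VllmConfig` and the LMCache engine; the sketch only preserves the ordering.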
Usage
This class is instantiated automatically when vLLM creates the connector specified in KVTransferConfig. Users do not call this directly; instead, they launch vLLM with --kv-transfer-config specifying kv_connector="LMCacheConnectorV1".
Code Reference
Source Location
- Repository: LMCache
- File: lmcache/integration/vllm/vllm_v1_adapter.py
- Lines: L433-L562
Signature
```python
class LMCacheConnectorV1Impl:
    def __init__(
        self,
        vllm_config: VllmConfig,
        role: KVConnectorRole,
        parent: KVConnectorBase_V1,
    ):
        """
        Args:
            vllm_config: Full vLLM configuration including model, parallel, cache configs
            role: SCHEDULER or WORKER - determines which operations are available
            parent: The KVConnectorBase_V1 wrapper (LMCacheConnectorV1Dynamic)
        """
```
Import
```python
from lmcache.integration.vllm.vllm_v1_adapter import LMCacheConnectorV1Impl
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| vllm_config | VllmConfig | Yes | Full vLLM configuration object |
| role | KVConnectorRole | Yes | SCHEDULER or WORKER role |
| parent | KVConnectorBase_V1 | Yes | The Dynamic wrapper for delegation |
Outputs
| Name | Type | Description |
|---|---|---|
| self | LMCacheConnectorV1Impl | Initialized connector with LMCacheManager, cache engine, storage backends, and optional blender |
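The `parent` input and the delegation noted in the tables follow a thin-wrapper pattern: the vLLM-facing wrapper constructs the impl and passes itself in, and its public methods forward to the impl. The sketch below illustrates that pattern only; the class bodies and the placeholder method logic are assumptions, not the LMCache implementation.

```python
class ImplSketch:
    """Stand-in for LMCacheConnectorV1Impl: holds a back-reference to
    the wrapper it was constructed by."""

    def __init__(self, parent):
        self.parent = parent  # the vLLM-facing wrapper

    def lookup(self, request) -> int:
        return 0  # placeholder logic for illustration


class DynamicWrapperSketch:
    """Stand-in for LMCacheConnectorV1Dynamic: a thin vLLM-facing shell."""

    def __init__(self):
        # The wrapper builds the impl and hands itself in as `parent`,
        # mirroring the `parent` argument in the signature above.
        self._impl = ImplSketch(parent=self)

    def lookup(self, request) -> int:
        # Public API calls delegate to the impl.
        return self._impl.lookup(request)


wrapper = DynamicWrapperSketch()
```

This is why users never instantiate `LMCacheConnectorV1Impl` directly: vLLM constructs the wrapper, and the wrapper constructs the impl.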
Usage Examples
vLLM CLI Launch
```shell
# Launch vLLM with LMCache connector for local KV cache offloading
export LMCACHE_LOCAL_CPU=True
export LMCACHE_MAX_LOCAL_CPU_SIZE=5
vllm serve meta-llama/Llama-3.1-8B-Instruct \
  --kv-transfer-config '{"kv_connector": "LMCacheConnectorV1", "kv_role": "kv_both"}'
```
Programmatic Usage
```python
from vllm import LLM
from vllm.config import KVTransferConfig

ktc = KVTransferConfig(
    kv_connector="LMCacheConnectorV1",
    kv_role="kv_both",
)

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",
    kv_transfer_config=ktc,
    gpu_memory_utilization=0.8,
)
```