Environment:LMCache LMCache Python Runtime

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Python
Last Updated: 2026-02-09 00:00 GMT

Overview

Python 3.10-3.13 runtime environment with core dependencies including PyTorch, ZeroMQ, Redis, and FastAPI.

Description

This environment defines the Python runtime and core package dependencies required by all LMCache workflows. LMCache requires Python >= 3.10 and < 3.14, with specific constraints: NIXL is only available for Python < 3.13, and numpy is capped at <= 2.2.6 due to numba/NIXL compatibility. The runtime depends on PyTorch (version flexible at runtime, pinned to 2.8.0 at build time), PyZMQ >= 25.0.0 for inter-process communication, Redis for remote cache backends, FastAPI/uvicorn for proxy and API servers, and transformers >= 4.51.1 for model configuration introspection.

Usage

Use this environment for all LMCache deployments. Every workflow (KV Cache Offloading, Disaggregated Prefill, P2P KV Cache Sharing, CacheBlend KV Reuse) requires this base Python runtime. The Python version also determines NIXL availability: the `nixl` package is only installed for Python < 3.13, so NIXL-dependent workflows need Python 3.10, 3.11, or 3.12.
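
The version bounds above can be checked before installing anything. The following is a minimal sketch, not part of LMCache: the function name `check_python` and the constant names are hypothetical, and the bounds are copied from this page (`>=3.10,<3.14` for the package, `< 3.13` for NIXL).

```python
import sys

# Bounds copied from this page; the names are illustrative, not LMCache APIs.
MIN_PYTHON = (3, 10)                  # requires-python lower bound (inclusive)
MAX_PYTHON_EXCLUSIVE = (3, 14)        # requires-python upper bound (exclusive)
NIXL_MAX_PYTHON_EXCLUSIVE = (3, 13)   # nixl marker: python_version < "3.13"


def check_python(version=None):
    """Return (supported, nixl_eligible) for a (major, minor) version tuple.

    When no version is given, the running interpreter is inspected.
    """
    if version is None:
        version = sys.version_info[:2]
    version = tuple(version[:2])
    supported = MIN_PYTHON <= version < MAX_PYTHON_EXCLUSIVE
    nixl_eligible = supported and version < NIXL_MAX_PYTHON_EXCLUSIVE
    return supported, nixl_eligible


if __name__ == "__main__":
    supported, nixl_eligible = check_python()
    print(f"Python supported: {supported}; NIXL installable: {nixl_eligible}")
```

Passing an explicit tuple, for example `check_python((3, 13))`, makes it easy to reason about a target image without running that interpreter.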

System Requirements

| Category | Requirement | Notes |
|----------|-------------|-------|
| Python | >= 3.10, < 3.14 | CPython only (classifiers list 3.10-3.13) |
| OS | Linux (POSIX) | Primary supported platform |
| Disk | Minimal | Package installation only |

Dependencies

System Packages

  • Python 3.10, 3.11, 3.12, or 3.13 (CPython)

Python Packages

  • `torch` (flexible version at runtime; build requires `torch==2.8.0`)
  • `transformers` >= 4.51.1
  • `pyzmq` >= 25.0.0
  • `numpy` <= 2.2.6
  • `redis`
  • `fastapi`
  • `uvicorn`
  • `httptools`
  • `aiohttp`
  • `httpx`
  • `pyyaml`
  • `safetensors`
  • `prometheus_client` >= 0.18.0
  • `psutil`
  • `py-cpuinfo`
  • `msgspec`
  • `sortedcontainers`
  • `setuptools` >= 77.0.3, < 81.0.0
  • `setuptools_scm` >= 8
  • `nixl` (only for `python_version < "3.13"`)
  • `cupy-cuda12x`
  • `cufile-python`
  • `awscrt`
  • `aiofile`
  • `aiofiles`
  • `nvtx` (optional, with fallback)
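
To see whether an existing environment meets the versioned entries in this list, the installed distributions can be inspected with the standard library. This is a rough sketch under stated assumptions: the helper names (`parse_version`, `satisfies`, `report`) are hypothetical, only `>=` and `<=` are handled, and the version parser keeps just the leading numeric components, so pre-release and local suffixes are ignored. A full check would use a proper version-specifier library instead.

```python
import re
from importlib import metadata

# Subset of the constraints listed above: (distribution, operator, bound).
CONSTRAINTS = [
    ("transformers", ">=", "4.51.1"),
    ("pyzmq", ">=", "25.0.0"),
    ("numpy", "<=", "2.2.6"),
    ("prometheus_client", ">=", "0.18.0"),
]


def parse_version(text):
    """Keep only the leading numeric components, e.g. '2.2.6rc1' -> (2, 2, 6)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()


def satisfies(installed, op, bound):
    """Compare two versions numerically; only '>=' and '<=' are supported."""
    a, b = parse_version(installed), parse_version(bound)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that '25.1' compares like '25.1.0'
    b += (0,) * (width - len(b))
    if op == ">=":
        return a >= b
    if op == "<=":
        return a <= b
    raise ValueError(f"unsupported operator: {op}")


def report(constraints=CONSTRAINTS):
    """Return (name, installed_version_or_None, ok) for each constraint."""
    rows = []
    for name, op, bound in constraints:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            rows.append((name, None, False))
            continue
        rows.append((name, installed, satisfies(installed, op, bound)))
    return rows


if __name__ == "__main__":
    for name, installed, ok in report():
        print(f"{name:20s} {installed or 'missing':12s} {'ok' if ok else 'CHECK'}")
```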

Credentials

The following environment variables are used at runtime:

  • `LMCACHE_CONFIG_FILE`: Path to YAML configuration file (primary config mechanism).
  • `LMCACHE_FORCE_SKIP_SAVE`: Set to any truthy value to skip all cache save operations.
  • `LMCACHE_OFFLOAD_RPC_PORT`: Port for offload RPC server (default: 100).
  • `PROMETHEUS_MULTIPROC_DIR`: Directory for Prometheus multiprocess metrics (default: `/tmp/lmcache_prometheus`).
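
As an illustration of how these variables might be consumed, the sketch below gathers them into one object. It is not LMCache's own parsing code: `RuntimeEnv` and `read_runtime_env` are hypothetical names, the defaults are the ones stated on this page, and the interpretation of "truthy" (any non-empty value other than `0`, `false`, `no`, or `off`) is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed set of strings treated as "not truthy"; LMCache may parse differently.
_FALSY = {"", "0", "false", "no", "off"}


@dataclass(frozen=True)
class RuntimeEnv:
    config_file: Optional[str]
    force_skip_save: bool
    offload_rpc_port: int
    prometheus_multiproc_dir: str


def read_runtime_env(environ: Mapping[str, str] = os.environ) -> RuntimeEnv:
    """Read the four variables listed above, applying the documented defaults."""
    raw_skip = environ.get("LMCACHE_FORCE_SKIP_SAVE", "").strip().lower()
    return RuntimeEnv(
        config_file=environ.get("LMCACHE_CONFIG_FILE"),
        force_skip_save=raw_skip not in _FALSY,
        # Default of 100 is the value documented on this page.
        offload_rpc_port=int(environ.get("LMCACHE_OFFLOAD_RPC_PORT", "100")),
        prometheus_multiproc_dir=environ.get(
            "PROMETHEUS_MULTIPROC_DIR", "/tmp/lmcache_prometheus"
        ),
    )


if __name__ == "__main__":
    print(read_runtime_env())
```

Accepting a mapping rather than reading `os.environ` directly keeps the function easy to exercise with a plain dictionary.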

Quick Install

# Install from PyPI
pip install lmcache

# Install from source (recommended)
pip install -r requirements/build.txt
pip install -e . --no-build-isolation

# For CUDA-specific extras
pip install -r requirements/cuda.txt

Code Evidence

Python version constraint from `pyproject.toml:42`:

requires-python = ">=3.10,<3.14"

NIXL Python version constraint from `requirements/common.txt:10`:

# if nixl decides to support >=3.13 in the future, we can remove this constraint
nixl; python_version < "3.13"

Numpy version constraint from `requirements/common.txt:12`:

# nixl uses numba which requires numpy<=2.2.6
numpy<=2.2.6

Environment variable usage from `lmcache/integration/vllm/utils.py:54`:

config_file = os.environ.get("LMCACHE_CONFIG_FILE")

Common Errors

| Error Message | Cause | Solution |
|---------------|-------|----------|
| `ImportError: nixl` | Python >= 3.13 or nixl not installed | Use Python < 3.13 or disable NIXL-dependent features |
| `ModuleNotFoundError: lmcache._version` | Package not installed via setuptools_scm | Install with `pip install -e .` or `pip install lmcache` |
| numpy version conflict | numpy > 2.2.6 installed | `pip install "numpy<=2.2.6"` |
| `ImportError: nvtx` | nvtx profiling library not installed | Optional; system falls back to dummy decorator automatically |

Compatibility Notes

  • Python 3.13: NIXL is not available, so NIXL-dependent features (Disaggregated Prefill, P2P sharing) do not function.
  • Torch version flexibility: The runtime does not pin torch, so the serving engine (vLLM, SGLang) controls the version. The build pins `torch==2.8.0` for wheel releases.
  • Docker: The Dockerfile may override torch version independently of `requirements/common.txt`.
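
A deployment script can decide up front which workflows to offer by probing for the `nixl` module. The sketch below is illustrative only: `has_module` and `usable_workflows` are hypothetical helpers, and the split into NIXL-dependent and NIXL-independent workflows is inferred from the notes on this page rather than taken from LMCache code.

```python
import importlib.util

# Assumed grouping, based on the compatibility note above.
BASE_WORKFLOWS = ["KV Cache Offloading", "CacheBlend KV Reuse"]
NIXL_WORKFLOWS = ["Disaggregated Prefill", "P2P KV Cache Sharing"]


def has_module(name):
    """True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


def usable_workflows(nixl_present=None):
    """List workflows expected to work, probing for nixl when not told."""
    if nixl_present is None:
        nixl_present = has_module("nixl")
    return BASE_WORKFLOWS + (NIXL_WORKFLOWS if nixl_present else [])


if __name__ == "__main__":
    for workflow in usable_workflows():
        print(workflow)
```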
