Implementation:LMCache LMCache Controller Observability

From Leeroopedia


Knowledge Sources
Domains Observability, Cache Controller
Last Updated 2026-02-09 00:00 GMT

Overview

Provides Prometheus-based metrics collection and socket message counting for monitoring the LMCache cache controller.

Description

This module defines two classes for cache controller observability. PrometheusLogger is a singleton that initializes and manages the controller's Prometheus Gauge metrics: KV pool key counts, registered worker counts, message counts and pending status for both the PULL and REPLY sockets, active request counts, sequence number discontinuity counts, and full sync progress. The gauges are created with configurable labels and support the livemostrecent multiprocess mode. SocketMetricsContext is a context manager that tracks message counts and active requests for a given socket type: it increments the counters on entry, decrements the active request count on exit, and logs an error if an exception is raised inside the block.
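The enter/exit behavior described for SocketMetricsContext can be illustrated with a minimal, standard-library-only sketch. This is not the LMCache implementation: DemoManager, DemoSocketMetricsContext, and the attribute names message_count and active_requests are hypothetical, and the choice to let exceptions propagate after logging them is an assumption.

```python
import logging
from enum import Enum

logger = logging.getLogger(__name__)


class SocketType(Enum):
    PULL = "pull"
    REPLY = "reply"


class DemoManager:
    """Hypothetical stand-in for the object passed as `manager`."""

    def __init__(self) -> None:
        # Attribute names are assumptions made for this sketch.
        self.message_count = {t: 0 for t in SocketType}
        self.active_requests = {t: 0 for t in SocketType}


class DemoSocketMetricsContext:
    """Counts messages on entry and releases the active request on exit."""

    def __init__(self, manager, socket_type: SocketType, message_count: int = 1) -> None:
        self.manager = manager
        self.socket_type = socket_type
        self.message_count = message_count

    def __enter__(self) -> "DemoSocketMetricsContext":
        self.manager.message_count[self.socket_type] += self.message_count
        self.manager.active_requests[self.socket_type] += 1
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> bool:
        # Runs on both normal and exceptional exit, so the gauge cannot leak.
        self.manager.active_requests[self.socket_type] -= 1
        if exc_type is not None:
            logger.error("Error on %s socket: %s", self.socket_type.value, exc_val)
        # Assumption: the exception is logged but not suppressed.
        return False


manager = DemoManager()
with DemoSocketMetricsContext(manager, SocketType.PULL, message_count=3):
    pass  # message handling would go here
```

Putting the decrement in `__exit__` means the active request count is released even when message handling raises, which is the main reason to use a context manager here rather than paired manual calls.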

Usage

Use PrometheusLogger.GetOrCreate at controller startup to initialize metrics with appropriate labels. Access the singleton later via GetInstance or GetInstanceOrNone. Use SocketMetricsContext around socket message processing loops to track throughput and active request counts per socket type.
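The accessor pattern named above (GetOrCreate, GetInstance, GetInstanceOrNone, DestroyInstance) can be sketched as follows. DemoLogger is a hypothetical class, not PrometheusLogger itself. Two behaviors are assumptions of this sketch rather than facts from the source: that GetInstance raises when no instance exists, and that later GetOrCreate calls return the existing instance and ignore their labels argument.

```python
from typing import Optional


class DemoLogger:
    """Hypothetical singleton mirroring the accessor names in the signature."""

    _instance: Optional["DemoLogger"] = None

    def __init__(self, labels: dict) -> None:
        self.labels = dict(labels)

    @staticmethod
    def GetOrCreate(labels: dict) -> "DemoLogger":
        # Create on first call; afterwards return the existing instance.
        if DemoLogger._instance is None:
            DemoLogger._instance = DemoLogger(labels)
        return DemoLogger._instance

    @staticmethod
    def GetInstance() -> "DemoLogger":
        # Assumption: a missing instance is treated as a programming error.
        if DemoLogger._instance is None:
            raise RuntimeError("DemoLogger has not been created yet")
        return DemoLogger._instance

    @staticmethod
    def GetInstanceOrNone() -> Optional["DemoLogger"]:
        # Non-raising variant for code paths where metrics are optional.
        return DemoLogger._instance

    @staticmethod
    def DestroyInstance() -> None:
        DemoLogger._instance = None


first = DemoLogger.GetOrCreate({"controller_id": "ctrl-01"})
second = DemoLogger.GetOrCreate({"controller_id": "ignored"})
same_object = first is second
```

Under these assumptions, startup code calls GetOrCreate once with the real labels, hot paths call GetInstanceOrNone and skip metric updates when it returns None, and tests call DestroyInstance between cases.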

Code Reference

Source Location

Signature

class SocketType(Enum):
    PULL = "pull"
    REPLY = "reply"

class PrometheusLogger:
    def __init__(self, labels: dict) -> None: ...
    @staticmethod
    def GetOrCreate(labels: dict) -> "PrometheusLogger": ...
    @staticmethod
    def GetInstance() -> "PrometheusLogger": ...
    @staticmethod
    def GetInstanceOrNone() -> Optional["PrometheusLogger"]: ...
    @staticmethod
    def DestroyInstance() -> None: ...
    @staticmethod
    def unregister_all_metrics() -> None: ...

class SocketMetricsContext:
    def __init__(self, manager, socket_type: SocketType, message_count: int = 1) -> None: ...
    def __enter__(self) -> "SocketMetricsContext": ...
    def __exit__(self, exc_type, exc_val, exc_tb) -> bool: ...

Import

from lmcache.v1.cache_controller.observability import (
    PrometheusLogger,
    SocketType,
    SocketMetricsContext,
)

I/O Contract

Inputs

Name Type Required Description
labels dict Yes Dictionary of label key-value pairs for Prometheus metric dimensions
manager object Yes Object whose attributes will be updated for socket counting (SocketMetricsContext)
socket_type SocketType Yes Which socket's metrics to update (PULL or REPLY)
message_count int No Number of messages to count per context entry (default: 1)

Outputs

Name Type Description
PrometheusLogger PrometheusLogger Singleton instance providing Prometheus gauge metrics for the controller
Prometheus Gauges prometheus_client.Gauge Individual metrics: kv_pool_keys_count, registered_workers_count, socket message/pending/active metrics, full sync metrics

Usage Examples

from lmcache.v1.cache_controller.observability import (
    PrometheusLogger,
    SocketMetricsContext,
    SocketType,
)

# Initialize at controller startup
prom = PrometheusLogger.GetOrCreate(labels={"controller_id": "ctrl-01"})

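# Set dynamic metric functions; kv_pool and worker_count stand in for
# state owned by the controller
prom.kv_pool_keys_count.set_function(lambda: len(kv_pool))
prom.registered_workers_count.set_function(lambda: worker_count)

# Track socket processing metrics; manager_instance, msg, and
# handle_pull_message are placeholders supplied by the caller
with SocketMetricsContext(manager_instance, SocketType.PULL):
    # Process a PULL socket message
    handle_pull_message(msg)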
