Environment:BerriAI Litellm Observability Stack

From Leeroopedia
Knowledge Sources
Domains: Infrastructure, Observability
Last Updated: 2026-02-15 16:00 GMT

Overview

Observability infrastructure environment for Prometheus metrics, OpenTelemetry tracing, and third-party logging integrations (LangSmith, Langfuse, DataDog).

Description

This environment covers the external observability services that LiteLLM can export telemetry to. The primary integrations are Prometheus for metrics (request counts, latency histograms, cost tracking), OpenTelemetry for distributed tracing (one span per LLM call), and third-party platforms such as LangSmith, Langfuse, and DataDog. The proxy server natively exposes a Prometheus `/metrics` endpoint and supports OTLP export configured through environment variables.

Usage

Use this environment when you deploy the LiteLLM proxy in production and need monitoring dashboards, distributed tracing, or cost analytics. It is required by the Health_Check and Integration_Handlers implementations.

System Requirements

| Category | Requirement | Notes |
|----------|-------------|-------|
| Software | Prometheus server | For scraping `/metrics` endpoint; 15-day retention recommended |
| Software | OpenTelemetry Collector (optional) | For OTLP trace/metric/log export |
| Network | HTTP access to proxy `/metrics` | Port 4000 by default |
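
Before pointing a Prometheus server at the proxy, it can help to confirm that `/metrics` is reachable and returns text in the Prometheus exposition format. The following is a minimal sketch using only the standard library. It assumes the proxy listens on `http://localhost:4000` and that the endpoint is readable without authentication; the helper names `fetch_metrics` and `metric_names`, and the metric names in `SAMPLE`, are made up for illustration and are not LiteLLM identifiers.

```python
import re
import urllib.request
from typing import Set

# A sample line starts with the metric name, optionally followed by {labels}.
_SAMPLE_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def metric_names(exposition_text: str) -> Set[str]:
    """Collect the metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE_NAME.match(line)
        if match:
            names.add(match.group(1))
    return names


def fetch_metrics(base_url: str = "http://localhost:4000", timeout: float = 5.0) -> str:
    """Download the raw /metrics payload (base_url is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/metrics", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Hypothetical payload used to exercise the parser without a running proxy.
SAMPLE = """\
# HELP example_requests_total Hypothetical counter, not a real LiteLLM metric.
# TYPE example_requests_total counter
example_requests_total{model="demo"} 3
example_latency_seconds_sum 0.42
"""
```

Against a running proxy, `metric_names(fetch_metrics())` would list whatever metric families the deployment actually exports.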

Dependencies

Python Packages

  • `prometheus-client` == 0.20.0 (for Prometheus metrics)
  • `opentelemetry-api` == 1.28.0 (for OTEL tracing)
  • `opentelemetry-sdk` == 1.28.0 (for OTEL SDK)
  • `langfuse` == 2.59.7 (optional, for Langfuse integration)
  • `ddtrace` == 2.19.0 (optional, for DataDog tracing)

Credentials

The following environment variables configure observability:

OpenTelemetry:

  • `OTEL_EXPORTER_OTLP_ENDPOINT`: OTLP collector endpoint URL.
  • `OTEL_EXPORTER_OTLP_HEADERS`: Authentication headers for OTLP export.
  • `OTEL_EXPORTER_OTLP_PROTOCOL`: Protocol (http/protobuf or grpc).
  • `OTEL_SERVICE_NAME`: Service name for traces.
  • `OTEL_ENVIRONMENT_NAME`: Environment tag for traces.
  • `OTEL_TRACER_NAME`: Custom tracer name (default: "litellm").
  • `DEBUG_OTEL`: Enable debug logging for OTEL integration.
  • `LITELLM_OTEL_INTEGRATION_ENABLE_METRICS`: Enable OTEL metrics export.
  • `LITELLM_OTEL_INTEGRATION_ENABLE_EVENTS`: Enable OTEL event logging.

Prometheus:

  • `PROMETHEUS_URL`: External Prometheus server URL for proxy dashboard queries.

Profiling:

  • `PYROSCOPE_APP_NAME`: Application name for Pyroscope continuous profiling.
  • `PYROSCOPE_SERVER_ADDRESS`: Pyroscope server URL.

Quick Install

# Install proxy with observability
pip install litellm prometheus-client opentelemetry-api opentelemetry-sdk

# Start Prometheus (Docker)
docker run -d --name prometheus -p 9090:9090 prom/prometheus

# Configure OTEL export
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
export OTEL_SERVICE_NAME="litellm-proxy"

Code Evidence

OpenTelemetry configuration from `litellm/integrations/opentelemetry.py:54-56`:

OTEL_TRACER_NAME = os.getenv("OTEL_TRACER_NAME", "litellm")
LITELLM_METER_NAME = os.getenv("LITELLM_METER_NAME", "litellm")
LITELLM_LOGGER_NAME = os.getenv("LITELLM_LOGGER_NAME", "litellm")

OTLP endpoint detection from `litellm/integrations/opentelemetry.py:104-127`:

OTEL_EXPORTER_OTLP_PROTOCOL = os.getenv("OTEL_EXPORTER_OTLP_PROTOCOL") or os.getenv("OTEL_EXPORTER")
OTEL_EXPORTER_OTLP_ENDPOINT = os.getenv("OTEL_EXPORTER_OTLP_ENDPOINT") or os.getenv("OTEL_ENDPOINT")
OTEL_EXPORTER_OTLP_HEADERS = os.getenv("OTEL_EXPORTER_OTLP_HEADERS") or os.getenv("OTEL_HEADERS")
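
The `or` chains above mean that an unset or empty standard variable falls back to the shorter legacy name. The sketch below reproduces that lookup order against an arbitrary mapping so it can be checked without touching the real environment. The helper names (`first_set`, `parse_headers`, `resolve_otel_settings`) are hypothetical and not part of LiteLLM; the header parsing assumes the comma-separated `key=value` form that the OpenTelemetry specification defines for `OTEL_EXPORTER_OTLP_HEADERS`.

```python
import os
from typing import Dict, Mapping, Optional


def first_set(env: Mapping[str, str], *names: str) -> Optional[str]:
    """Return the first non-empty value among the given variable names."""
    for name in names:
        value = env.get(name)
        if value:  # empty strings fall through, like `a or b`
            return value
    return None


def parse_headers(raw: Optional[str]) -> Dict[str, str]:
    """Split 'k1=v1,k2=v2' into a dict, ignoring malformed pairs."""
    headers: Dict[str, str] = {}
    if not raw:
        return headers
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            headers[key.strip()] = value.strip()
    return headers


def resolve_otel_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve protocol, endpoint, and headers with the legacy fallbacks."""
    return {
        "protocol": first_set(env, "OTEL_EXPORTER_OTLP_PROTOCOL", "OTEL_EXPORTER"),
        "endpoint": first_set(env, "OTEL_EXPORTER_OTLP_ENDPOINT", "OTEL_ENDPOINT"),
        "headers": parse_headers(
            first_set(env, "OTEL_EXPORTER_OTLP_HEADERS", "OTEL_HEADERS")
        ),
    }
```

Calling `resolve_otel_settings()` with no argument reads the process environment, which makes it a quick way to see which values a deployment would pick up.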

Prometheus import guard from `litellm/integrations/prometheus_services.py`:

try:
    from prometheus_client import ...
except ImportError:
    raise ImportError("Missing prometheus_client. Run `pip install prometheus-client`")
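
The same guard can be applied at deployment time to check for optional packages before the proxy starts. This is a rough sketch, not LiteLLM code: `require` is a hypothetical helper, and it relies only on `importlib.util.find_spec` from the standard library to detect whether a top-level module is installed.

```python
import importlib
import importlib.util


def require(module_name: str, pip_name: str):
    """Import a top-level module, or raise ImportError with an install hint."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"Missing {module_name}. Run `pip install {pip_name}`"
        )
    return importlib.import_module(module_name)
```

For example, `require("prometheus_client", "prometheus-client")` returns the module when it is installed and otherwise fails early with a message in the same style as the guard above.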

Common Errors

| Error Message | Cause | Solution |
|---------------|-------|----------|
| `Missing prometheus_client. Run pip install prometheus-client` | prometheus-client not installed | `pip install prometheus-client` |
| `OTEL export failed: connection refused` | OTLP collector unreachable | Verify `OTEL_EXPORTER_OTLP_ENDPOINT` and collector availability |
| `OpenTelemetry thread leak` | OTEL SDK thread accumulation | Ensure proper shutdown; LiteLLM includes thread leak prevention |
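
When an export fails with a connection error, a plain TCP check against the configured endpoint helps separate network problems from exporter misconfiguration. The following is a minimal sketch using the standard library; `endpoint_host_port` and `is_reachable` are hypothetical helpers. It only verifies that something accepts connections on the host and port, not that the listener speaks OTLP.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse


def endpoint_host_port(url: str) -> Tuple[str, int]:
    """Extract host and port from an endpoint URL such as http://localhost:4318."""
    parsed = urlparse(url)
    if not parsed.hostname:
        raise ValueError(f"No host found in endpoint URL: {url!r}")
    # Without an explicit port, fall back to the scheme default.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, DNS failures
        return False
```

A typical use would be `is_reachable(*endpoint_host_port(os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"]))`, run from the same host or container as the proxy so that the result reflects the proxy's own network view.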

Compatibility Notes

  • Prometheus: The proxy exposes metrics at the `/metrics` endpoint. Configure Prometheus to scrape this endpoint (see `prometheus.yml` in repo root).
  • DataDog: Requires `ddtrace` package and DataDog Agent running. Uses `dd_tracing` module with no-op fallback.
  • Pyroscope: Continuous profiling is Linux/macOS only (`pyroscope-io` does not support Windows).
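
For the Prometheus note above, a scrape job needs little more than a target and a metrics path. The sketch below renders a minimal configuration from Python. It is an illustration only and makes no claim about the contents of the repository's `prometheus.yml`; the function name `render_scrape_config`, the job name `litellm-proxy`, and the 15-second interval are assumptions.

```python
def render_scrape_config(
    target: str = "localhost:4000",
    job_name: str = "litellm-proxy",
    scrape_interval: str = "15s",
) -> str:
    """Return a minimal prometheus.yml body that scrapes one target's /metrics."""
    return (
        "global:\n"
        f"  scrape_interval: {scrape_interval}\n"
        "scrape_configs:\n"
        f"  - job_name: {job_name}\n"
        "    metrics_path: /metrics\n"
        "    static_configs:\n"
        f"      - targets: ['{target}']\n"
    )
```

Writing the result of `render_scrape_config()` to a file gives a starting point for a Prometheus configuration. If Prometheus itself runs in a container, `localhost` refers to that container, so the target would need to be an address from which the proxy is reachable.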
