Environment:BerriAI Litellm Observability Stack
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Observability |
| Last Updated | 2026-02-15 16:00 GMT |
Overview
Observability infrastructure environment for Prometheus metrics, OpenTelemetry tracing, and third-party logging integrations (LangSmith, Langfuse, DataDog).
Description
This environment covers the external observability services that LiteLLM can export telemetry data to. The primary integrations are Prometheus for metrics (request counts, latency histograms, cost tracking), OpenTelemetry for distributed tracing (spans for each LLM call), and various third-party platforms. The proxy server natively exposes a `/metrics` Prometheus endpoint and supports OTLP export via environment variables.
Usage
Use this environment when you deploy the LiteLLM proxy in production and need monitoring dashboards, distributed tracing, or cost analytics. Required by the Health_Check and Integration_Handlers implementations.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Software | Prometheus server | For scraping `/metrics` endpoint; 15-day retention recommended |
| Software | OpenTelemetry Collector (optional) | For OTLP trace/metric/log export |
| Network | HTTP access to proxy `/metrics` | Port 4000 by default |
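Before pointing Prometheus at the proxy, it can help to confirm that `/metrics` is reachable and returns text in the Prometheus exposition format. The following is a minimal sketch using only the standard library. The function names and the `http://localhost:4000` base URL are assumptions for illustration (the port follows the default noted in the table), and the parser handles only the common `name{labels} value` sample-line shape.

```python
from urllib.request import urlopen


def parse_metric_names(exposition_text: str) -> set[str]:
    """Return the metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{label="x"} value [timestamp]
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names


def fetch_metric_names(base_url: str = "http://localhost:4000") -> set[str]:
    """Fetch `/metrics` from the proxy (assumed base URL) and list metric names."""
    with urlopen(f"{base_url}/metrics", timeout=5) as resp:
        return parse_metric_names(resp.read().decode("utf-8"))
```

Calling `fetch_metric_names()` against a running proxy should return a non-empty set if the endpoint is exposed. An exception here usually points to a network or port problem rather than a Prometheus misconfiguration.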
Dependencies
Python Packages
- `prometheus-client` == 0.20.0 (for Prometheus metrics)
- `opentelemetry-api` == 1.28.0 (for OTEL tracing)
- `opentelemetry-sdk` == 1.28.0 (for OTEL SDK)
- `langfuse` == 2.59.7 (optional, for Langfuse integration)
- `ddtrace` == 2.19.0 (optional, for DataDog tracing)
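Because the versions above are exact pins, a quick check of what is actually installed can catch drift before the proxy starts. The sketch below compares the pins from this list against `importlib.metadata`. The helper names `installed_version` and `check_pins` are hypothetical and are not part of LiteLLM.

```python
from importlib import metadata

# Pins copied from the dependency list above; langfuse and ddtrace are optional.
PINNED = {
    "prometheus-client": "0.20.0",
    "opentelemetry-api": "1.28.0",
    "opentelemetry-sdk": "1.28.0",
    "langfuse": "2.59.7",
    "ddtrace": "2.19.0",
}


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins=PINNED, lookup=installed_version):
    """Map each package to 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for package, wanted in pins.items():
        found = lookup(package)
        if found is None:
            report[package] = "missing"
        elif found == wanted:
            report[package] = "ok"
        else:
            report[package] = f"mismatch (found {found})"
    return report


if __name__ == "__main__":
    for package, status in check_pins().items():
        print(f"{package}: {status}")
```

A `missing` status for `langfuse` or `ddtrace` is expected when those optional integrations are not in use.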
Credentials
The following environment variables configure observability:
OpenTelemetry:
- `OTEL_EXPORTER_OTLP_ENDPOINT`: OTLP collector endpoint URL.
- `OTEL_EXPORTER_OTLP_HEADERS`: Authentication headers for OTLP export.
- `OTEL_EXPORTER_OTLP_PROTOCOL`: Export protocol (`http/protobuf` or `grpc`).
- `OTEL_SERVICE_NAME`: Service name for traces.
- `OTEL_ENVIRONMENT_NAME`: Environment tag for traces.
- `OTEL_TRACER_NAME`: Custom tracer name (default: "litellm").
- `DEBUG_OTEL`: Enable debug logging for OTEL integration.
- `LITELLM_OTEL_INTEGRATION_ENABLE_METRICS`: Enable OTEL metrics export.
- `LITELLM_OTEL_INTEGRATION_ENABLE_EVENTS`: Enable OTEL event logging.
Prometheus:
- `PROMETHEUS_URL`: External Prometheus server URL for proxy dashboard queries.
Profiling:
- `PYROSCOPE_APP_NAME`: Application name for Pyroscope continuous profiling.
- `PYROSCOPE_SERVER_ADDRESS`: Pyroscope server URL.
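The OpenTelemetry variables listed above can be validated before the proxy starts. The sketch below reads them into a plain dictionary, rejects a protocol other than the two named in this section, and parses `OTEL_EXPORTER_OTLP_HEADERS` using the `key1=value1,key2=value2` form from the OpenTelemetry specification. The function names and the returned dictionary keys are assumptions for illustration. This is not how LiteLLM itself loads the values.

```python
import os


def parse_otlp_headers(raw: str) -> dict[str, str]:
    """Parse 'key1=value1,key2=value2' into a dict, skipping malformed pairs."""
    headers = {}
    for pair in raw.split(","):
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            headers[key.strip()] = value.strip()
    return headers


def load_otel_settings(env=os.environ) -> dict:
    """Collect the OTEL settings from an environment mapping."""
    protocol = env.get("OTEL_EXPORTER_OTLP_PROTOCOL")
    # Only the two protocols named in this section are accepted here.
    if protocol and protocol not in ("http/protobuf", "grpc"):
        raise ValueError(f"Unsupported OTLP protocol: {protocol!r}")
    return {
        "endpoint": env.get("OTEL_EXPORTER_OTLP_ENDPOINT"),
        "protocol": protocol,
        "headers": parse_otlp_headers(env.get("OTEL_EXPORTER_OTLP_HEADERS", "")),
        "service_name": env.get("OTEL_SERVICE_NAME"),
        # "litellm" is the documented default tracer name.
        "tracer_name": env.get("OTEL_TRACER_NAME", "litellm"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the check easy to run in tests or CI without touching the real environment.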
Quick Install
# Install proxy with observability
pip install litellm prometheus-client opentelemetry-api opentelemetry-sdk
# Start Prometheus (Docker)
docker run -d --name prometheus -p 9090:9090 prom/prometheus
# Configure OTEL export
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
export OTEL_SERVICE_NAME="litellm-proxy"
Code Evidence
OpenTelemetry configuration from `litellm/integrations/opentelemetry.py:54-56`:
OTEL_TRACER_NAME = os.getenv("OTEL_TRACER_NAME", "litellm")
LITELLM_METER_NAME = os.getenv("LITELLM_METER_NAME", "litellm")
LITELLM_LOGGER_NAME = os.getenv("LITELLM_LOGGER_NAME", "litellm")
OTLP endpoint detection from `litellm/integrations/opentelemetry.py:104-127`:
OTEL_EXPORTER_OTLP_PROTOCOL = os.getenv("OTEL_EXPORTER_OTLP_PROTOCOL") or os.getenv("OTEL_EXPORTER")
OTEL_EXPORTER_OTLP_ENDPOINT = os.getenv("OTEL_EXPORTER_OTLP_ENDPOINT") or os.getenv("OTEL_ENDPOINT")
OTEL_EXPORTER_OTLP_HEADERS = os.getenv("OTEL_EXPORTER_OTLP_HEADERS") or os.getenv("OTEL_HEADERS")
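The excerpt above shows a fallback pattern: the standard `OTEL_EXPORTER_OTLP_*` name is tried first, then a shorter name (`OTEL_EXPORTER`, `OTEL_ENDPOINT`, `OTEL_HEADERS`). The sketch below approximates that lookup order so it can be tested in isolation. `first_set` is a hypothetical helper, not a LiteLLM function.

```python
import os


def first_set(*names, env=os.environ):
    """Return the value of the first variable that is set and non-empty."""
    for name in names:
        value = env.get(name)
        # Like `a or b`, an empty string is treated as unset and falls through.
        if value:
            return value
    return None


if __name__ == "__main__":
    endpoint = first_set("OTEL_EXPORTER_OTLP_ENDPOINT", "OTEL_ENDPOINT")
    print(f"Resolved OTLP endpoint: {endpoint}")
```

One consequence of this ordering is that a stale short-name variable is silently ignored whenever the standard name is also set, which is worth remembering when a configuration change appears to have no effect.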
Prometheus import guard from `litellm/integrations/prometheus_services.py`:
try:
    from prometheus_client import ...
except ImportError:
    raise ImportError("Missing prometheus_client. Run `pip install prometheus-client`")
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `Missing prometheus_client. Run pip install prometheus-client` | prometheus-client not installed | `pip install prometheus-client` |
| `OTEL export failed: connection refused` | OTLP collector unreachable | Verify `OTEL_EXPORTER_OTLP_ENDPOINT` and collector availability |
| `OpenTelemetry thread leak` | OTEL SDK thread accumulation | Ensure proper shutdown; LiteLLM includes thread leak prevention |
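For the connection-refused case in the table above, a plain TCP probe can separate "collector is down or unreachable" from "collector is up but rejecting the payload". The sketch below is a minimal check under the assumption that the endpoint includes a scheme, for example `http://localhost:4318`. The function names are hypothetical.

```python
import socket
from urllib.parse import urlparse


def endpoint_host_port(endpoint: str):
    """Extract (host, port) from an endpoint URL that includes a scheme."""
    parsed = urlparse(endpoint)
    # Fall back to the scheme's conventional port when none is given.
    default_port = 443 if parsed.scheme == "https" else 80
    return parsed.hostname, parsed.port or default_port


def collector_reachable(endpoint: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint can be opened."""
    host, port = endpoint_host_port(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS failures.
        return False
```

A `True` result only shows that something is listening on the port. It does not confirm that the listener speaks OTLP or accepts the configured headers.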
Compatibility Notes
- Prometheus: The proxy exposes metrics at the `/metrics` endpoint. Configure Prometheus to scrape this endpoint (see `prometheus.yml` in the repo root).
- DataDog: Requires the `ddtrace` package and a running DataDog Agent. Uses the `dd_tracing` module, which has a no-op fallback.
- Pyroscope: Continuous profiling is Linux/macOS only (`pyroscope-io` does not support Windows).