Principle: BerriAI LiteLLM Integration Data Export
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| [[1]] | Observability, Telemetry Export | 2026-02-15 |
Overview
Integration data export is the process of transforming a standardized logging payload into a provider-specific format and transmitting it to an external monitoring platform.
Description
After the logging payload has been constructed in a canonical format, it must be exported to each registered observability backend. Each backend has its own data model, authentication scheme, transport protocol, and batching requirements. The integration data export layer bridges the gap between the internal standardized payload and each external platform's expectations.
Key challenges this pattern addresses:
- Protocol diversity -- OpenTelemetry uses gRPC or HTTP with protobuf; Prometheus uses a pull-based scrape model with counters, histograms, and gauges; Langsmith uses a REST API with batch POST requests; Datadog uses its own intake API.
- Schema mapping -- Each platform expects data organized differently. OpenTelemetry uses spans with attributes; Prometheus uses labeled metrics; Langsmith uses run objects with inputs/outputs.
- Batching and performance -- Some integrations (Langsmith, Datadog) benefit from batching multiple events into a single HTTP request. Others (Prometheus) require immediate metric updates. The export layer must accommodate both patterns.
- Credential resolution -- Per-request dynamic credentials may override global environment variables, enabling multi-tenant logging scenarios.
- Failure isolation -- A failure in one integration's export must not prevent other integrations from receiving data. Each handler runs independently.
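The credential-resolution point above can be sketched in a few lines. This is a minimal illustration, not LiteLLM's actual implementation; the `langsmith_api_key` parameter name and `LANGSMITH_API_KEY` variable are assumptions used only as examples of the per-request-overrides-global rule.

```python
import os

def resolve_credentials(dynamic_params: dict, env_var: str = "LANGSMITH_API_KEY") -> str:
    """Per-request credential wins over the global environment variable.

    `dynamic_params` and the `langsmith_api_key` key are illustrative names,
    not LiteLLM's exact parameter names.
    """
    return dynamic_params.get("langsmith_api_key") or os.environ.get(env_var, "")

# A tenant-supplied key overrides the deployment-wide one:
os.environ["LANGSMITH_API_KEY"] = "global-key"
```

This one lookup order is what makes multi-tenant logging possible: each request can carry its own destination credentials while unconfigured requests fall back to the deployment default.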
Usage
Apply integration data export when:
- Implementing a new observability backend handler that needs to receive LLM telemetry.
- Configuring which external platforms receive data from a LiteLLM deployment.
- Debugging why data is not appearing in a specific monitoring platform.
- Optimizing the performance of telemetry export (e.g., tuning batch sizes or flush intervals).
Theoretical Basis
Adapter Pattern
Each integration handler is an adapter that translates the universal StandardLoggingPayload into the platform's native format:
StandardLoggingPayload
|
+--[OpenTelemetryHandler]--> OTel Span with semantic attributes
|
+--[PrometheusHandler]-----> Counter/Histogram/Gauge metric updates
|
+--[LangsmithHandler]------> Run object with inputs, outputs, metadata
|
+--[DatadogHandler]--------> DD Log entry with tags and measures
|
+--[GenericAPIHandler]-----> HTTP POST with JSON body
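The fan-out above is the classic adapter pattern. A minimal sketch, with two adapters over one payload; the field names in `sample` are illustrative and do not reflect the real StandardLoggingPayload schema:

```python
from abc import ABC, abstractmethod

class IntegrationAdapter(ABC):
    """Translate the canonical payload into one platform's native shape."""
    @abstractmethod
    def translate(self, payload: dict) -> dict: ...

class PrometheusAdapter(IntegrationAdapter):
    def translate(self, payload: dict) -> dict:
        # Metric view: a label set plus numeric observations
        return {
            "labels": {"model": payload["model"], "status": payload["status"]},
            "observations": {"total_tokens": payload["total_tokens"]},
        }

class LangsmithAdapter(IntegrationAdapter):
    def translate(self, payload: dict) -> dict:
        # Event view: one run object per LLM call
        return {
            "name": payload["model"],
            "inputs": payload["messages"],
            "outputs": payload["response_text"],
        }

sample = {
    "model": "gpt-4o", "status": "success", "total_tokens": 42,
    "messages": [{"role": "user", "content": "hi"}], "response_text": "hello",
}
```

Each adapter reads only the fields its platform cares about; adding a new backend means adding a new adapter, not changing the canonical payload.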
Handler Dispatch (Pseudocode)
function dispatch_to_handlers(event_type, is_async, kwargs, response, start_time, end_time):
payload = kwargs["standard_logging_object"] -- StandardLoggingPayload
for handler in registered_callbacks:
try:
if event_type == "success":
if is_async:
await handler.async_log_success_event(kwargs, response, start_time, end_time)
else:
handler.log_success_event(kwargs, response, start_time, end_time)
elif event_type == "failure":
if is_async:
await handler.async_log_failure_event(kwargs, response, start_time, end_time)
else:
handler.log_failure_event(kwargs, response, start_time, end_time)
except Exception:
log_warning("Handler failed, continuing to next")
continue -- isolate failures
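The async branch of the pseudocode can be made concrete as below. A minimal sketch: the `async_log_success_event` / `async_log_failure_event` method names follow the pseudocode, and the two toy handlers exist only to demonstrate failure isolation.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

async def dispatch_to_handlers(event_type, kwargs, response, start_time, end_time, handlers):
    """Fan one event out to every handler; a failing handler is logged and skipped."""
    for handler in handlers:
        try:
            if event_type == "success":
                await handler.async_log_success_event(kwargs, response, start_time, end_time)
            elif event_type == "failure":
                await handler.async_log_failure_event(kwargs, response, start_time, end_time)
        except Exception:
            # Isolate failures: one broken exporter must not starve the others
            logger.warning("handler %s failed; continuing", type(handler).__name__)

class BrokenHandler:
    async def async_log_success_event(self, *args):
        raise RuntimeError("export failed")

class RecordingHandler:
    def __init__(self):
        self.events = []
    async def async_log_success_event(self, *args):
        self.events.append("success")

recorder = RecordingHandler()
asyncio.run(dispatch_to_handlers("success", {}, None, 0, 1, [BrokenHandler(), recorder]))
```

Even though `BrokenHandler` raises first, `recorder` still receives the event, which is the failure-isolation property the dispatch loop exists to guarantee.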
Batching Strategy
For high-throughput deployments, individual per-call exports create excessive network overhead. The batching strategy:
1. On each success event:
batch_queue.append(formatted_event)
if length(batch_queue) >= BATCH_SIZE:
flush_queue()
2. Periodic timer (every FLUSH_INTERVAL seconds):
if batch_queue is not empty:
flush_queue()
3. flush_queue():
acquire lock
batch = copy(batch_queue)
batch_queue.clear()
release lock
send_batch(batch) -- single HTTP request with all events
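The three steps above compose into one small class. A sketch assuming a thread-based deployment; the class name and defaults are illustrative, and note that copy-and-clear happens under the lock while the network send happens outside it, so a slow HTTP request never blocks producers.

```python
import threading

class BatchExporter:
    """Size- and time-triggered batching, per the strategy above (illustrative sketch)."""
    def __init__(self, send_batch, batch_size=100, flush_interval=5.0):
        self._send_batch = send_batch        # callable taking a list of events
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._queue = []
        self._lock = threading.Lock()
        self._schedule_flush()

    def add(self, event):
        with self._lock:
            self._queue.append(event)
            ready = len(self._queue) >= self._batch_size
        if ready:                            # size trigger
            self.flush()

    def flush(self):
        with self._lock:                     # copy-and-clear under the lock...
            batch, self._queue = self._queue, []
        if batch:
            self._send_batch(batch)          # ...network I/O outside it

    def _schedule_flush(self):               # time trigger
        timer = threading.Timer(self._flush_interval, self._tick)
        timer.daemon = True
        timer.start()

    def _tick(self):
        self.flush()
        self._schedule_flush()

sent = []
exporter = BatchExporter(sent.append, batch_size=2, flush_interval=3600)
exporter.add("e1"); exporter.add("e2"); exporter.add("e3")
```

After the three `add` calls, the size trigger has shipped `["e1", "e2"]` as one batch and `"e3"` waits in the queue for either the next size threshold or the periodic timer.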
Metric Export vs. Event Export
Two fundamentally different export models exist:
- Event-based (Langfuse, Langsmith, Datadog Logs): Each LLM call produces a discrete log entry or trace. Data is pushed.
- Metric-based (Prometheus): Each LLM call increments counters and updates histograms. Data is pulled by scraping.
Both models consume the same StandardLoggingPayload but extract different information from it.
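The two models can be contrasted in a single function that consumes one payload twice. A minimal sketch, with stdlib dicts standing in for a real metrics registry and HTTP outbox; the metric names and payload fields are illustrative.

```python
def export_both(payload: dict, metrics: dict, outbox: list) -> None:
    """Same payload, two views: counters for the pull model, a record for the push model."""
    # Metric-based: increment in-process counters; a scraper reads them later
    req_key = ("llm_requests_total", payload["model"])
    metrics[req_key] = metrics.get(req_key, 0) + 1
    tok_key = ("llm_tokens_total", payload["model"])
    metrics[tok_key] = metrics.get(tok_key, 0) + payload["total_tokens"]
    # Event-based: append a discrete record destined for a batched HTTP POST
    outbox.append({"model": payload["model"], "outputs": payload["response_text"]})

metrics, outbox = {}, []
export_both({"model": "gpt-4o", "total_tokens": 42, "response_text": "hi"}, metrics, outbox)
```

The metric path loses per-call detail but stays cheap under load; the event path keeps full inputs and outputs but must be batched and pushed. Both read from the same canonical payload.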