Principle:Arize ai Phoenix Application Instrumentation
| Knowledge Sources | |
|---|---|
| Domains | AI Observability, OpenTelemetry, LLM Instrumentation |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Instrumenting LLM applications with tracing captures the execution flow, timing, inputs, outputs, and metadata of every operation so that developers can observe, debug, and optimize their AI systems.
Description
Application instrumentation is the practice of inserting tracing hooks into application code so that each meaningful unit of work produces a span: a structured record of an operation's name, duration, attributes, and hierarchical relationship to other spans within a trace.
In the context of LLM observability, instrumentation serves several purposes:
- Visibility: Capturing the chain of LLM calls, retrieval steps, tool invocations, and agent decisions that comprise a complex AI workflow.
- Debugging: Allowing developers to inspect individual span attributes (e.g., the prompt sent to an LLM, the response received, token counts) when something goes wrong.
- Performance analysis: Measuring latency at each step to identify bottlenecks in multi-step pipelines.
- Cost tracking: Recording token usage per span to attribute costs to specific operations.
There are two complementary approaches to instrumentation:
Manual Instrumentation
The developer explicitly creates spans around code blocks using the OpenTelemetry tracing API:
# tracer_provider is assumed to be the provider returned by the initial registration call
tracer = tracer_provider.get_tracer(__name__)
with tracer.start_as_current_span("operation-name"):
    # ... application code ...
    pass
This approach offers fine-grained control over span boundaries, names, and attributes. It is suitable for custom application logic that is not covered by pre-built instrumentors.
Automatic Instrumentation
Pre-built instrumentor libraries hook into specific LLM frameworks (OpenAI, LangChain, LlamaIndex, etc.) and automatically create spans for every API call. These instrumentors are discovered at runtime through Python's entry point mechanism under the openinference_instrumentor group.
Automatic instrumentation requires no code changes beyond the initial registration call. It captures framework-specific attributes (model names, token counts, prompt templates) using the OpenInference semantic conventions.
Usage
Use this principle whenever:
- Building an LLM application that needs observability for debugging and monitoring.
- Deciding between manual and automatic instrumentation strategies.
- Wrapping custom business logic in spans so that it appears in the trace waterfall alongside auto-instrumented framework calls.
- Setting up a new project where multiple LLM libraries need to be traced consistently.
Theoretical Basis
The instrumentation model follows the OpenTelemetry span hierarchy:
Trace
  +-- Root Span (e.g., "user-request")
       +-- Child Span (e.g., "llm-call")
       |    +-- Attributes: model, input, output, token_count
       +-- Child Span (e.g., "retrieval")
       |    +-- Attributes: query, documents, scores
       +-- Child Span (e.g., "tool-call")
            +-- Attributes: tool_name, parameters, result
Each span is associated with a SpanContext containing a trace_id (shared across all spans in the trace) and a span_id (unique to the span). Parent-child relationships are established through the OpenTelemetry context propagation mechanism: when a new span is started "as current," it automatically becomes the parent of any spans created within its scope.
Manual vs. Automatic Tradeoffs
| Aspect | Manual Instrumentation | Automatic Instrumentation |
|---|---|---|
| Setup effort | Requires explicit code at each trace point | One-time registration with auto_instrument=True |
| Granularity | Full control over span names and attributes | Predetermined by the instrumentor library |
| Framework coverage | Works with any code | Limited to libraries with OpenInference instrumentors |
| Maintenance | Must be updated when application code changes | Instrumentor packages are updated independently of application code |
| Custom attributes | Added directly via the span API | Requires post-processing or custom instrumentor extensions |
Auto-Instrumentation Discovery
When auto-instrumentation is enabled, the system uses Python's importlib.metadata.entry_points() to discover all packages that register an entry point under the openinference_instrumentor group. Each discovered instrumentor class is instantiated and its instrument(tracer_provider=...) method is called, which monkey-patches the target library to produce spans automatically.
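The discovery loop described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the library's actual code: `instrument_all` and its `candidates` parameter are hypothetical, and `FakeInstrumentor` and `FakeEntryPoint` are stand-ins so the example runs without any instrumentor packages installed. The `entry_points(group=...)` call uses the signature available in Python 3.10 and later.

```python
from importlib.metadata import entry_points
from typing import Any, Iterable, List, Optional

GROUP = "openinference_instrumentor"


def instrument_all(
    tracer_provider: Any, candidates: Optional[Iterable[Any]] = None
) -> List[str]:
    """Load each registered instrumentor class and call instrument() on it.

    `candidates` is a hypothetical hook for passing entry points explicitly;
    by default they are discovered from installed packages.
    """
    if candidates is None:
        candidates = entry_points(group=GROUP)  # Python 3.10+ signature
    activated = []
    for entry_point in candidates:
        instrumentor_cls = entry_point.load()  # import the registered class
        instrumentor_cls().instrument(tracer_provider=tracer_provider)
        activated.append(entry_point.name)
    return activated


class FakeInstrumentor:
    """Stand-in for a real instrumentor; records the provider it received."""

    seen: List[Any] = []

    def instrument(self, tracer_provider: Any = None) -> None:
        FakeInstrumentor.seen.append(tracer_provider)


class FakeEntryPoint:
    """Stand-in for an importlib.metadata entry point."""

    name = "fake"

    def load(self) -> type:
        return FakeInstrumentor


activated = instrument_all("provider-placeholder", [FakeEntryPoint()])
print(activated)
```

Calling `instrument_all(tracer_provider)` without `candidates` would instead iterate over whatever instrumentor packages are installed in the environment.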