Principle:Arize ai Phoenix OTel Tracer Registration
| Knowledge Sources | |
|---|---|
| Domains | AI Observability, OpenTelemetry, Distributed Tracing |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Configuring an OpenTelemetry tracer provider for LLM observability establishes the central tracing context: it governs how spans are created, processed, and exported from an instrumented application to a telemetry collector.
Description
In the OpenTelemetry model, a TracerProvider is the entry point for all tracing operations. It holds configuration state including:
- Resource attributes: Metadata attached to every span produced by the provider, such as the project name that groups traces in the collector.
- SpanProcessors: Pipeline components that receive completed spans and forward them to exporters (either synchronously or in batches).
- Global registration: Optionally setting the provider as the process-wide default so that any library calling opentelemetry.trace.get_tracer() uses it automatically.
The tracer registration principle focuses on the one-time setup step that wires together the provider, processor, and exporter into a functioning pipeline. This is typically the first thing an application does at startup, before any traced work begins.
For LLM observability specifically, registration also involves:
- Associating a project name with the provider so that traces are routed to the correct project in the collector (Phoenix uses the openinference.project.name resource attribute).
- Choosing between simple and batch span processing based on throughput requirements.
- Optionally enabling auto-instrumentation to discover and activate all installed OpenInference library instrumentors (e.g., for OpenAI, LangChain, LlamaIndex) via Python entry points.
- Configuring authentication via API keys or custom headers for secured collector endpoints.
Usage
Use this principle whenever:
- Starting an application that needs to send LLM traces to a collector.
- Configuring tracing at the entry point of a web service, CLI tool, or notebook.
- Deciding between simple (synchronous, each span exported as soon as it ends) and batch (asynchronous, buffered for higher throughput) span processing strategies.
- Setting up auto-instrumentation to avoid manually instrumenting each LLM library call.
Theoretical Basis
The tracer registration process follows the OpenTelemetry SDK specification for provider initialization:
1. Create Resource with project_name attribute
2. Instantiate TracerProvider with Resource
3. Create SpanExporter targeting collector endpoint
4. Create SpanProcessor (Simple or Batch) wrapping the exporter
5. Add SpanProcessor to TracerProvider
6. (Optional) Set TracerProvider as global default
7. (Optional) Auto-instrument installed libraries
The distinction between SimpleSpanProcessor and BatchSpanProcessor reflects a tradeoff between how quickly spans become visible and how much export work runs on the application's own thread:
| Processor | Behavior | Best For |
|---|---|---|
| SimpleSpanProcessor | Exports each span synchronously as it completes | Development, debugging, low-volume workloads |
| BatchSpanProcessor | Buffers spans and exports in batches on a schedule | Production, high-volume workloads |
Global provider registration (via opentelemetry.trace.set_tracer_provider()) is the mechanism by which third-party instrumentation libraries obtain access to the configured provider without explicit dependency injection. When auto-instrumentation is enabled, the registration function discovers instrumentors through Python's entry_points mechanism under the openinference_instrumentor group.
Configuration Resolution Order
Parameters follow a precedence chain:
1. Explicit function arguments (highest priority)
2. Environment variables (PHOENIX_COLLECTOR_ENDPOINT, PHOENIX_PROJECT_NAME, etc.)
3. Built-in defaults (endpoint=http://localhost:6006, project_name="default")
This layered configuration approach allows the same code to run unchanged across development, staging, and production environments by varying only the environment variables.
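The precedence chain can be illustrated with a few lines of standard-library Python. This is a sketch of the lookup order only, not the library's actual resolution code; the resolve function and the two lookup tables are hypothetical, and only the two environment variables and defaults listed above are covered.

```python
import os

# Defaults and environment variable names as listed in this section.
DEFAULTS = {
    "endpoint": "http://localhost:6006",
    "project_name": "default",
}
ENV_VARS = {
    "endpoint": "PHOENIX_COLLECTOR_ENDPOINT",
    "project_name": "PHOENIX_PROJECT_NAME",
}


def resolve(name, explicit=None, environ=None):
    """Hypothetical resolver: explicit argument, then env var, then default."""
    environ = os.environ if environ is None else environ
    if explicit is not None:
        return explicit  # 1. explicit function argument wins
    from_env = environ.get(ENV_VARS[name])
    if from_env:
        return from_env  # 2. environment variable, if set and non-empty
    return DEFAULTS[name]  # 3. built-in default


# Simulated environments (plain dicts) keep the example deterministic.
staging_env = {"PHOENIX_PROJECT_NAME": "staging-app"}

print(resolve("project_name", explicit="cli-override", environ=staging_env))
print(resolve("project_name", environ=staging_env))
print(resolve("endpoint", environ=staging_env))
```

Passing a dictionary for environ keeps the sketch testable; real code would read os.environ, which is what the function does when environ is omitted.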