
Workflow:Arize AI Phoenix Trace Ingestion Pipeline

From Leeroopedia
Knowledge Sources
Domains AI_Observability, OpenTelemetry, Tracing
Last Updated 2026-02-14 06:00 GMT

Overview

End-to-end process for instrumenting LLM applications with OpenTelemetry-based tracing and ingesting spans into the Phoenix observability platform.

Description

This workflow covers the complete trace ingestion pipeline from client-side instrumentation to server-side persistence. It leverages the phoenix-otel package to configure OpenTelemetry trace providers with Phoenix-aware defaults, instruments LLM application code to emit spans with OpenInference semantic conventions, and transmits those spans to the Phoenix server via HTTP or gRPC protocols. The server decodes OTLP spans, extracts attributes (input/output values, model names, token counts), and persists them in the database for visualization and analysis.

Key capabilities:

  • Automatic instrumentation of popular LLM frameworks (OpenAI, LangChain, LlamaIndex, etc.)
  • Support for both HTTP/protobuf and gRPC transport protocols
  • Batch and simple span processing modes
  • OpenInference semantic conventions for LLM-specific attributes
  • Project-based organization of traces

Usage

Execute this workflow when you need to add observability to an LLM application. This is the foundational workflow for Phoenix: you have a running application that makes LLM calls (or uses frameworks like LangChain, LlamaIndex, DSPy) and you want to capture detailed traces of those interactions for debugging, performance analysis, or evaluation. The Phoenix server must be running and accessible at a known endpoint.

Execution Steps

Step 1: Install Dependencies

Install the required Phoenix packages for tracing. The arize-phoenix-otel package provides a lightweight wrapper around OpenTelemetry primitives with Phoenix-aware defaults. For auto-instrumentation of specific frameworks, install the corresponding openinference-instrumentation package.

Key considerations:

  • Use arize-phoenix-otel for the simplest setup experience
  • Install framework-specific instrumentors separately (e.g., openinference-instrumentation-openai for OpenAI)
  • The full arize-phoenix package includes the server; use sub-packages for client-only installations
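Following the considerations above, a minimal client-only setup might look like the following; the instrumentor packages shown are examples, and which ones you need depends on the frameworks your application actually uses.

```shell
# Client-side tracing only (does not pull in the Phoenix server)
pip install arize-phoenix-otel

# Framework-specific auto-instrumentation (install only what you use)
pip install openinference-instrumentation-openai
pip install openinference-instrumentation-langchain
```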

Step 2: Register OpenTelemetry Tracer Provider

Call the register() function from phoenix.otel to configure the OpenTelemetry tracer provider. This sets up the exporter endpoint, transport protocol, span processor, and project name. The function returns a TracerProvider that can be used to create tracers.

Key considerations:

  • Configure the endpoint to point to your Phoenix server (default: http://localhost:6006)
  • Set project_name to organize traces by application or use case
  • Choose protocol between "http/protobuf" (default) and "grpc"
  • Enable batch mode for production workloads to reduce network overhead
  • Optionally pass an api_key for authenticated Phoenix deployments
  • Enable auto_instrument to automatically instrument installed OpenInference libraries

Step 3: Instrument Application Code

Use the tracer to create spans around LLM operations. Each span captures the operation name, timing, and attributes following the OpenInference semantic conventions. Spans can be nested to represent parent-child relationships (e.g., a chain calling multiple LLMs).

Key considerations:

  • Use openinference_span_kind to classify spans (e.g., "llm", "chain", "retriever", "embedding", "tool")
  • Set input and output values on spans using helper methods
  • Record LLM-specific attributes like model name, token counts, and status
  • Span context (trace_id, span_id) is automatically propagated through the application
  • For framework auto-instrumentation, spans are created automatically by the instrumentor
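The manual-instrumentation pattern above can be sketched as follows. This assumes a tracer obtained from `register()`; the `openinference_span_kind` argument and the `set_input`/`set_output` helpers come from the arize-phoenix-otel tracer API, and the function, model name, and completion are illustrative placeholders.

```python
from phoenix.otel import register

tracer = register(project_name="my-llm-app").get_tracer(__name__)

def answer(question: str) -> str:
    # Outer "chain" span representing the whole operation.
    with tracer.start_as_current_span(
        "answer", openinference_span_kind="chain"
    ) as chain_span:
        chain_span.set_input(question)
        # Nested "llm" span; parent-child linkage is handled automatically
        # via the active span context.
        with tracer.start_as_current_span(
            "generate", openinference_span_kind="llm"
        ) as llm_span:
            llm_span.set_attribute("llm.model_name", "gpt-4o-mini")  # placeholder
            completion = "..."  # call your LLM client here
            llm_span.set_output(completion)
        chain_span.set_output(completion)
        return completion
```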

Step 4: Transmit Spans to Phoenix Server

Spans are automatically exported to the Phoenix server by the configured span processor. In SimpleSpanProcessor mode, spans are sent immediately upon completion. In BatchSpanProcessor mode, spans are batched and sent periodically for better throughput.

Key considerations:

  • HTTP exporter sends to the /v1/traces endpoint
  • gRPC exporter uses port 4317 by default
  • Both exporters support TLS configuration
  • Authentication is handled via API key headers
  • Failed exports are retried with exponential backoff
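`register()` wires up the exporter and processor internally, but the same pieces can be assembled by hand with the OpenTelemetry SDK. A sketch, assuming the opentelemetry-sdk and OTLP HTTP exporter packages are installed; the API key header is a placeholder for authenticated deployments only.

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# HTTP/protobuf exporter targeting the Phoenix /v1/traces endpoint.
exporter = OTLPSpanExporter(
    endpoint="http://localhost:6006/v1/traces",
    headers={"authorization": "Bearer <api-key>"},  # omit for unauthenticated servers
)

# BatchSpanProcessor buffers completed spans and flushes them periodically,
# trading a little latency for much lower network overhead.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
```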

Step 5: Server-Side Ingestion and Persistence

The Phoenix server receives OTLP span data, decodes it from protobuf format, extracts OpenInference attributes, and persists the spans in the database (SQLite or PostgreSQL). Traces become immediately available for querying via the GraphQL API and visualization in the Phoenix UI.

Key considerations:

  • The server supports both gRPC and HTTP ingestion endpoints
  • OTLP spans are decoded to Phoenix internal span schema
  • Attributes are flattened and indexed for efficient querying
  • Traces are organized by project name (from resource attributes)
  • The UI updates in near-real-time as new spans arrive
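The attribute flattening mentioned above can be illustrated with a small, self-contained sketch. This is not Phoenix's actual implementation; it only shows the idea of collapsing nested OpenInference attributes into dotted keys suitable for indexing.

```python
def flatten(attrs: dict, prefix: str = "") -> dict:
    """Recursively flatten nested attribute dicts into dotted keys."""
    flat = {}
    for key, value in attrs.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{dotted}."))
        else:
            flat[dotted] = value
    return flat

# Example span attributes, shaped like OpenInference conventions.
span_attrs = {
    "llm": {
        "model_name": "gpt-4o-mini",
        "token_count": {"prompt": 12, "completion": 48},
    },
    "input": {"value": "What is Phoenix?"},
}

print(flatten(span_attrs))
# e.g. {"llm.model_name": "gpt-4o-mini", "llm.token_count.prompt": 12, ...}
```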

GitHub URL

Workflow Repository