Principle:Confident ai Deepeval OpenAI Agents Instrumentation
| Knowledge Sources | |
|---|---|
| Domains | |
| Last Updated | 2026-02-14 09:00 GMT |
Overview
A design principle for instrumenting the OpenAI Agents SDK via tracing processors. The SDK exposes a TracingProcessor interface that receives span lifecycle events (start and end) during agent execution, enabling external systems to capture and analyze execution traces.
Description
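The OpenAI Agents SDK provides a built-in tracing infrastructure in which agent execution is decomposed into spans representing discrete operations (model calls, tool invocations, handoffs, guardrail checks). External tracing processors are registered with the SDK's tracing module (add_trace_processor adds a processor alongside the defaults; set_trace_processors replaces them) rather than through RunConfig, which only carries per-run tracing settings such as the workflow name or whether tracing is disabled. Registered processors are notified when spans are created and completed.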
DeepEval's integration with the OpenAI Agents SDK implements the TracingProcessor interface to:
- Capture span lifecycle events -- receive on_span_start and on_span_end callbacks for every span in the agent execution.
- Build hierarchical traces -- reconstruct the full execution tree from individual span events, preserving parent-child relationships.
- Enable evaluation -- translate the captured spans into DeepEval's trace format for metric evaluation.
This approach differs from both callback-based (LangChain) and OTEL-based (PydanticAI) instrumentation in that it uses the OpenAI Agents SDK's own tracing abstraction as the integration point.
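The hierarchy-building step described above can be illustrated with a minimal sketch. It assumes each captured span has been reduced to a flat record carrying its own ID and its parent's ID; the SpanRecord class, build_tree, render, and the sample span names are hypothetical and are not taken from DeepEval or the SDK.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpanRecord:
    """Hypothetical flat record captured from one span event."""
    span_id: str
    parent_id: Optional[str]
    name: str
    children: list["SpanRecord"] = field(default_factory=list)


def build_tree(records: list[SpanRecord]) -> list[SpanRecord]:
    """Link each record to its parent and return the root spans."""
    by_id = {r.span_id: r for r in records}
    roots = []
    for r in records:
        parent = by_id.get(r.parent_id) if r.parent_id else None
        if parent is None:
            roots.append(r)  # no parent, or the parent was never captured
        else:
            parent.children.append(r)
    return roots


def render(node: SpanRecord, depth: int = 0) -> list[str]:
    """Return an indented, depth-first listing of the tree."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Illustrative records, in the order the events might arrive.
records = [
    SpanRecord("s1", None, "agent:triage"),
    SpanRecord("s2", "s1", "generation"),
    SpanRecord("s3", "s1", "tool:lookup_order"),
    SpanRecord("s4", "s1", "handoff:billing"),
]
roots = build_tree(records)
tree_lines = render(roots[0])
print("\n".join(tree_lines))
```

Treating a span whose parent is unknown as a root keeps the reconstruction tolerant of missing events, at the cost of possibly producing more than one tree.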
Usage
OpenAI Agents instrumentation is used when:
- An agent built with the OpenAI Agents SDK needs to be evaluated using DeepEval metrics.
- Developers want to capture execution traces from OpenAI Agents for analysis.
- The SDK's own tracing processor interface is preferred as the integration point over callbacks or OTEL exporters, because it maps directly onto the OpenAI Agents execution model.
The general pattern is:
OPENAI_AGENTS_INSTRUMENTATION(agent A):
    1. CREATE a tracing processor implementing the TracingProcessor interface
    2. REGISTER the processor with the SDK's tracing module (add_trace_processor(processor))
    3. On each span start event:
        a. RECORD span metadata (type, name, parent span)
        b. BEGIN tracking the span in the trace hierarchy
    4. On each span end event:
        a. CAPTURE span results and duration
        b. FINALIZE the span in the trace hierarchy
    5. On execution completion (trace end event):
        a. ASSEMBLE the complete trace from accumulated spans
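As a rough illustration of steps 1 and 3 to 5, the sketch below implements a recording processor in plain Python. It is not DeepEval's implementation: FakeSpan is a stand-in for an SDK span, its field names are assumptions, and registration with the SDK (step 2) is left out so the example runs without the SDK installed.

```python
import time
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeSpan:
    """Stand-in for an SDK span; the field names are illustrative."""
    span_id: str
    parent_id: Optional[str]
    kind: str
    name: str
    output: Any = None


class RecordingProcessor:
    """Observes span events and accumulates them without mutating spans."""

    def __init__(self) -> None:
        self.open_spans: dict[str, dict] = {}
        self.finished: list[dict] = []
        self.trace: Optional[dict] = None

    def on_span_start(self, span: FakeSpan) -> None:
        # Step 3: record metadata and begin tracking the span.
        self.open_spans[span.span_id] = {
            "id": span.span_id,
            "parent": span.parent_id,
            "type": span.kind,
            "name": span.name,
            "started": time.perf_counter(),
        }

    def on_span_end(self, span: FakeSpan) -> None:
        # Step 4: capture the result and duration, then finalize.
        entry = self.open_spans.pop(span.span_id, None)
        if entry is None:
            return  # end event without a matching start; ignore it
        entry["duration_s"] = time.perf_counter() - entry.pop("started")
        entry["output"] = span.output
        self.finished.append(entry)

    def on_trace_end(self) -> None:
        # Step 5: assemble the trace from the accumulated spans.
        self.trace = {"spans": list(self.finished)}


# Simulated event sequence: an agent span that wraps one tool call.
processor = RecordingProcessor()
agent_span = FakeSpan("s1", None, "agent", "support_agent")
tool_span = FakeSpan("s2", "s1", "tool", "lookup_order")

processor.on_span_start(agent_span)
processor.on_span_start(tool_span)
tool_span.output = "order 123: shipped"  # set by the simulated runtime
processor.on_span_end(tool_span)
processor.on_span_end(agent_span)
processor.on_trace_end()
```

Spans appear in the assembled trace in completion order, so the tool span precedes the agent span that contains it; the parent field is what preserves the hierarchy.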
Theoretical Basis
This principle is grounded in:
- Tracing processor pattern -- a pipeline pattern where processors receive span events and can transform, filter, or forward them. This is analogous to middleware chains in web frameworks.
- Span lifecycle management -- spans follow a well-defined lifecycle (creation, execution, completion) that processors observe without modifying. This ensures instrumentation does not alter agent behavior.
The tracing processor pattern provides a clean separation between execution and observation: the agent SDK manages span creation and lifecycle, while the processor purely observes and records. This minimizes the risk of instrumentation side effects.
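The separation between execution and observation can be sketched as a fan-out that forwards each event to every registered observer while isolating their failures. This is an assumption-laden illustration rather than the SDK's actual dispatch code: FanOut, Collector, and Broken are hypothetical names, and spans are modeled as plain dictionaries.

```python
from typing import Protocol


class SpanObserver(Protocol):
    def on_span_end(self, span: dict) -> None: ...


class FanOut:
    """Forwards each event to every observer, in registration order."""

    def __init__(self) -> None:
        self._observers: list[SpanObserver] = []
        self.errors: list[str] = []

    def add(self, observer: SpanObserver) -> None:
        self._observers.append(observer)

    def on_span_end(self, span: dict) -> None:
        for observer in self._observers:
            try:
                # Pass a copy so one observer cannot alter what others see.
                observer.on_span_end(dict(span))
            except Exception as exc:
                # A failing observer must not break the agent run.
                self.errors.append(f"{type(observer).__name__}: {exc}")


class Collector:
    """Well-behaved observer that only records span names."""

    def __init__(self) -> None:
        self.names: list[str] = []

    def on_span_end(self, span: dict) -> None:
        self.names.append(span["name"])


class Broken:
    """Misbehaving observer: tampers with its input, then fails."""

    def on_span_end(self, span: dict) -> None:
        span["name"] = "tampered"
        raise RuntimeError("export failed")


fan_out = FanOut()
collector = Collector()
fan_out.add(Broken())
fan_out.add(collector)

span = {"name": "tool:lookup_order"}
fan_out.on_span_end(span)
```

Even with a misbehaving observer registered first, the original span and the second observer's view of it stay intact, which is the property the principle relies on.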