Principle:Openai Openai agents python Guardrail Execution
Overview
Guardrail execution describes the runtime pipeline that the OpenAI Agents SDK uses to automatically invoke configured guardrails at the appropriate lifecycle points and handle their results. The framework manages the timing, ordering, and error handling of guardrail invocations so that developers only need to define guardrail functions and attach them to agents or tools.
Core Theory
Lifecycle Points
Guardrails execute at four distinct points in the agent run lifecycle:
- Agent input guardrails -- Run on the first turn only, for the starting agent only. They validate the user's input before or in parallel with the first LLM call.
- Tool input guardrails -- Run before each tool invocation, after the LLM has decided to call a tool but before the tool function executes.
- Tool output guardrails -- Run after each tool invocation, after the tool function has returned but before the output is sent back to the LLM.
- Agent output guardrails -- Run after the final agent produces its terminal output.
Input Guardrail Execution
When a run begins, the framework checks the starting agent's input_guardrails list. For each guardrail:
- If run_in_parallel=True (the default), the guardrail is launched concurrently with the first model call. Both the guardrail and the model run simultaneously.
- If run_in_parallel=False, the guardrail executes first. The model call is blocked until all sequential guardrails complete.
After all input guardrails finish, the framework inspects their results. If any guardrail set tripwire_triggered=True, an InputGuardrailTripwireTriggered exception is raised immediately. The model response (if it has completed) is discarded.
Input guardrail results are captured in RunResult.input_guardrail_results, making them available for inspection after the run completes (even in the success case).
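The scheduling described above can be modelled with plain asyncio. This is a simplified sketch, not the SDK's implementation: GuardrailOutcome, TripwireTriggered, and run_first_turn are hypothetical stand-ins, and the choice to check blocking guardrails before starting the model call at all is an assumption of the sketch.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class GuardrailOutcome:
    """Hypothetical stand-in for an input guardrail result."""
    name: str
    tripwire_triggered: bool


class TripwireTriggered(Exception):
    """Hypothetical stand-in for InputGuardrailTripwireTriggered."""

    def __init__(self, outcome):
        super().__init__(f"guardrail {outcome.name!r} tripped")
        self.guardrail_result = outcome


def raise_if_tripped(outcomes):
    for outcome in outcomes:
        if outcome.tripwire_triggered:
            raise TripwireTriggered(outcome)


async def run_first_turn(user_input, guardrails, call_model):
    """guardrails is a list of (async check, run_in_parallel) pairs."""
    blocking = [check for check, parallel in guardrails if not parallel]
    concurrent = [check for check, parallel in guardrails if parallel]

    # Blocking guardrails complete before the model call is started.
    outcomes = [await check(user_input) for check in blocking]
    raise_if_tripped(outcomes)

    # Parallel guardrails run alongside the model call.
    model_task = asyncio.create_task(call_model(user_input))
    outcomes += await asyncio.gather(*(check(user_input) for check in concurrent))
    try:
        raise_if_tripped(outcomes)
    except TripwireTriggered:
        model_task.cancel()  # the model response is discarded
        raise
    response = await model_task
    return response, outcomes


async def length_check(text):
    return GuardrailOutcome("length", tripwire_triggered=len(text) > 20)


async def banned_word_check(text):
    await asyncio.sleep(0)  # yield so the model task can make progress
    return GuardrailOutcome("banned_word", tripwire_triggered="forbidden" in text)


async def fake_model(text):
    await asyncio.sleep(0)
    return f"echo: {text}"


GUARDRAILS = [(length_check, False), (banned_word_check, True)]

if __name__ == "__main__":
    response, outcomes = asyncio.run(run_first_turn("hello", GUARDRAILS, fake_model))
    print(response, [o.name for o in outcomes])
```

In this model a blocking guardrail that trips prevents the model call from ever starting, while a parallel guardrail that trips only causes the in-flight response to be thrown away. That is the trade-off between the two modes: lower latency versus no wasted model call.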
Tool Guardrail Execution
When the LLM produces a tool call, the framework processes it through the tool guardrail pipeline:
- Input guardrails first -- Each ToolInputGuardrail attached to the tool receives a ToolInputGuardrailData containing the tool context (with call arguments) and the agent. If any input guardrail returns reject_content, the tool is not executed and the rejection message is sent to the LLM as the tool result. If any returns raise_exception, a ToolInputGuardrailTripwireTriggered exception halts the run.
- Tool execution -- If all input guardrails allow, the tool function runs.
- Output guardrails next -- Each ToolOutputGuardrail receives a ToolOutputGuardrailData containing the context, agent, and the tool's return value. If any output guardrail returns reject_content, the rejection message replaces the actual tool output in what the LLM sees. If any returns raise_exception, a ToolOutputGuardrailTripwireTriggered exception halts the run.
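The three stages above can be sketched as a single synchronous function. This is an illustrative model only: Verdict, ToolGuardrailTripwire, and call_tool_with_guardrails are hypothetical names, the guardrails here receive bare arguments or outputs rather than the SDK's data objects, and stopping at the first non-allow verdict is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Verdict:
    """Hypothetical stand-in for a tool guardrail's return value."""
    behavior: str  # "allow", "reject_content", or "raise_exception"
    message: Optional[str] = None


class ToolGuardrailTripwire(Exception):
    """Hypothetical stand-in for the tool guardrail tripwire exceptions."""


def call_tool_with_guardrails(tool, args, input_guardrails, output_guardrails):
    # 1. Input guardrails inspect the call arguments before the tool runs.
    for guard in input_guardrails:
        verdict = guard(args)
        if verdict.behavior == "raise_exception":
            raise ToolGuardrailTripwire(f"{guard.__name__} halted the run")
        if verdict.behavior == "reject_content":
            return verdict.message  # tool is skipped; the LLM sees this text

    # 2. Every input guardrail allowed the call, so the tool executes.
    output = tool(**args)

    # 3. Output guardrails inspect the return value before the LLM sees it.
    for guard in output_guardrails:
        verdict = guard(output)
        if verdict.behavior == "raise_exception":
            raise ToolGuardrailTripwire(f"{guard.__name__} halted the run")
        if verdict.behavior == "reject_content":
            return verdict.message  # replaces the real tool output
    return output


def block_secrets(args):
    if "sk-" in str(args):
        return Verdict("reject_content", "Tool call blocked: arguments contain a secret.")
    return Verdict("allow")


def no_ssn(output):
    if "SSN" in output:
        return Verdict("raise_exception")
    return Verdict("allow")


def lookup(query):
    return f"results for {query}"
```

For example, call_tool_with_guardrails(lookup, {"query": "cats"}, [block_secrets], [no_ssn]) returns the tool's own output, while a query containing "sk-" returns the rejection message without lookup ever being called.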
Output Guardrail Execution
After the final agent in the run chain produces its terminal response, the framework checks that agent's output_guardrails list. Each output guardrail receives the agent's output for validation. If any guardrail triggers its tripwire, an OutputGuardrailTripwireTriggered exception is raised.
Output guardrail results are captured in RunResult.output_guardrail_results.
Exception Types
The framework defines four specific exception types for guardrail failures, each carrying the guardrail result that triggered it:
- InputGuardrailTripwireTriggered -- Raised when an agent input guardrail's tripwire is triggered. Contains guardrail_result: InputGuardrailResult.
- OutputGuardrailTripwireTriggered -- Raised when an agent output guardrail's tripwire is triggered. Contains guardrail_result: OutputGuardrailResult.
- ToolInputGuardrailTripwireTriggered -- Raised when a tool input guardrail returns raise_exception. Contains guardrail_result: ToolInputGuardrailResult.
- ToolOutputGuardrailTripwireTriggered -- Raised when a tool output guardrail returns raise_exception. Contains guardrail_result: ToolOutputGuardrailResult.
All four exceptions inherit from AgentsException and can be caught individually or as a group.
Error Handling Strategy
The exception-based design enables a clear error handling pattern:
- Catch specific exceptions to handle different guardrail failures differently (e.g., log tool guardrail failures but alert on agent guardrail failures).
- Catch the base AgentsException to handle all guardrail failures uniformly.
- Let exceptions propagate to halt the application when guardrail failures indicate a serious problem.
The guardrail result attached to each exception provides diagnostic information about what triggered the failure, enabling logging, metrics, and debugging.
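A minimal sketch of this pattern follows. To keep it runnable without the SDK installed, the exception classes are local stand-ins that mirror the hierarchy described above; in real code they would be imported from agents.exceptions. The _Tripwire helper base and the run_guarded function are hypothetical, as is the policy of logging tool failures and alerting on agent failures.

```python
class AgentsException(Exception):
    """Stand-in for agents.exceptions.AgentsException."""


class _Tripwire(AgentsException):
    """Hypothetical helper so each stand-in carries a guardrail_result."""

    def __init__(self, guardrail_result):
        super().__init__(type(self).__name__)
        self.guardrail_result = guardrail_result


class InputGuardrailTripwireTriggered(_Tripwire):
    pass


class OutputGuardrailTripwireTriggered(_Tripwire):
    pass


class ToolInputGuardrailTripwireTriggered(_Tripwire):
    pass


class ToolOutputGuardrailTripwireTriggered(_Tripwire):
    pass


def run_guarded(run, log):
    """Call run() and record any guardrail failure in log; return None on failure."""
    try:
        return run()
    except (ToolInputGuardrailTripwireTriggered,
            ToolOutputGuardrailTripwireTriggered) as exc:
        # Tool-level failures are only logged in this sketch.
        log.append(("tool_guardrail", exc.guardrail_result))
    except (InputGuardrailTripwireTriggered,
            OutputGuardrailTripwireTriggered) as exc:
        # Agent-level failures are escalated.
        log.append(("agent_guardrail_alert", exc.guardrail_result))
    except AgentsException as exc:
        # Fallback for anything else derived from the base class.
        log.append(("sdk_error", str(exc)))
    return None
```

The specific handlers must come before the AgentsException handler, since Python picks the first matching except clause. Note that a bare AgentsException handler also catches any other exception derived from that base class, not only guardrail tripwires.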
Result Capture
Agent-level guardrail results are recorded whether or not a tripwire triggers:
- RunResult.input_guardrail_results -- List of InputGuardrailResult from agent input guardrails.
- RunResult.output_guardrail_results -- List of OutputGuardrailResult from agent output guardrails.
These results include the guardrail name, the GuardrailFunctionOutput returned by the function, and whether the tripwire was triggered. This data is valuable for monitoring and auditing guardrail behavior in production.
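As a rough illustration of the monitoring use case, the sketch below aggregates guardrail results into per-guardrail counts. RecordedResult is a hypothetical flattened record holding only a name and the tripwire flag; it is not the SDK's result type, and mapping real result objects onto it is left to the reader.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class RecordedResult:
    """Hypothetical flattened view of one guardrail result."""
    guardrail_name: str
    tripwire_triggered: bool


def summarize(results):
    """Count how often each guardrail ran and how often it tripped."""
    ran = Counter(r.guardrail_name for r in results)
    tripped = Counter(r.guardrail_name for r in results if r.tripwire_triggered)
    # Counter returns 0 for names that never tripped.
    return {name: {"ran": ran[name], "tripped": tripped[name]} for name in ran}
```

Feeding such a summary into a metrics system makes it possible to spot guardrails that never trip (possibly too lenient) or trip constantly (possibly too strict).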
Key Source References
- Internal guardrail execution functions: src/agents/run_internal/guardrails.py, lines 102-168
- Exception definitions: src/agents/exceptions.py, lines 78-119
Import
from agents.exceptions import (
    InputGuardrailTripwireTriggered,
    OutputGuardrailTripwireTriggered,
    ToolInputGuardrailTripwireTriggered,
    ToolOutputGuardrailTripwireTriggered,
)
See Also
- Implementation:Openai_Openai_agents_python_Run_Guardrails
- Tool Input Guardrail Definition -- Pre-execution validation on tool inputs
- Tool Output Guardrail Definition -- Post-execution validation on tool outputs
- Agent Level Guardrail Definition -- Guardrails on agent input and output
- Guardrail Attachment -- How to wire guardrails to tools and agents