Principle:Openai Openai agents python Guardrail Execution

From Leeroopedia

Overview

Guardrail execution describes the runtime pipeline that the OpenAI Agents SDK uses to automatically invoke configured guardrails at the appropriate lifecycle points and handle their results. The framework manages the timing, ordering, and error handling of guardrail invocations so that developers only need to define guardrail functions and attach them to agents or tools.

Core Theory

Lifecycle Points

Guardrails execute at four distinct points in the agent run lifecycle:

  1. Agent input guardrails -- Run on the first turn only, for the starting agent only. They validate the user's input before or in parallel with the first LLM call.
  2. Tool input guardrails -- Run before each tool invocation, after the LLM has decided to call a tool but before the tool function executes.
  3. Tool output guardrails -- Run after each tool invocation, after the tool function has returned but before the output is sent back to the LLM.
  4. Agent output guardrails -- Run after the final agent produces its terminal output.

Input Guardrail Execution

When a run begins, the framework checks the starting agent's input_guardrails list. For each guardrail:

  • If run_in_parallel=True (the default), the guardrail is launched concurrently with the first model call, so neither waits for the other.
  • If run_in_parallel=False, the guardrail executes first. The model call is blocked until all sequential guardrails complete.

Once the input guardrails have finished, the framework inspects their results. If any result has tripwire_triggered=True, an InputGuardrailTripwireTriggered exception is raised, and any model response that has already completed is discarded.
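The scheduling rule for input guardrails can be modeled with plain asyncio. This is a minimal sketch, not the SDK's implementation: GuardrailSpec, TripwireTriggered, run_first_turn, and the model_call parameter are hypothetical stand-ins, and only run_in_parallel mirrors a name from the text. It assumes the first tripped guardrail is the one reported.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Tuple


class TripwireTriggered(Exception):
    """Hypothetical stand-in for InputGuardrailTripwireTriggered."""


@dataclass
class GuardrailSpec:
    """Hypothetical guardrail; `check` resolves to True when the tripwire fires."""
    name: str
    check: Callable[[str], Awaitable[bool]]
    run_in_parallel: bool = True  # the default named in the text


async def run_first_turn(
    user_input: str,
    guardrails: List[GuardrailSpec],
    model_call: Callable[[str], Awaitable[str]],
) -> Tuple[str, List[Tuple[str, bool]]]:
    blocking = [g for g in guardrails if not g.run_in_parallel]
    parallel = [g for g in guardrails if g.run_in_parallel]

    # Blocking guardrails complete before the model is called at all.
    results = [(g.name, await g.check(user_input)) for g in blocking]
    tripped = [name for name, fired in results if fired]
    if tripped:
        raise TripwireTriggered(tripped[0])

    # Parallel guardrails are started alongside the model call.
    model_task = asyncio.create_task(model_call(user_input))
    fired_flags = await asyncio.gather(*(g.check(user_input) for g in parallel))
    results += [(g.name, fired) for g, fired in zip(parallel, fired_flags)]

    tripped = [name for name, fired in results if fired]
    if tripped:
        # Discard the model response, whether or not it has completed.
        model_task.cancel()
        await asyncio.gather(model_task, return_exceptions=True)
        raise TripwireTriggered(tripped[0])

    return await model_task, results
```

In this model a blocking guardrail that trips prevents the model call entirely, while a parallel one can only discard work that has already started. That is the latency versus cost trade-off behind the run_in_parallel flag.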

Input guardrail results are captured in RunResult.input_guardrail_results, making them available for inspection after the run completes (even in the success case).

Tool Guardrail Execution

When the LLM produces a tool call, the framework processes it through the tool guardrail pipeline:

  1. Input guardrails first -- Each ToolInputGuardrail attached to the tool receives a ToolInputGuardrailData containing the tool context (with call arguments) and the agent. If any input guardrail returns reject_content, the tool is not executed and the rejection message is sent to the LLM as the tool result. If any returns raise_exception, a ToolInputGuardrailTripwireTriggered exception halts the run.
  2. Tool execution -- If all input guardrails allow, the tool function runs.
  3. Output guardrails next -- Each ToolOutputGuardrail receives a ToolOutputGuardrailData containing the context, agent, and the tool's return value. If any output guardrail returns reject_content, the rejection message replaces the actual tool output in what the LLM sees. If any returns raise_exception, a ToolOutputGuardrailTripwireTriggered exception halts the run.
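The three steps above can be sketched as a single function. This is an illustrative model under stated assumptions, not the SDK's code: Verdict, ToolInputTripwire, ToolOutputTripwire, and run_tool_with_guardrails are hypothetical names, guardrails are assumed to run in list order, and the first non-allow verdict is assumed to decide the outcome. Only the behavior names reject_content and raise_exception come from the text.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Verdict:
    """Hypothetical guardrail verdict."""
    behavior: str = "allow"  # "allow" | "reject_content" | "raise_exception"
    message: Optional[str] = None


class ToolInputTripwire(Exception):
    """Hypothetical stand-in for ToolInputGuardrailTripwireTriggered."""


class ToolOutputTripwire(Exception):
    """Hypothetical stand-in for ToolOutputGuardrailTripwireTriggered."""


def run_tool_with_guardrails(
    tool: Callable[..., Any],
    args: Dict[str, Any],
    input_guardrails: List[Callable[[Dict[str, Any]], Verdict]],
    output_guardrails: List[Callable[[Any], Verdict]],
) -> Any:
    """Return what the LLM would see as the tool result."""
    # 1. Input guardrails see the call arguments before the tool runs.
    for guard in input_guardrails:
        verdict = guard(args)
        if verdict.behavior == "raise_exception":
            raise ToolInputTripwire(verdict.message)
        if verdict.behavior == "reject_content":
            return verdict.message  # the tool is never executed

    # 2. All input guardrails allowed the call, so the tool runs.
    output = tool(**args)

    # 3. Output guardrails see the return value before the LLM does.
    for guard in output_guardrails:
        verdict = guard(output)
        if verdict.behavior == "raise_exception":
            raise ToolOutputTripwire(verdict.message)
        if verdict.behavior == "reject_content":
            return verdict.message  # replaces the real output

    return output
```

The sketch highlights the difference between the two non-allow behaviors: reject_content keeps the run alive and lets the LLM react to the rejection message, while raise_exception ends the run.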

Output Guardrail Execution

After the final agent in the run chain produces its terminal response, the framework checks that agent's output_guardrails list. Each output guardrail receives the agent's output for validation. If any guardrail triggers its tripwire, an OutputGuardrailTripwireTriggered exception is raised.

Output guardrail results are captured in RunResult.output_guardrail_results.
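The output check can be modeled in a few lines. This is a rough sketch rather than the SDK's implementation: CheckResult, OutputTripwire, and check_final_output are hypothetical names, and running the guardrails concurrently is an assumption. The sketch shows results being collected for every guardrail and the triggering result travelling on the exception.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, List


@dataclass
class CheckResult:
    """Hypothetical result record: which guardrail ran and what it decided."""
    guardrail_name: str
    tripwire_triggered: bool
    info: Any = None


class OutputTripwire(Exception):
    """Hypothetical stand-in for OutputGuardrailTripwireTriggered."""

    def __init__(self, guardrail_result: CheckResult):
        super().__init__(guardrail_result.guardrail_name)
        self.guardrail_result = guardrail_result


async def check_final_output(
    final_output: Any,
    guardrails: List[Callable[[Any], Awaitable[CheckResult]]],
) -> List[CheckResult]:
    # Every guardrail validates the same terminal output.
    results = list(await asyncio.gather(*(g(final_output) for g in guardrails)))
    for result in results:
        if result.tripwire_triggered:
            # The exception carries the result that triggered it.
            raise OutputTripwire(result)
    # No tripwire fired: all results remain available for inspection.
    return results
```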

Exception Types

The framework defines four specific exception types for guardrail failures, each carrying the guardrail result that triggered it:

  • InputGuardrailTripwireTriggered -- Raised when an agent input guardrail's tripwire is triggered. Contains guardrail_result: InputGuardrailResult.
  • OutputGuardrailTripwireTriggered -- Raised when an agent output guardrail's tripwire is triggered. Contains guardrail_result: OutputGuardrailResult.
  • ToolInputGuardrailTripwireTriggered -- Raised when a tool input guardrail returns the raise_exception behavior. Contains guardrail_result: ToolInputGuardrailResult.
  • ToolOutputGuardrailTripwireTriggered -- Raised when a tool output guardrail returns the raise_exception behavior. Contains guardrail_result: ToolOutputGuardrailResult.

All four exceptions inherit from AgentsException and can be caught individually or as a group.

Error Handling Strategy

The exception-based design enables a clear error handling pattern:

  • Catch specific exceptions to handle different guardrail failures differently (e.g., log tool guardrail failures but alert on agent guardrail failures).
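  • Catch the base AgentsException to handle all guardrail failures uniformly. This also catches any other SDK error that derives from the same base, so check the exception type when guardrail failures need separate treatment.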
  • Let exceptions propagate to halt the application when guardrail failures indicate a serious problem.

The guardrail result attached to each exception provides diagnostic information about what triggered the failure, enabling logging, metrics, and debugging.
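The pattern above might look like the following. To keep the snippet runnable without the SDK installed, it defines local stand-ins that reuse the exception names from this page. The shared _GuardrailTripwire base and the handle_run helper are hypothetical and are not part of the SDK; in real code the four exceptions would be imported from agents.exceptions instead.

```python
from typing import Any, Callable, Tuple


class AgentsException(Exception):
    """Local stand-in for the SDK's base exception."""


class _GuardrailTripwire(AgentsException):
    """Hypothetical helper base that stores the triggering result."""

    def __init__(self, guardrail_result: Any):
        super().__init__(f"{type(self).__name__}: {guardrail_result!r}")
        self.guardrail_result = guardrail_result


class InputGuardrailTripwireTriggered(_GuardrailTripwire): ...
class OutputGuardrailTripwireTriggered(_GuardrailTripwire): ...
class ToolInputGuardrailTripwireTriggered(_GuardrailTripwire): ...
class ToolOutputGuardrailTripwireTriggered(_GuardrailTripwire): ...


def handle_run(run: Callable[[], Any]) -> Tuple[str, Any]:
    """Call `run` and map failures to a (status, detail) pair."""
    try:
        return ("ok", run())
    except (ToolInputGuardrailTripwireTriggered,
            ToolOutputGuardrailTripwireTriggered) as exc:
        # Tool-level failure: for example, log it and continue.
        return ("tool_guardrail", exc.guardrail_result)
    except (InputGuardrailTripwireTriggered,
            OutputGuardrailTripwireTriggered) as exc:
        # Agent-level failure: for example, raise an alert.
        return ("agent_guardrail", exc.guardrail_result)
    except AgentsException as exc:
        # Any other error derived from the base class lands here.
        return ("sdk_error", str(exc))
```

Ordering matters: the specific handlers come before the base-class handler, because Python uses the first except clause that matches.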

Result Capture

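Agent-level guardrail results are captured on the RunResult of every run that completes, including results from guardrails that did not trigger. When a tripwire does fire, the run raises instead, and the triggering result is available on the exception's guardrail_result. The RunResult fields are: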

  • RunResult.input_guardrail_results -- List of InputGuardrailResult from agent input guardrails.
  • RunResult.output_guardrail_results -- List of OutputGuardrailResult from agent output guardrails.

Each result identifies the guardrail that ran and carries the GuardrailFunctionOutput returned by its function, including whether the tripwire was triggered. This data is useful for monitoring and auditing guardrail behavior in production.

Key Source References

  • Internal guardrail execution functions: src/agents/run_internal/guardrails.py lines 102-168
  • Exception definitions: src/agents/exceptions.py lines 78-119

Import

from agents.exceptions import (
    InputGuardrailTripwireTriggered,
    OutputGuardrailTripwireTriggered,
    ToolInputGuardrailTripwireTriggered,
    ToolOutputGuardrailTripwireTriggered,
)
