Implementation:Openai Openai agents python Image Tool Output Pattern
| Knowledge Sources | |
|---|---|
| Domains | Tool_Integration, Multimodal, Design_Pattern |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
Demonstrates returning image data from a function tool using ToolOutputImage or ToolOutputImageDict, allowing the model to receive and describe images fetched by tools.
Description
The image_tool_output.py example showcases the SDK's support for multimodal tool outputs. When a function tool needs to return an image to the model (rather than plain text), it can use either ToolOutputImage (a typed class) or ToolOutputImageDict (a typed dictionary) to wrap the image URL along with a detail level parameter. The SDK automatically converts this output into the appropriate format for the model to process as visual input.
The example defines a fetch_random_image function tool decorated with @function_tool that returns either a ToolOutputImageDict (a dictionary with "type", "image_url", and "detail" keys) or a ToolOutputImage instance depending on a toggle flag. Both approaches produce the same result: the image URL is sent to the model, which can then describe or reason about the image content. The example uses a sample Unsplash image URL of a London cityscape.
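To illustrate why the two return shapes are interchangeable, here is a minimal stdlib-only sketch. `ImageOutput`, `ImageOutputDict`, and `normalize` are hypothetical stand-ins written for this illustration; they are not the SDK's classes or its internal conversion logic, which this page does not show. The URL is a placeholder.

```python
from dataclasses import dataclass
from typing import Literal, TypedDict, Union

Detail = Literal["auto", "low", "high"]


@dataclass
class ImageOutput:
    """Hypothetical stand-in for a typed class such as ToolOutputImage."""

    image_url: str
    detail: Detail = "auto"


class ImageOutputDict(TypedDict):
    """Hypothetical stand-in for a TypedDict such as ToolOutputImageDict."""

    type: Literal["image"]
    image_url: str
    detail: Detail


def normalize(output: Union[ImageOutput, ImageOutputDict]) -> dict:
    """Reduce either shape to one plain dict so later code sees a single format."""
    if isinstance(output, ImageOutput):
        return {"type": "image", "image_url": output.image_url, "detail": output.detail}
    return dict(output)


URL = "https://example.com/picture.jpg"  # placeholder, not the example's URL
as_class = normalize(ImageOutput(image_url=URL))
as_dict = normalize({"type": "image", "image_url": URL, "detail": "auto"})
print(as_class == as_dict)
```

Once both shapes reduce to the same dictionary, the choice between them is a matter of style: the class form gives attribute access and defaults, while the dict form is plain data.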
This pattern suits agents that work with visual content, such as image analysis tools, screenshot-based workflows, or any scenario where a tool fetches or generates images that the model must interpret.
Usage
Use this pattern when a function tool must return an image to the model for further processing, for example in image retrieval, screenshot capture, or chart and graph generation, where the model needs to see and describe the visual output of a tool invocation.
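For tools that produce image bytes locally (a screenshot or a rendered chart) rather than a hosted URL, one possible approach is to embed the bytes in a `data:` URL. The sketch below is stdlib-only, and `to_data_url` is a hypothetical helper. Whether the SDK and the target model accept a data URL in `image_url` is an assumption this page does not confirm, so verify it before relying on it.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The 8-byte PNG file signature stands in for real image bytes here.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
data_url = to_data_url(PNG_SIGNATURE)

# Assumption: a data URL may be used wherever an image URL is expected.
payload = {"type": "image", "image_url": data_url, "detail": "auto"}
print(payload["image_url"].split(",", 1)[0])
```

A hosted URL keeps the tool output small. A data URL avoids hosting the image but grows with the image size, so it is likely better suited to small images.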
Code Reference
Source Location
- Repository: Openai_Openai_agents_python
- File: examples/basic/image_tool_output.py
- Lines: 1-37
Signature
@function_tool
def fetch_random_image() -> ToolOutputImage | ToolOutputImageDict:
    """Fetch a random image."""
    return ToolOutputImage(image_url=URL, detail="auto")
    # or: return {"type": "image", "image_url": URL, "detail": "auto"}
Import
from agents import Agent, Runner, ToolOutputImage, ToolOutputImageDict, function_tool
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| (no parameters) | -- | -- | The fetch_random_image tool takes no arguments in this example |
Outputs
| Name | Type | Description |
|---|---|---|
| ToolOutputImage | ToolOutputImage | A typed object containing image_url (str) and detail ("auto", "low", or "high") |
| ToolOutputImageDict | dict | A TypedDict with keys "type" ("image"), "image_url" (str), and "detail" (str) |
| result.final_output | str | The model's text description of the fetched image |
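The dictionary shape in the table above can be checked before a tool returns it. The following stdlib-only sketch uses a hypothetical `check_image_output` helper. It treats all three keys as required and limits `detail` to the three values listed above, which may be stricter than the SDK itself.

```python
ALLOWED_DETAIL = {"auto", "low", "high"}
REQUIRED_KEYS = {"type", "image_url", "detail"}


def check_image_output(output: dict) -> dict:
    """Return the dict unchanged, or raise ValueError if it deviates from the table."""
    missing = REQUIRED_KEYS - set(output)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if output["type"] != "image":
        raise ValueError(f"unexpected type: {output['type']!r}")
    if output["detail"] not in ALLOWED_DETAIL:
        raise ValueError(f"unexpected detail: {output['detail']!r}")
    return output


good = check_image_output(
    {"type": "image", "image_url": "https://example.com/a.png", "detail": "low"}
)

try:
    check_image_output({"type": "image", "image_url": "https://example.com/a.png"})
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```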
Usage Examples
Return Image from a Function Tool (Typed Class)
from agents import Agent, Runner, ToolOutputImage, function_tool

URL = "https://images.unsplash.com/photo-1505761671935-60b3a7427bad?auto=format&fit=crop&w=400&q=80"


@function_tool
def fetch_random_image() -> ToolOutputImage:
    """Fetch a random image."""
    return ToolOutputImage(image_url=URL, detail="auto")


agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
    tools=[fetch_random_image],
)
Return Image from a Function Tool (TypedDict)
from agents import Agent, Runner, ToolOutputImageDict, function_tool

# URL is the constant defined in the typed-class example above.


@function_tool
def fetch_random_image() -> ToolOutputImageDict:
    """Fetch a random image."""
    return {"type": "image", "image_url": URL, "detail": "auto"}
Run the Agent to Fetch and Describe an Image
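import asyncio

# `agent` is the Agent instance built in the typed-class example above.


async def main():
    # The prompt says "random_image"; the tool is registered as fetch_random_image.
    result = await Runner.run(
        agent,
        input="Fetch an image using the random_image tool, then describe it",
    )
    print(result.final_output)
    # Example output (model-generated, wording will vary):
    # "This image features the famous clock tower, commonly known as Big Ben, ..."


if __name__ == "__main__":
    asyncio.run(main())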