
Implementation:Openai Openai agents python ComputerTool Pattern

From Leeroopedia
Knowledge Sources
Domains Tool_Integration, Computer_Use, Browser_Automation
Last Updated 2026-02-11 00:00 GMT

Overview

This example demonstrates ComputerTool with a Playwright-based AsyncComputer implementation that lets an agent drive a browser through screenshot, click, type, scroll, keypress, drag, and other computer actions.

Description

This implementation provides a complete example of using the ComputerTool hosted tool with a local Playwright browser. The core component is LocalPlaywrightComputer, a concrete subclass of AsyncComputer that implements all required computer actions: screenshot, click, double_click, scroll, type, wait, move, keypress, and drag. The class manages the Playwright lifecycle via async context manager methods (__aenter__ / __aexit__) and also provides explicit open() / close() methods for non-context-manager usage.
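
The dual lifecycle described above (context manager plus explicit open() / close()) can be sketched without Playwright. The class below is a hypothetical stand-in named FakeBrowserSession, not the example's LocalPlaywrightComputer; it assumes only that open() and close() delegate to the same setup and teardown as __aenter__ and __aexit__.

```python
import asyncio


class FakeBrowserSession:
    """Hypothetical stand-in for a Playwright-backed computer (not the real class)."""

    def __init__(self) -> None:
        self.is_open = False

    async def __aenter__(self) -> "FakeBrowserSession":
        # A real implementation would start Playwright and launch a browser here.
        self.is_open = True
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # A real implementation would close the browser and stop Playwright here.
        self.is_open = False

    async def open(self) -> "FakeBrowserSession":
        # Explicit entry point that reuses the context-manager setup.
        await self.__aenter__()
        return self

    async def close(self) -> None:
        # Explicit exit point that reuses the context-manager teardown.
        await self.__aexit__(None, None, None)


async def demo() -> list[bool]:
    states = []
    # Context-manager usage: open inside the block, closed after it.
    async with FakeBrowserSession() as session:
        states.append(session.is_open)
    states.append(session.is_open)

    # Explicit usage: the caller is responsible for calling close().
    manual = await FakeBrowserSession().open()
    states.append(manual.is_open)
    await manual.close()
    states.append(manual.is_open)
    return states
```

Routing both entry styles through one setup path keeps the two usages from drifting apart; the explicit form is what a create/dispose callback pair would call.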

The example demonstrates two distinct patterns for providing a computer to the agent. The singleton pattern creates a single shared LocalPlaywrightComputer instance using async with and passes it directly to ComputerTool. The per-request pattern uses a ComputerProvider with create and dispose callbacks, so each agent run gets its own isolated browser instance. This isolation matters when multiple agent runs execute concurrently, because a shared browser would let one run's navigation and input interfere with another's.
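
To make the shared-state concern concrete, here is a minimal sketch that uses only the standard library. FakeComputer, run_with_provider, and run_with_singleton are hypothetical names for illustration; they do not show how the SDK invokes the create and dispose callbacks internally, only why a per-run instance avoids interference.

```python
import asyncio


class FakeComputer:
    """Hypothetical stand-in for a browser-backed computer."""

    def __init__(self) -> None:
        self.history: list[str] = []
        self.closed = False


async def create_computer() -> FakeComputer:
    # Per-request pattern: every run gets a fresh instance.
    return FakeComputer()


async def dispose_computer(computer: FakeComputer) -> None:
    # Per-request pattern: the instance is released when the run ends.
    computer.closed = True


async def run_with_provider(task: str) -> list[str]:
    computer = await create_computer()
    try:
        computer.history.append(task)
        await asyncio.sleep(0)  # yield so concurrent runs interleave
        return list(computer.history)
    finally:
        await dispose_computer(computer)


async def run_with_singleton(shared: FakeComputer, task: str) -> list[str]:
    shared.history.append(task)
    await asyncio.sleep(0)  # yield so concurrent runs interleave
    return list(shared.history)


async def compare() -> tuple[list[list[str]], list[list[str]]]:
    # Isolated runs only ever see their own task.
    isolated = await asyncio.gather(run_with_provider("a"), run_with_provider("b"))
    # Runs sharing one instance observe each other's activity.
    shared = FakeComputer()
    mixed = await asyncio.gather(
        run_with_singleton(shared, "a"), run_with_singleton(shared, "b")
    )
    return list(isolated), list(mixed)
```

In this sketch the isolated runs each return a one-item history, while both singleton runs return a history containing the other run's task as well.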

The agent must use the computer-use-preview model and requires ModelSettings(truncation="auto") as mandated by the computer use API. A key mapping dictionary (CUA_KEY_TO_PLAYWRIGHT_KEY) translates model-emitted key names to Playwright-compatible key identifiers for the keypress action. Screenshots are captured as base64-encoded PNG data from the browser viewport.
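
The key translation step can be sketched as a dictionary lookup with a pass-through default. The entries below are an illustrative subset chosen for this sketch and are an assumption; the example's actual CUA_KEY_TO_PLAYWRIGHT_KEY may contain different or additional entries. KEY_MAP_SKETCH and to_playwright_keys are hypothetical names.

```python
# Illustrative subset only (assumption); not the full mapping from the example.
KEY_MAP_SKETCH = {
    "ctrl": "Control",
    "shift": "Shift",
    "enter": "Enter",
    "esc": "Escape",
    "arrowdown": "ArrowDown",
}


def to_playwright_keys(keys: list[str]) -> list[str]:
    """Translate model-emitted key names, leaving unknown keys unchanged."""
    # Lookup is case-insensitive; keys without an entry pass through as given.
    return [KEY_MAP_SKETCH.get(key.lower(), key) for key in keys]
```

For instance, to_playwright_keys(["CTRL", "a"]) yields ["Control", "a"], which a keypress implementation could then pass to the browser's keyboard API.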

Usage

Use this pattern when building agents that need to visually interact with web applications or browser-based interfaces. The singleton pattern is suitable for simple sequential workflows, while the per-request ComputerProvider pattern is recommended for production deployments where concurrent agent runs must be isolated from each other.

Code Reference

Source Location

Signature

class LocalPlaywrightComputer(AsyncComputer):
    @property
    def environment(self) -> Environment:
        return "browser"

    @property
    def dimensions(self) -> tuple[int, int]:
        return (1024, 768)

    async def screenshot(self) -> str: ...
    async def click(self, x: int, y: int, button: Button = "left") -> None: ...
    async def double_click(self, x: int, y: int) -> None: ...
    async def scroll(self, x: int, y: int, scroll_x: int, scroll_y: int) -> None: ...
    async def type(self, text: str) -> None: ...
    async def wait(self) -> None: ...
    async def move(self, x: int, y: int) -> None: ...
    async def keypress(self, keys: list[str]) -> None: ...
    async def drag(self, path: list[tuple[int, int]]) -> None: ...
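
As one way to read the drag signature, the sketch below replays a path through mouse primitives: move to the first point, press, move through the remaining points, release. It assumes the implementation delegates to move/down/up calls like those on Playwright's page.mouse; RecordingMouse is a hypothetical stand-in that records calls instead of driving a browser.

```python
import asyncio


class RecordingMouse:
    """Hypothetical stand-in exposing move/down/up like Playwright's page.mouse."""

    def __init__(self) -> None:
        self.events: list[tuple] = []

    async def move(self, x: int, y: int) -> None:
        self.events.append(("move", x, y))

    async def down(self) -> None:
        self.events.append(("down",))

    async def up(self) -> None:
        self.events.append(("up",))


async def drag(mouse: RecordingMouse, path: list[tuple[int, int]]) -> None:
    # An empty path is a no-op.
    if not path:
        return
    start_x, start_y = path[0]
    await mouse.move(start_x, start_y)
    await mouse.down()
    # Visit every remaining point while the button is held.
    for x, y in path[1:]:
        await mouse.move(x, y)
    await mouse.up()


def record_drag(path: list[tuple[int, int]]) -> list[tuple]:
    """Run drag against a fresh recorder and return the captured events."""
    mouse = RecordingMouse()
    asyncio.run(drag(mouse, path))
    return mouse.events
```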

Import

from playwright.async_api import Browser, Page, Playwright, async_playwright

from agents import (
    Agent,
    AsyncComputer,
    Button,
    ComputerProvider,
    ComputerTool,
    Environment,
    ModelSettings,
    RunContextWrapper,
    Runner,
    trace,
)

I/O Contract

Inputs

Name | Type | Required | Description
computer | AsyncComputer or ComputerProvider | Yes | Either a direct AsyncComputer instance (singleton) or a ComputerProvider with create/dispose callbacks (per-request).
model | str | Yes | Must be set to "computer-use-preview" on the agent.
model_settings.truncation | str | Yes | Must be set to "auto" as required by the computer use API.
environment | Environment | Yes (property) | The environment type, returned as "browser" by the implementation.
dimensions | tuple[int, int] | Yes (property) | Viewport dimensions as (width, height), returned as (1024, 768) by the implementation.

Outputs

Name | Type | Description
result.final_output | str | The final text output from the agent after completing the computer use task.
screenshot() | str | Base64-encoded PNG screenshot of the current browser viewport.
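
The screenshot() output format can be sketched with the standard library alone. This assumes the browser hands back raw PNG bytes that are then base64-encoded into a str; encode_screenshot and looks_like_png are hypothetical helper names, not part of the example.

```python
import base64

# Every PNG file begins with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def encode_screenshot(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as the base64 text a screenshot() method would return."""
    return base64.b64encode(png_bytes).decode("utf-8")


def looks_like_png(encoded: str) -> bool:
    """Decode base64 text and check that it starts with the PNG signature."""
    return base64.b64decode(encoded).startswith(PNG_SIGNATURE)
```

A check like looks_like_png can be handy in tests to confirm that a custom AsyncComputer returns viewport data in the expected encoding.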

Usage Examples

Singleton Computer (Shared Instance)

import asyncio
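from agents import Agent, ComputerTool, ModelSettings, Runner, trace

# LocalPlaywrightComputer is the AsyncComputer subclass outlined under Signature;
# it must be defined or imported before this snippet runs.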

async def main():
    async with LocalPlaywrightComputer() as computer:
        with trace("Computer use example"):
            agent = Agent(
                name="Browser user",
                instructions="You are a helpful agent.",
                tools=[ComputerTool(computer=computer)],
                model="computer-use-preview",
                model_settings=ModelSettings(truncation="auto"),
            )
            result = await Runner.run(agent, "What is the weather in Tokyo?")
            print(result.final_output)

asyncio.run(main())

Per-Request Computer (Isolated Instances)

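import asyncio
from typing import Any

from agents import (
    Agent,
    ComputerProvider,
    ComputerTool,
    ModelSettings,
    RunContextWrapper,
    Runner,
    trace,
)

# LocalPlaywrightComputer is the AsyncComputer subclass outlined under Signature;
# it must be defined or imported before this snippet runs.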

async def create_computer(*, run_context: RunContextWrapper[Any]) -> LocalPlaywrightComputer:
    return await LocalPlaywrightComputer().open()

async def dispose_computer(
    *, run_context: RunContextWrapper[Any], computer: LocalPlaywrightComputer
) -> None:
    await computer.close()

async def main():
    provider = ComputerProvider[LocalPlaywrightComputer](
        create=create_computer,
        dispose=dispose_computer,
    )

    with trace("Computer use example"):
        agent = Agent(
            name="Browser user",
            instructions="You are a helpful agent.",
            tools=[ComputerTool(computer=provider)],
            model="computer-use-preview",
            model_settings=ModelSettings(truncation="auto"),
        )
        result = await Runner.run(agent, "Search for OpenAI news")
        print(result.final_output)

asyncio.run(main())
