Implementation:Microsoft Playwright Page Agent

Knowledge Sources	Playwright, Playwright AI Testing
Domains	AI_Testing, Browser_Automation, LLM_Configuration
Last Updated	2026-02-11 00:00 GMT

Overview

Concrete API for creating an AI-powered browser automation agent from a Playwright Page instance, provided by the Playwright library.

Description

The page.agent() method is the primary entry point for AI-agent-driven testing in Playwright. It creates a PageAgent instance that bridges a browser page with an LLM provider, enabling natural language control of the browser. The method accepts configuration for the LLM provider (API type, credentials, model), operational limits (token budgets, action caps, retry policies), an optional response cache, secrets for sensitive data, and a system prompt override.

Internally, page.agent() constructs a client-side PageAgent proxy that communicates with a PageAgentDispatcher on the server side. The server-side dispatcher manages the agentic loop, tool execution, and communication with the LLM provider. This client-server architecture ensures that the agent can operate across Playwright's process boundary (e.g., when using remote browsers).
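The proxy/dispatcher split described above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not Playwright's implementation: `SketchAgentProxy`, `SketchDispatcher`, `connectInProcess`, and the message shapes are hypothetical names, and the "agentic loop" is replaced by a trivial string split so the example stays self-contained.

```typescript
// Hypothetical stand-ins: these classes only mimic the proxy/dispatcher split;
// they are not Playwright's PageAgent or PageAgentDispatcher.
interface AgentRequest { method: 'perform' | 'expect'; task: string; }
interface AgentResponse { ok: boolean; log: string[]; }
type Transport = (request: AgentRequest) => Promise<AgentResponse>;

// "Server" side: owns the loop and enforces the action cap.
class SketchDispatcher {
  private readonly maxActions: number;
  constructor(maxActions: number) { this.maxActions = maxActions; }

  async dispatch(request: AgentRequest): Promise<AgentResponse> {
    const log: string[] = [];
    // Stand-in for the agentic loop: a real loop would ask the LLM for each next action.
    for (const step of request.task.split(' and ')) {
      if (log.length >= this.maxActions) return { ok: false, log };
      log.push(`${request.method}: ${step.trim()}`);
    }
    return { ok: true, log };
  }
}

// "Client" side: a thin proxy that only forwards plain-data messages.
class SketchAgentProxy {
  private readonly send: Transport;
  constructor(send: Transport) { this.send = send; }
  perform(task: string): Promise<AgentResponse> { return this.send({ method: 'perform', task }); }
  expect(task: string): Promise<AgentResponse> { return this.send({ method: 'expect', task }); }
}

// The JSON round-trip mimics serialization across a process boundary.
function connectInProcess(dispatcher: SketchDispatcher): SketchAgentProxy {
  return new SketchAgentProxy(async request => {
    const wire = JSON.parse(JSON.stringify(request)) as AgentRequest;
    const response = await dispatcher.dispatch(wire);
    return JSON.parse(JSON.stringify(response)) as AgentResponse;
  });
}
```

Because the proxy holds no state beyond the transport, the same client code would work whether the dispatcher lives in the same process or behind a remote connection, which is the property the client-server design is meant to provide.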

The method resolves provider configuration by merging defaults with the supplied options. If no provider is specified, it falls back to environment variables (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY) to auto-detect the provider.
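The environment-variable fallback can be sketched as a pure function. The lookup order, the default model chosen for each key, and the names `resolveProvider` and `ENV_CANDIDATES` are assumptions made for illustration; the source only states that variables such as OPENAI_API_KEY and ANTHROPIC_API_KEY are consulted when no provider is given.

```typescript
// Illustrative sketch only: the lookup order and default models are assumptions,
// not Playwright's actual resolution logic.
type ProviderApi = 'openai' | 'openai-compatible' | 'anthropic' | 'google';

interface ProviderConfig {
  api: ProviderApi;
  apiKey: string;
  model: string;
  apiEndpoint?: string;
  apiTimeout?: number;
}

type Env = Record<string, string | undefined>;

// Hypothetical mapping from environment variable to provider and default model.
const ENV_CANDIDATES: { envVar: string; api: ProviderApi; defaultModel: string }[] = [
  { envVar: 'OPENAI_API_KEY', api: 'openai', defaultModel: 'gpt-4o' },
  { envVar: 'ANTHROPIC_API_KEY', api: 'anthropic', defaultModel: 'claude-sonnet-4-20250514' },
];

function resolveProvider(explicit: ProviderConfig | undefined, env: Env): ProviderConfig {
  // An explicitly supplied provider always wins over the environment.
  if (explicit) return explicit;
  // Otherwise pick the first candidate whose API key is set and non-empty.
  for (const candidate of ENV_CANDIDATES) {
    const apiKey = env[candidate.envVar];
    if (apiKey) return { api: candidate.api, apiKey, model: candidate.defaultModel };
  }
  throw new Error('No provider given and no known API key found in the environment');
}
```

In a Node.js script the second argument would typically be `process.env`; taking it as a parameter keeps the function easy to test.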

Usage

Use page.agent() when:

You want to create an AI agent that can autonomously interact with a web page
You need to configure a specific LLM provider and model for browser automation
You are building custom test utilities outside the standard test fixture system
You require fine-grained control over agent limits, caching, or system prompts

Code Reference

Source Location

Repository: playwright
File: packages/playwright-core/src/client/page.ts:L836-857
File: packages/playwright/src/index.ts:L458-497

Signature

page.agent(options?: {
  provider?: {
    api: "openai" | "openai-compatible" | "anthropic" | "google";
    apiKey: string;
    model: string;
    apiEndpoint?: string;
    apiTimeout?: number;
  };
  cache?: AgentCache;
  limits?: {
    maxTokens?: number;
    maxActions?: number;       // default: 10
    maxActionRetries?: number; // default: 3
  };
  secrets?: { name: string; value: string }[];
  systemPrompt?: string;
}): Promise<PageAgent>

Import

// PageAgent is obtained from a Page instance, not imported directly
import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();
const agent = await page.agent({
  provider: {
    api: 'openai',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o',
  },
});

I/O Contract

Inputs

Name	Type	Required	Description
provider.api	`"openai" | "openai-compatible" | "anthropic" | "google"`	Yes (or via env)	The LLM provider API type to use
provider.apiKey	`string`	Yes (or via env)	API key for authenticating with the LLM provider
provider.model	`string`	Yes (or via env)	The specific model identifier (e.g., "gpt-4o", "claude-sonnet-4-20250514")
provider.apiEndpoint	`string`	No	Custom API endpoint URL for self-hosted or proxy deployments
provider.apiTimeout	`number`	No	Request timeout in milliseconds for LLM API calls
cache	`AgentCache`	No	Cache instance for storing and replaying LLM responses
limits.maxTokens	`number`	No	Maximum total tokens the agent may consume
limits.maxActions	`number`	No	Maximum browser actions per perform() call (default: 10)
limits.maxActionRetries	`number`	No	Maximum retries on action failure (default: 3)
secrets	`{ name: string; value: string }[]`	No	Sensitive values (e.g., passwords) passed securely to the agent
systemPrompt	`string`	No	Custom system prompt to override the default agent instructions
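The documented defaults for limits (maxActions: 10, maxActionRetries: 3) can be applied as in the sketch below. `resolveLimits` and `DEFAULT_LIMITS` are hypothetical names, and leaving maxTokens undefined when it is not supplied is an assumption, since the source does not state a default for it.

```typescript
// Sketch of default merging for the limits option; helper names are hypothetical.
interface AgentLimits {
  maxTokens?: number;
  maxActions?: number;
  maxActionRetries?: number;
}

interface ResolvedLimits {
  maxTokens: number | undefined; // no default is documented, so it stays undefined
  maxActions: number;
  maxActionRetries: number;
}

// Defaults taken from the signature comments in this page.
const DEFAULT_LIMITS = { maxActions: 10, maxActionRetries: 3 };

function resolveLimits(limits: AgentLimits = {}): ResolvedLimits {
  return {
    maxTokens: limits.maxTokens,
    // `??` keeps the default when a key is missing or explicitly undefined,
    // while still honoring a caller-supplied 0.
    maxActions: limits.maxActions ?? DEFAULT_LIMITS.maxActions,
    maxActionRetries: limits.maxActionRetries ?? DEFAULT_LIMITS.maxActionRetries,
  };
}
```

Nullish coalescing is used instead of object spread so that `{ maxActions: undefined }` does not overwrite the default with `undefined`.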

Outputs

Name	Type	Description
PageAgent	`PageAgent`	A client-side proxy to the server-side PageAgentDispatcher, exposing perform(), expect(), and extract() methods for AI-driven browser interaction

Usage Examples

Basic Example

import { chromium } from 'playwright';

const browser = await chromium.launch({ headless: true });
const page = await browser.newPage();

// Create an agent with OpenAI provider
const agent = await page.agent({
  provider: {
    api: 'openai',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o',
  },
  limits: {
    maxActions: 15,
    maxTokens: 50000,
  },
});

await agent.perform('Navigate to https://example.com and click the "More information" link');
await agent.expect('The page contains information about IANA');

await browser.close();

Advanced Example with Anthropic Provider

import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();

const agent = await page.agent({
  provider: {
    api: 'anthropic',
    apiKey: process.env.ANTHROPIC_API_KEY,
    model: 'claude-sonnet-4-20250514',
  },
  limits: {
    maxActions: 20,
    maxActionRetries: 5,
    maxTokens: 100000,
  },
  secrets: [
    { name: 'TEST_PASSWORD', value: process.env.TEST_PASSWORD },
  ],
  systemPrompt: 'You are a meticulous QA tester. Always verify actions succeeded before proceeding.',
});

await agent.perform('Log in with username "testuser" and password TEST_PASSWORD');

await browser.close();
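Values read from `process.env` are typed `string | undefined`, while each secret's `value` is a `string`. A small helper can fail fast when a variable is missing instead of passing `undefined` to the agent. This is a sketch; `secretsFromEnv` is a hypothetical helper and not part of Playwright.

```typescript
// Hypothetical helper: builds the `secrets` array from environment variables
// and throws if any of them is unset or empty.
interface AgentSecret {
  name: string;
  value: string;
}

function secretsFromEnv(names: string[], env: Record<string, string | undefined>): AgentSecret[] {
  const missing = names.filter(key => !env[key]);
  if (missing.length > 0)
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  // Every value is known to be a non-empty string at this point.
  return names.map(key => ({ name: key, value: env[key] as string }));
}
```

Under that assumption, the `secrets` option above could be written as `secrets: secretsFromEnv(['TEST_PASSWORD'], process.env)`.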

Example with Google Gemini Provider

import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();

const agent = await page.agent({
  provider: {
    api: 'google',
    apiKey: process.env.GOOGLE_API_KEY,
    model: 'gemini-2.0-flash',
  },
});

await agent.perform('Search for "Playwright testing" and click the first result');

await browser.close();

Related Pages

Implements Principle

Principle:Microsoft_Playwright_Configure_AI_Agent_Provider
