Implementation:Microsoft Playwright Page Agent
| Knowledge Sources | |
|---|---|
| Domains | AI_Testing, Browser_Automation, LLM_Configuration |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
Concrete API for creating an AI-powered browser automation agent from a Playwright Page instance, provided by the Playwright library.
Description
The page.agent() method is the primary entry point for AI-agent-driven testing in Playwright. It creates a PageAgent instance that bridges a browser page with an LLM provider, enabling natural language control of the browser. The method accepts configuration for the LLM provider (API type, credentials, model), operational limits (token budgets, action caps, retry policies), an optional response cache, secrets for sensitive data, and a system prompt override.
Internally, page.agent() constructs a client-side PageAgent proxy that communicates with a PageAgentDispatcher on the server side. The server-side dispatcher manages the agentic loop, tool execution, and communication with the LLM provider. This client-server architecture ensures that the agent can operate across Playwright's process boundary (e.g., when using remote browsers).
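The proxy/dispatcher split described above can be pictured with a toy model. This is only a sketch: `ToyDispatcher`, `ToyAgentProxy`, `connect`, and the message shapes are hypothetical names invented for illustration, not Playwright internals. The point is that the client half sends only serialized messages, so the two halves could live in different processes.

```typescript
// Toy model of a client-side proxy talking to a server-side dispatcher.
// Every name here is hypothetical; this is not Playwright's internal code.

type AgentRequest = { method: 'perform' | 'expect' | 'extract'; task: string };
type AgentResponse = { ok: boolean; log: string[] };

// "Server" side: the place where the agentic loop and LLM calls would live.
class ToyDispatcher {
  handle(serialized: string): string {
    const request = JSON.parse(serialized) as AgentRequest;
    const response: AgentResponse = {
      ok: true,
      log: [`received ${request.method}`, `task: ${request.task}`],
    };
    return JSON.stringify(response);
  }
}

// "Client" side: exposes a friendly API and only ever sends strings.
class ToyAgentProxy {
  private readonly send: (message: string) => Promise<string>;

  constructor(send: (message: string) => Promise<string>) {
    this.send = send;
  }

  async perform(task: string): Promise<AgentResponse> {
    const request: AgentRequest = { method: 'perform', task };
    const reply = await this.send(JSON.stringify(request));
    return JSON.parse(reply) as AgentResponse;
  }
}

// Wire the two halves together with an in-memory "transport".
function connect(dispatcher: ToyDispatcher): ToyAgentProxy {
  return new ToyAgentProxy(async (message: string) => dispatcher.handle(message));
}
```

Swapping the in-memory transport for a socket or pipe would leave `ToyAgentProxy` unchanged, which is the property the client-server design relies on.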
The method resolves provider configuration by merging defaults with the supplied options. If no provider is specified, it falls back to environment variables (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY) to auto-detect the provider.
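The fallback described above might look roughly like the following. This is a minimal sketch under stated assumptions: `resolveProvider`, the lookup order in `ENV_KEYS`, and the `fallbackModel` parameter are hypothetical, and the actual detection logic in Playwright may differ.

```typescript
type ProviderApi = 'openai' | 'openai-compatible' | 'anthropic' | 'google';

interface ProviderConfig {
  api: ProviderApi;
  apiKey: string;
  model: string;
}

type Env = Record<string, string | undefined>;

// Hypothetical lookup order; the real auto-detection may check other variables.
const ENV_KEYS: Array<{ api: ProviderApi; key: string }> = [
  { api: 'openai', key: 'OPENAI_API_KEY' },
  { api: 'anthropic', key: 'ANTHROPIC_API_KEY' },
];

// An explicit provider always wins; otherwise the first populated key decides.
function resolveProvider(
  explicit: ProviderConfig | undefined,
  env: Env,
  fallbackModel: string,
): ProviderConfig | undefined {
  if (explicit) return explicit;
  for (const { api, key } of ENV_KEYS) {
    const apiKey = env[key];
    if (apiKey) return { api, apiKey, model: fallbackModel };
  }
  return undefined; // nothing configured and nothing detectable
}
```

Passing the environment as a parameter, rather than reading `process.env` inside the function, keeps the sketch easy to exercise with fake values.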
Usage
Use page.agent() when:
- You want to create an AI agent that can autonomously interact with a web page
- You need to configure a specific LLM provider and model for browser automation
- You are building custom test utilities outside the standard test fixture system
- You require fine-grained control over agent limits, caching, or system prompts
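For the custom-utility case, one possible shape is a small factory that applies project-wide defaults before delegating to page.agent(). The sketch below uses structural stand-in types (`PageLike`, `AgentLike`, `AgentOptionsLike`) so it compiles without Playwright installed; those types, `PROJECT_DEFAULTS`, and `createProjectAgent` are hypothetical, and the only assumption about the real API is the option names listed in this document.

```typescript
// Structural stand-ins so the sketch is self-contained.
interface AgentLike {
  perform(task: string): Promise<unknown>;
}

interface AgentOptionsLike {
  limits?: { maxTokens?: number; maxActions?: number; maxActionRetries?: number };
  systemPrompt?: string;
}

interface PageLike {
  agent(options?: AgentOptionsLike): Promise<AgentLike>;
}

// Hypothetical project-wide defaults.
const PROJECT_DEFAULTS: AgentOptionsLike = {
  limits: { maxActions: 15 },
  systemPrompt: 'You are a careful QA tester.',
};

// Merge caller overrides over the defaults; limits are merged field by field
// so that overriding one limit does not silently drop the others.
async function createProjectAgent(
  page: PageLike,
  overrides: AgentOptionsLike = {},
): Promise<AgentLike> {
  return page.agent({
    ...PROJECT_DEFAULTS,
    ...overrides,
    limits: { ...PROJECT_DEFAULTS.limits, ...overrides.limits },
  });
}
```

A real Playwright Page should satisfy `PageLike` structurally if its agent() method accepts these options, though that has not been verified here.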
Code Reference
Source Location
- Repository: playwright
- File: packages/playwright-core/src/client/page.ts:L836-857
- File: packages/playwright/src/index.ts:L458-497
Signature
page.agent(options?: {
  provider?: {
    api: "openai" | "openai-compatible" | "anthropic" | "google";
    apiKey: string;
    model: string;
    apiEndpoint?: string;
    apiTimeout?: number;
  };
  cache?: AgentCache;
  limits?: {
    maxTokens?: number;
    maxActions?: number;       // default: 10
    maxActionRetries?: number; // default: 3
  };
  secrets?: { name: string; value: string }[];
  systemPrompt?: string;
}): Promise<PageAgent>
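To illustrate how the documented defaults for limits could be applied, here is a minimal sketch. The `withDefaultLimits` helper and its validation rule are assumptions made for illustration; only the default values (10 actions, 3 retries) come from the signature above.

```typescript
interface AgentLimits {
  maxTokens?: number;
  maxActions?: number;
  maxActionRetries?: number;
}

// Defaults taken from the signature comments above.
const DEFAULT_LIMITS = { maxActions: 10, maxActionRetries: 3 };

// Hypothetical guard: reject values that cannot be meaningful caps.
function assertPositiveInteger(name: string, value: number | undefined): void {
  if (value !== undefined && (!Number.isInteger(value) || value <= 0))
    throw new RangeError(`${name} must be a positive integer, got ${value}`);
}

// Fill in missing limits; maxTokens has no documented default, so it stays optional.
function withDefaultLimits(
  limits: AgentLimits = {},
): { maxTokens?: number; maxActions: number; maxActionRetries: number } {
  assertPositiveInteger('maxTokens', limits.maxTokens);
  assertPositiveInteger('maxActions', limits.maxActions);
  assertPositiveInteger('maxActionRetries', limits.maxActionRetries);
  return {
    maxTokens: limits.maxTokens,
    maxActions: limits.maxActions ?? DEFAULT_LIMITS.maxActions,
    maxActionRetries: limits.maxActionRetries ?? DEFAULT_LIMITS.maxActionRetries,
  };
}
```

Using `??` rather than object spread means an explicitly `undefined` field still receives its default.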
Import
// PageAgent is obtained from a Page instance, not imported directly
import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();
const agent = await page.agent({
  provider: {
    api: 'openai',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o',
  },
});
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| provider.api | "openai" \| "openai-compatible" \| "anthropic" \| "google" | Yes (or via env) | The LLM provider API type to use |
| provider.apiKey | string | Yes (or via env) | API key for authenticating with the LLM provider |
| provider.model | string | Yes (or via env) | The specific model identifier (e.g., "gpt-4o", "claude-sonnet-4-20250514") |
| provider.apiEndpoint | string | No | Custom API endpoint URL for self-hosted or proxy deployments |
| provider.apiTimeout | number | No | Request timeout in milliseconds for LLM API calls |
| cache | AgentCache | No | Cache instance for storing and replaying LLM responses |
| limits.maxTokens | number | No | Maximum total tokens the agent may consume |
| limits.maxActions | number | No | Maximum browser actions per perform() call (default: 10) |
| limits.maxActionRetries | number | No | Maximum retries on action failure (default: 3) |
| secrets | { name: string; value: string }[] | No | Sensitive values (e.g., passwords) passed securely to the agent |
| systemPrompt | string | No | Custom system prompt to override the default agent instructions |
Outputs
| Name | Type | Description |
|---|---|---|
| PageAgent | PageAgent | A client-side proxy to the server-side PageAgentDispatcher, exposing perform(), expect(), and extract() methods for AI-driven browser interaction |
Usage Examples
Basic Example
import { chromium } from 'playwright';

const browser = await chromium.launch({ headless: true });
const page = await browser.newPage();

// Create an agent with OpenAI provider
const agent = await page.agent({
  provider: {
    api: 'openai',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o',
  },
  limits: {
    maxActions: 15,
    maxTokens: 50000,
  },
});

await agent.perform('Navigate to https://example.com and click the "More information" link');
await agent.expect('The page contains information about IANA');
await browser.close();
Advanced Example with Anthropic Provider
import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();

const agent = await page.agent({
  provider: {
    api: 'anthropic',
    apiKey: process.env.ANTHROPIC_API_KEY,
    model: 'claude-sonnet-4-20250514',
  },
  limits: {
    maxActions: 20,
    maxActionRetries: 5,
    maxTokens: 100000,
  },
  // The prompt refers to the secret by name; the value comes from the environment
  secrets: [
    { name: 'TEST_PASSWORD', value: process.env.TEST_PASSWORD },
  ],
  systemPrompt: 'You are a meticulous QA tester. Always verify actions succeeded before proceeding.',
});

await agent.perform('Log in with username "testuser" and password TEST_PASSWORD');
await browser.close();
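Because environment variables may be unset, it can help to build the secrets array through a small guard instead of passing process.env values directly. The helper below is a hypothetical sketch (`secretsFromEnv` is not part of Playwright); it produces the `{ name, value }` shape shown in the signature and fails early when a variable is missing.

```typescript
interface AgentSecret {
  name: string;
  value: string;
}

// Build a secrets array from named environment variables, failing fast
// so that an undefined value never reaches the agent configuration.
function secretsFromEnv(
  names: string[],
  env: Record<string, string | undefined>,
): AgentSecret[] {
  return names.map(name => {
    const value = env[name];
    if (value === undefined || value === '')
      throw new Error(`Missing environment variable: ${name}`);
    return { name, value };
  });
}
```

In the example above, this would be used as `secrets: secretsFromEnv(['TEST_PASSWORD'], process.env)`.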
Example with Google Gemini Provider
import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();

const agent = await page.agent({
  provider: {
    api: 'google',
    apiKey: process.env.GOOGLE_API_KEY,
    model: 'gemini-2.0-flash',
  },
});

await agent.perform('Search for "Playwright testing" and click the first result');
await browser.close();