Workflow:Microsoft Playwright AI agent driven testing
| Knowledge Sources | |
|---|---|
| Domains | AI_Testing, LLM_Integration, Test_Automation |
| Last Updated | 2026-02-11 22:00 GMT |
Overview
End-to-end process for writing and running AI-agent-driven browser tests, in which a large language model performs page actions, verifies expectations, and extracts page data in response to natural-language instructions.
Description
This workflow covers Playwright's AI agent subsystem, which lets you write tests as natural-language descriptions instead of explicit locator and action code. The agent uses an LLM (such as Anthropic's Claude or an OpenAI model) to interpret high-level instructions (e.g., "add a todo item"), generate structured browser actions, execute them against the page, and verify outcomes. The system maintains conversation context across steps, captures page snapshots for the LLM to reason about, and supports three primary operations: perform (execute actions), expect (verify conditions), and extract (retrieve data).
Usage
Use this workflow when you want to write tests as high-level business descriptions rather than detailed code, when a UI flow is complex enough that writing explicit selectors is impractical, or when you want to use AI for exploratory testing. It requires an LLM API key (e.g., an Anthropic API key) configured in the environment.
Execution Steps
Step 1: Configure AI agent provider
Set up the LLM provider by configuring API credentials and model selection. The agent system supports multiple providers (Anthropic, OpenAI, etc.) and requires an API key to be available in the environment or test configuration. Configure agent options including the model name and chat API endpoint.
Key considerations:
- Set the appropriate API key environment variable (e.g., ANTHROPIC_API_KEY)
- Configure model selection in the Playwright config or test fixtures
- The agent system communicates with the LLM via a chat API interface
- Token budgets and context windows affect how much page content the agent can process
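The credential lookup described in this step can be sketched as a small helper that fails fast when the key is missing. This is a minimal sketch, not Playwright's configuration API: the `AgentProviderConfig` shape, the `resolveAgentConfig` helper, and the `OPENAI_API_KEY` variable name are assumptions introduced for illustration. Only `ANTHROPIC_API_KEY` comes from the text above.

```typescript
// Hypothetical config shape for illustration; not Playwright's actual option names.
type LlmProvider = "anthropic" | "openai";

interface AgentProviderConfig {
  provider: LlmProvider;
  apiKey: string;
  model: string;
}

// Environment variable assumed to hold each provider's key.
const KEY_VARIABLES: Record<LlmProvider, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
};

type EnvLike = Record<string, string | undefined>;

// Build a config from an environment-like object, failing fast with a clear
// message when the required key is absent or empty.
function resolveAgentConfig(
  env: EnvLike,
  provider: LlmProvider,
  model: string,
): AgentProviderConfig {
  const keyVariable = KEY_VARIABLES[provider];
  const apiKey = env[keyVariable];
  if (!apiKey) {
    throw new Error(`${keyVariable} is not set; the ${provider} provider needs it`);
  }
  return { provider, apiKey, model };
}
```

In a Node-based test setup the first argument would typically be `process.env`. Passing the environment in as a parameter keeps the helper easy to test without touching real credentials.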
Step 2: Create test with agent fixture
Write test files that use the agent fixture provided by Playwright Test. The agent fixture wraps a browser page with AI capabilities. Structure tests using standard test() blocks but replace explicit locator/action code with agent method calls that accept natural language instructions.
Key considerations:
- Import and extend test fixtures to include the agent
- The agent operates on the current page context
- Tests can mix agent calls with traditional Playwright API calls
- Custom system prompts can guide the agent's behavior
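The shape of such a test can be sketched without the real fixture. The `SketchAgent` interface below is hypothetical: it mirrors the three operations named in this workflow, but the actual fixture's method signatures and return types may differ. The recording stub stands in for the LLM-backed agent so the flow can be read (and run) in isolation.

```typescript
// Hypothetical interface mirroring perform / expect / extract.
// The real agent fixture's signatures are not confirmed here.
interface SketchAgent {
  perform(instruction: string): Promise<void>;
  expect(expectation: string): Promise<void>;
  extract(query: string): Promise<unknown>;
}

// A test body written against the agent: natural-language steps followed by
// an ordinary check on the extracted value.
async function addTodoFlow(agent: SketchAgent): Promise<number> {
  await agent.perform('Add a todo item named "Buy milk"');
  await agent.expect('The todo list shows "Buy milk"');
  const count = await agent.extract("Number of items in the todo list");
  if (typeof count !== "number") {
    throw new Error(`expected a number, got ${typeof count}`);
  }
  return count;
}

// Stand-in agent that records calls instead of contacting an LLM.
function createRecordingAgent(log: string[]): SketchAgent {
  return {
    async perform(instruction: string): Promise<void> {
      log.push(`perform: ${instruction}`);
    },
    async expect(expectation: string): Promise<void> {
      log.push(`expect: ${expectation}`);
    },
    async extract(query: string): Promise<unknown> {
      log.push(`extract: ${query}`);
      return 1; // canned value in place of a real extraction
    },
  };
}
```

In a real suite the body of `addTodoFlow` would sit inside a standard test() block and receive the agent from the fixture, alongside any traditional Playwright calls the test still needs.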
Step 3: Define agent actions with perform
Use agent.perform() to instruct the AI to execute browser actions described in natural language. The agent takes a page snapshot, reasons about the current state, generates structured actions (click, fill, press, etc.), and executes them against the page. Multiple actions may be generated in a single perform call.
Key considerations:
- Instructions should be clear and specific about the desired action
- The agent captures ARIA snapshots to understand page structure
- Actions are validated against Zod schemas before execution
- The agent may retry or adjust actions based on page state changes
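Validating LLM output before acting on it can be illustrated with a hand-written validator. This sketch deliberately swaps Zod for plain TypeScript checks so it has no dependencies. The `AgentAction` shapes and field names are assumptions for illustration, not the schemas Playwright defines.

```typescript
// Hypothetical action shapes; the real schemas are defined by Playwright.
type AgentAction =
  | { kind: "click"; target: string }
  | { kind: "fill"; target: string; text: string }
  | { kind: "press"; key: string };

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.length > 0;
}

// Return a typed action, or throw with a message naming the problem.
function parseAction(raw: unknown): AgentAction {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("action must be an object");
  }
  const { kind, target, text, key } = raw as Record<string, unknown>;
  switch (kind) {
    case "click":
      if (!isNonEmptyString(target)) throw new Error("click needs a target");
      return { kind: "click", target };
    case "fill":
      if (!isNonEmptyString(target)) throw new Error("fill needs a target");
      if (typeof text !== "string") throw new Error("fill needs text");
      return { kind: "fill", target, text };
    case "press":
      if (!isNonEmptyString(key)) throw new Error("press needs a key");
      return { kind: "press", key };
    default:
      throw new Error(`unknown action kind: ${String(kind)}`);
  }
}

// Validate the whole batch before running any of it, so a malformed action
// late in the list cannot leave the page half-modified.
function parseActions(rawJson: string): AgentAction[] {
  const parsed: unknown = JSON.parse(rawJson);
  if (!Array.isArray(parsed)) throw new Error("expected an array of actions");
  return parsed.map((entry: unknown) => parseAction(entry));
}
```

Rejecting the batch up front is one possible design choice. An agent could equally execute valid actions one at a time and re-plan after a failure.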
Step 4: Verify expectations with expect
Use agent.expect() to verify page conditions using natural language descriptions. The agent examines the current page state (DOM, ARIA tree, visual appearance) and determines whether the described condition is met. This replaces traditional assertion code with human-readable expectation statements.
Key considerations:
- Expectations should describe observable page state
- The agent evaluates whether the expectation holds on the live page
- Failed expectations produce descriptive error messages from the LLM
- Can verify complex visual or structural conditions that are hard to express with locators
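How a model's judgment might become a test failure can be sketched as follows. The `ExpectationVerdict` shape is an assumption: this page does not specify what the LLM returns, only that failures carry a descriptive message.

```typescript
// Hypothetical verdict shape; the real response format is not documented here.
interface ExpectationVerdict {
  satisfied: boolean;
  reason: string;
}

// Throw a descriptive error when the expectation does not hold, so the
// failure reads like an ordinary assertion failure in the test report.
function assertVerdict(expectation: string, verdict: ExpectationVerdict): void {
  if (verdict.satisfied) {
    return;
  }
  throw new Error(
    `Expectation not met: "${expectation}"\nReason: ${verdict.reason}`,
  );
}
```

Including both the original expectation and the model's stated reason in the message keeps the failure understandable without opening a trace.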
Step 5: Extract data with extract
Use agent.extract() to pull structured data from the page based on natural language descriptions. The agent reads the page content and returns the requested information in the specified format. This is useful for data validation, comparing values, or feeding extracted data into subsequent test steps.
Key considerations:
- Specify what data to extract and in what format
- The agent parses visible page content to fulfill extraction requests
- Extracted data can be used in subsequent assertions or test logic
- Works well for dynamic content where selectors may be fragile
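Because extracted values come from a model, it is worth checking their shape before later steps rely on them. The sketch below assumes the extraction result arrives as a JSON string describing todo items. The `TodoItem` type and both helpers are hypothetical names introduced for illustration.

```typescript
// Hypothetical target shape for an extraction request.
interface TodoItem {
  title: string;
  done: boolean;
}

// Parse and validate extracted JSON so later assertions work on typed data.
function parseTodoItems(rawJson: string): TodoItem[] {
  const parsed: unknown = JSON.parse(rawJson);
  if (!Array.isArray(parsed)) {
    throw new Error("expected an array of todo items");
  }
  return parsed.map((entry: unknown, index: number) => {
    if (typeof entry !== "object" || entry === null) {
      throw new Error(`item ${index} is not an object`);
    }
    const { title, done } = entry as Record<string, unknown>;
    if (typeof title !== "string" || typeof done !== "boolean") {
      throw new Error(`item ${index} has the wrong shape`);
    }
    return { title, done };
  });
}

// Example of feeding extracted data into ordinary test logic.
function countOpenItems(items: TodoItem[]): number {
  return items.filter((item) => !item.done).length;
}
```

Once the data is typed, the rest of the test can compare it with plain assertions rather than asking the agent a second question.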
Step 6: Execute and analyze results
Run the AI-agent-driven tests using the standard Playwright Test runner. Review results including the actions the agent performed, any LLM reasoning traces, and test pass/fail outcomes. Traces capture both the agent's decision-making process and the browser state at each step.
Key considerations:
- Agent tests may run slower than conventional tests because each agent call waits on LLM API latency
- Token usage and API costs scale with test complexity and page size
- Trace output includes agent action logs for debugging
- Non-determinism from the LLM may cause occasional test flakiness
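The cost and latency considerations above can be made concrete by summing per-step figures. This is a minimal sketch under the assumption that token counts and timings are available for each agent step. The `StepUsage` fields are hypothetical and do not describe the actual trace format.

```typescript
// Hypothetical per-step record; the real trace format may expose different fields.
interface StepUsage {
  step: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

interface UsageSummary {
  totalTokens: number;
  totalLatencyMs: number;
  slowestStep: string | null;
}

// Sum tokens and latency across steps and name the slowest step, which is
// usually the first place to look when a test becomes slow or expensive.
function summarizeUsage(steps: StepUsage[]): UsageSummary {
  let totalTokens = 0;
  let totalLatencyMs = 0;
  let slowest: StepUsage | null = null;
  for (const current of steps) {
    totalTokens += current.inputTokens + current.outputTokens;
    totalLatencyMs += current.latencyMs;
    if (slowest === null || current.latencyMs > slowest.latencyMs) {
      slowest = current;
    }
  }
  return {
    totalTokens,
    totalLatencyMs,
    slowestStep: slowest ? slowest.step : null,
  };
}
```

Tracking these totals per test makes it easier to spot when a page grows large enough to inflate snapshot size and therefore token usage.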