Implementation:CrewAIInc CrewAI Stagehand Tool
| Knowledge Sources | |
|---|---|
| Domains | Browser Automation, Web Interaction, Tool Integration |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
StagehandTool is a CrewAI tool that automates web browser interactions from natural language instructions, using the Stagehand framework on Browserbase infrastructure. It supports four operations: performing actions, navigating to URLs, extracting data, and observing page elements.
Description
The tool extends BaseTool and exposes Stagehand's natural-language browser automation to CrewAI agents. It imports the Stagehand SDK conditionally and falls back to type stubs and mock classes when the SDK is not installed.
The tool supports four command types:
- act (default) -- Perform atomic browser actions like clicking buttons, filling forms, typing text, and scrolling. Multi-step instructions are parsed from numbered steps ("Step 1:", "Step 2:") or semicolon-separated instructions and executed sequentially with retry logic.
- navigate -- Navigate directly to a URL using page.goto.
- extract -- Extract structured data from web pages using ExtractOptions with text extraction mode.
- observe -- Identify and analyze visible elements on a page using ObserveOptions with visibility filtering.
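The multi-step parsing described for act can be illustrated with a minimal sketch. This is not the tool's actual _extract_steps code; the function name split_instruction and the exact regular expression are assumptions chosen for illustration.

```python
import re

# Hypothetical pattern: matches markers such as "Step 1:" or "step 2 :".
_STEP_MARKER = re.compile(r"step\s*\d+\s*:", re.IGNORECASE)


def split_instruction(instruction: str) -> list[str]:
    """Split a multi-step instruction into atomic steps (illustrative only)."""
    text = instruction.strip()
    if _STEP_MARKER.search(text):
        # Numbered form: cut the text at every "Step N:" marker.
        parts = _STEP_MARKER.split(text)
    elif ";" in text:
        # Semicolon-separated form.
        parts = text.split(";")
    else:
        # Already a single atomic instruction.
        parts = [text]
    # Trim whitespace and stray semicolons, then drop empty fragments.
    cleaned = (part.strip(" ;\n\t") for part in parts)
    return [part for part in cleaned if part]


steps = split_instruction(
    "Step 1: Click the login button; Step 2: Type 'user@example.com' in the email field"
)
# steps == ["Click the login button", "Type 'user@example.com' in the email field"]
```

Each returned step could then be passed to the browser one at a time, which is where per-step retry logic would attach.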
Key implementation features:
- Intelligent action decomposition -- The _extract_steps method parses complex instructions into atomic steps for reliable execution.
- Retry with simplification -- Failed actions are retried with simplified instructions via _simplify_instruction, which reduces complex instructions to basic action primitives.
- Multi-model support -- Automatically detects the appropriate API key (OpenAI, Anthropic, or Google) based on the configured model name.
- Session management -- Manages Browserbase sessions with lazy initialization, configurable DOM settle timeout, self-healing, and CAPTCHA solving.
- Async/sync bridging -- The synchronous _run method bridges to the async _async_run using event loop detection and ThreadPoolExecutor fallback.
- Resource cleanup -- Handles cleanup via close(), __del__, and context manager protocol (__enter__/__exit__).
- Testing mode -- Built-in MockPage and MockStagehand classes enable unit testing without real browser sessions.
The StagehandResult Pydantic model encapsulates operation outcomes with success, data, and error fields.
Usage
Use this tool when agents need to interact with websites beyond simple HTTP requests -- filling forms, clicking buttons, navigating multi-page flows, extracting dynamic content, or observing page elements. It is ideal for web scraping of JavaScript-heavy sites, form automation, and complex web workflows.
Code Reference
Source Location
- Repository: CrewAI
- File: lib/crewai-tools/src/crewai_tools/tools/stagehand_tool/stagehand_tool.py
- Lines: 1-739
Signature
class StagehandTool(BaseTool):
    name: str = "Web Automation Tool"
    description: str = "Use this tool to control a web browser and interact with websites using natural language."
    args_schema: type[BaseModel] = StagehandToolSchema
    package_dependencies: list[str] = ["stagehand<=0.5.9"]

    # Stagehand configuration
    api_key: str | None = None
    project_id: str | None = None
    model_api_key: str | None = None
    model_name: AvailableModel | None = AvailableModel.CLAUDE_3_7_SONNET_LATEST
    server_url: str | None = "https://api.stagehand.browserbase.com/v1"
    headless: bool = False
    dom_settle_timeout_ms: int = 3000
    self_heal: bool = True
    wait_for_captcha_solves: bool = True
    verbose: int = 1
    max_retries_on_token_limit: int = 3
    use_simplified_dom: bool = True

    def __init__(
        self,
        api_key: str | None = None,
        project_id: str | None = None,
        model_api_key: str | None = None,
        model_name: str | None = None,
        server_url: str | None = None,
        session_id: str | None = None,
        headless: bool | None = None,
        dom_settle_timeout_ms: int | None = None,
        self_heal: bool | None = None,
        wait_for_captcha_solves: bool | None = None,
        verbose: int | None = None,
        _testing: bool = False,
        **kwargs,
    ): ...
Import
from crewai_tools.tools.stagehand_tool.stagehand_tool import StagehandTool
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| instruction | str | No | Natural language instruction for the browser action; auto-generated if omitted based on command_type |
| url | str | No | URL to navigate to; required for "navigate" command type |
| command_type | str | No | Type of command: "act" (default), "navigate", "extract", or "observe" |
Constructor Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| api_key | str | Yes | Browserbase API key (or set BROWSERBASE_API_KEY env var) |
| project_id | str | Yes | Browserbase project ID (or set BROWSERBASE_PROJECT_ID env var) |
| model_api_key | str | No | LLM API key (auto-detected from OPENAI_API_KEY, ANTHROPIC_API_KEY, or GOOGLE_API_KEY) |
| model_name | str | No | AI model for browser intelligence (default: Claude 3.7 Sonnet) |
| server_url | str | No | Stagehand API URL (default: "https://api.stagehand.browserbase.com/v1") |
| headless | bool | No | Run browser in headless mode (default: False) |
| dom_settle_timeout_ms | int | No | DOM settle timeout in milliseconds (default: 3000) |
| self_heal | bool | No | Enable self-healing for failed selectors (default: True) |
| wait_for_captcha_solves | bool | No | Wait for CAPTCHA solutions (default: True) |
| verbose | int | No | Logging verbosity: 1=INFO, 2=WARNING, 3=DEBUG (default: 1) |
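The model_api_key auto-detection in the table above can be pictured with the following sketch. The environment variable names come from the table; the substring hints used to match a model name to a provider, and the function name resolve_model_api_key, are assumptions made for illustration and may differ from the tool's real logic.

```python
import os
from typing import Mapping, Optional

# Hypothetical mapping from model-name fragments to provider env vars.
_KEY_ENV_BY_HINT = (
    (("gpt", "openai"), "OPENAI_API_KEY"),
    (("claude", "anthropic"), "ANTHROPIC_API_KEY"),
    (("gemini", "google"), "GOOGLE_API_KEY"),
)


def resolve_model_api_key(
    model_name: Optional[str],
    explicit_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick an LLM API key for a model name (illustrative only)."""
    if explicit_key:
        # An explicitly passed key always wins over the environment.
        return explicit_key
    env = os.environ if env is None else env
    name = (model_name or "").lower()
    for hints, env_var in _KEY_ENV_BY_HINT:
        if any(hint in name for hint in hints):
            # Returns None if the matching variable is not set.
            return env.get(env_var)
    # Unknown provider: no key can be inferred.
    return None


demo_env = {"OPENAI_API_KEY": "sk-openai", "ANTHROPIC_API_KEY": "sk-ant"}
key = resolve_model_api_key("claude-3-7-sonnet-latest", env=demo_env)
```

Passing env explicitly keeps the sketch testable without touching the real process environment.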
Outputs
| Name | Type | Description |
|---|---|---|
| return | str | Formatted result string depending on command_type: action message for "act", JSON data for "extract", element descriptions for "observe", or error message on failure |
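As a rough illustration of how a result with success, data, and error fields might be turned into the string described above, consider the sketch below. It uses a standard-library dataclass as a stand-in for the StagehandResult Pydantic model, and the names ResultSketch and format_result as well as the exact output wording are assumptions, not the tool's actual formatting.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ResultSketch:
    """Dataclass stand-in for StagehandResult (the real model uses Pydantic)."""
    success: bool
    data: Any = None
    error: Optional[str] = None


def format_result(result: ResultSketch, command_type: str = "act") -> str:
    """Format a result as a string per command type (illustrative only)."""
    if not result.success:
        return f"Error: {result.error or 'unknown error'}"
    if command_type == "extract":
        # Extracted data is returned as JSON text.
        return json.dumps(result.data, indent=2)
    if command_type == "observe":
        # Observed elements are listed one per line.
        elements = result.data or []
        lines = [f"{i}. {element}" for i, element in enumerate(elements, 1)]
        return "\n".join(lines) if lines else "No elements found."
    # "act" and "navigate": return the action message as-is.
    return str(result.data)


message = format_result(ResultSketch(success=False, error="timeout"))
```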
Usage Examples
Basic Usage
import os

# Credentials are read from the environment when not passed to the constructor
os.environ["BROWSERBASE_API_KEY"] = "your-browserbase-key"
os.environ["BROWSERBASE_PROJECT_ID"] = "your-project-id"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"

from crewai_tools.tools.stagehand_tool.stagehand_tool import StagehandTool

tool = StagehandTool()

# Navigate to a website
result = tool._run(url="https://example.com", command_type="navigate")

# Perform an action
result = tool._run(instruction="Click the search box in the header", command_type="act")

# Extract data from a page
result = tool._run(
    instruction="Extract all product names and prices",
    command_type="extract",
)

# Observe page elements
result = tool._run(
    instruction="Find all navigation menu items",
    command_type="observe",
)

# Multi-step action
result = tool._run(
    instruction="Step 1: Click the login button; Step 2: Type 'user@example.com' in the email field",
    command_type="act",
)

# Context manager for resource cleanup
with StagehandTool() as tool:
    tool._run(url="https://example.com", command_type="navigate")
    tool._run(instruction="Click the submit button", command_type="act")