Implementation:CrewAIInc CrewAI Stagehand Tool

From Leeroopedia
Revision as of 11:09, 16 February 2026 by Admin (auto-imported from implementations/CrewAIInc_CrewAI_Stagehand_Tool.md)
Knowledge Sources
Domains: Browser Automation, Web Interaction, Tool Integration
Last Updated: 2026-02-11 00:00 GMT

Overview

StagehandTool is a CrewAI tool that automates web browser interactions from natural language instructions, using the Stagehand framework on Browserbase infrastructure. It supports performing actions, navigating to URLs, extracting data, and observing page elements.

Description

The tool extends BaseTool and provides natural-language browser automation for CrewAI agents. It conditionally imports the Stagehand SDK and falls back to type stubs and mock classes when the SDK is not installed.

The tool supports four command types:

  • act (default) -- Perform atomic browser actions like clicking buttons, filling forms, typing text, and scrolling. Multi-step instructions are parsed from numbered steps ("Step 1:", "Step 2:") or semicolon-separated instructions and executed sequentially with retry logic.
  • navigate -- Navigate directly to a URL using page.goto.
  • extract -- Extract structured data from web pages using ExtractOptions with text extraction mode.
  • observe -- Identify and analyze visible elements on a page using ObserveOptions with visibility filtering.
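The multi-step handling described for act can be sketched as follows. This is a minimal illustration, not the tool's actual code: split_steps, run_steps, the marker regex, and the callables act and simplify are hypothetical names, and re-raising after the last attempt is an assumption about failure handling.

```python
import re
from typing import Callable, List

# Matches markers such as "Step 1:" or "step 2." (the exact pattern is an assumption).
_STEP_MARKER = re.compile(r"\bstep\s*\d+\s*[:.)]\s*", re.IGNORECASE)


def split_steps(instruction: str) -> List[str]:
    """Split an instruction into atomic steps (hypothetical helper)."""
    if _STEP_MARKER.search(instruction):
        parts = _STEP_MARKER.split(instruction)
    else:
        parts = instruction.split(";")
    # Strip whitespace and leftover semicolons, then drop empty fragments.
    cleaned = (part.strip(" ;\t\n") for part in parts)
    return [part for part in cleaned if part]


def run_steps(
    steps: List[str],
    act: Callable[[str], str],
    simplify: Callable[[str], str],
    max_attempts: int = 2,
) -> List[str]:
    """Run steps in order; after a failure, retry with a simplified instruction."""
    results = []
    for step in steps:
        text = step
        for attempt in range(1, max_attempts + 1):
            try:
                results.append(act(text))
                break
            except Exception:
                if attempt == max_attempts:
                    raise  # Out of attempts: surface the last error.
                text = simplify(text)
    return results
```

Passing act and simplify in as callables keeps the sketch independent of any browser session, so the splitting and retry behavior can be exercised with plain functions.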

Key implementation features:

  • Intelligent action decomposition -- The _extract_steps method parses complex instructions into atomic steps for reliable execution.
  • Retry with simplification -- Failed actions are retried with simplified instructions via _simplify_instruction, which reduces complex instructions to basic action primitives.
  • Multi-model support -- Automatically detects the appropriate API key (OpenAI, Anthropic, or Google) based on the configured model name.
  • Session management -- Manages Browserbase sessions with lazy initialization, configurable DOM settle timeout, self-healing, and CAPTCHA solving.
  • Async/sync bridging -- The synchronous _run method bridges to the async _async_run using event loop detection and ThreadPoolExecutor fallback.
  • Resource cleanup -- Handles cleanup via close(), __del__, and context manager protocol (__enter__/__exit__).
  • Testing mode -- Built-in MockPage and MockStagehand classes enable unit testing without real browser sessions.
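The async/sync bridging item above can be illustrated with a small standalone helper. This is a sketch of the general technique (event loop detection with a ThreadPoolExecutor fallback), not a copy of _run; the name run_coroutine_sync is hypothetical.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Coroutine, TypeVar

T = TypeVar("T")


def run_coroutine_sync(coro: Coroutine[Any, Any, T]) -> T:
    """Run a coroutine to completion from synchronous code (hypothetical helper)."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run can create one.
        return asyncio.run(coro)
    # A loop is already running here, and asyncio.run would raise.
    # Run the coroutine on a fresh loop in a worker thread and wait for it.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()
```

In the fallback path the calling thread blocks until the worker finishes, so a caller that is itself running inside an event loop stalls that loop for the duration of the call.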

The StagehandResult Pydantic model encapsulates operation outcomes with success, data, and error fields.

Usage

Use this tool when agents need to interact with websites beyond simple HTTP requests -- filling forms, clicking buttons, navigating multi-page flows, extracting dynamic content, or observing page elements. It is ideal for web scraping of JavaScript-heavy sites, form automation, and complex web workflows.

Code Reference

Source Location

  • Repository: CrewAI
  • File: lib/crewai-tools/src/crewai_tools/tools/stagehand_tool/stagehand_tool.py
  • Lines: 1-739

Signature

class StagehandTool(BaseTool):
    name: str = "Web Automation Tool"
    description: str = "Use this tool to control a web browser and interact with websites using natural language."
    args_schema: type[BaseModel] = StagehandToolSchema
    package_dependencies: list[str] = ["stagehand<=0.5.9"]

    # Stagehand configuration
    api_key: str | None = None
    project_id: str | None = None
    model_api_key: str | None = None
    model_name: AvailableModel | None = AvailableModel.CLAUDE_3_7_SONNET_LATEST
    server_url: str | None = "https://api.stagehand.browserbase.com/v1"
    headless: bool = False
    dom_settle_timeout_ms: int = 3000
    self_heal: bool = True
    wait_for_captcha_solves: bool = True
    verbose: int = 1
    max_retries_on_token_limit: int = 3
    use_simplified_dom: bool = True

    def __init__(
        self,
        api_key: str | None = None,
        project_id: str | None = None,
        model_api_key: str | None = None,
        model_name: str | None = None,
        server_url: str | None = None,
        session_id: str | None = None,
        headless: bool | None = None,
        dom_settle_timeout_ms: int | None = None,
        self_heal: bool | None = None,
        wait_for_captcha_solves: bool | None = None,
        verbose: int | None = None,
        _testing: bool = False,
        **kwargs,
    ): ...

Import

from crewai_tools.tools.stagehand_tool.stagehand_tool import StagehandTool

I/O Contract

Inputs

Name | Type | Required | Description
instruction | str | No | Natural language instruction for the browser action; auto-generated if omitted based on command_type
url | str | No | URL to navigate to; required for "navigate" command type
command_type | str | No | Type of command: "act" (default), "navigate", "extract", or "observe"
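The input rules in this table can be sketched as a small normalization function. Everything here is illustrative: normalize_inputs is a hypothetical name, the default instruction strings are invented placeholders (the tool's real auto-generated text may differ), and raising ValueError is an assumption about how invalid input is reported.

```python
from typing import Dict, Optional

COMMAND_TYPES = ("act", "navigate", "extract", "observe")

# Hypothetical fallback instructions; the tool's real defaults may differ.
_DEFAULT_INSTRUCTIONS = {
    "act": "Perform the most relevant action on the current page",
    "extract": "Extract the main content of the current page",
    "observe": "List the interactive elements visible on the current page",
}


def normalize_inputs(
    instruction: Optional[str] = None,
    url: Optional[str] = None,
    command_type: str = "act",
) -> Dict[str, Optional[str]]:
    """Validate inputs and fill in a default instruction (hypothetical helper)."""
    if command_type not in COMMAND_TYPES:
        raise ValueError(f"Unknown command_type: {command_type!r}")
    if command_type == "navigate":
        # The url argument is optional in general but mandatory for navigation.
        if not url:
            raise ValueError("url is required when command_type is 'navigate'")
        instruction = instruction or f"Navigate to {url}"
    elif not instruction:
        instruction = _DEFAULT_INSTRUCTIONS[command_type]
    return {"instruction": instruction, "url": url, "command_type": command_type}
```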

Constructor Parameters

Name | Type | Required | Description
api_key | str | Yes | Browserbase API key (or set BROWSERBASE_API_KEY env var)
project_id | str | Yes | Browserbase project ID (or set BROWSERBASE_PROJECT_ID env var)
model_api_key | str | No | LLM API key (auto-detected from OPENAI_API_KEY, ANTHROPIC_API_KEY, or GOOGLE_API_KEY)
model_name | str | No | AI model for browser intelligence (default: Claude 3.7 Sonnet)
server_url | str | No | Stagehand API URL (default: "https://api.stagehand.browserbase.com/v1")
headless | bool | No | Run browser in headless mode (default: False)
dom_settle_timeout_ms | int | No | DOM settle timeout in milliseconds (default: 3000)
self_heal | bool | No | Enable self-healing for failed selectors (default: True)
wait_for_captcha_solves | bool | No | Wait for CAPTCHA solutions (default: True)
verbose | int | No | Logging verbosity: 1=INFO, 2=WARNING, 3=DEBUG (default: 1)
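The model_api_key auto-detection can be sketched as a lookup from model name to environment variable. The three variable names come from the table above; the substring rules ("gpt", "claude", "gemini") and the function name detect_model_api_key are assumptions made for illustration, not the tool's documented matching logic.

```python
import os
from typing import Mapping, Optional

# Assumed substring-to-variable rules; the tool's actual matching may differ.
_PROVIDER_ENV_VARS = (
    ("gpt", "OPENAI_API_KEY"),
    ("claude", "ANTHROPIC_API_KEY"),
    ("gemini", "GOOGLE_API_KEY"),
)


def detect_model_api_key(
    model_name: str,
    explicit_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the explicit key if given, else the env var matching the model."""
    if explicit_key:
        return explicit_key
    env = os.environ if env is None else env
    lowered = model_name.lower()
    for marker, variable in _PROVIDER_ENV_VARS:
        if marker in lowered:
            return env.get(variable)  # None when the variable is unset.
    return None
```

Accepting env as a parameter makes the lookup testable without touching the real process environment.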

Outputs

Name | Type | Description
return | str | Formatted result string depending on command_type: action message for "act", JSON data for "extract", element descriptions for "observe", or error message on failure
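How a result with success, data, and error fields might be turned into the returned string is sketched below. The Result dataclass is a standard-library stand-in for the Pydantic StagehandResult model so the example runs without dependencies; format_result and the exact output wording are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Result:
    """Stand-in for StagehandResult (same three fields, not the real class)."""
    success: bool
    data: Any = None
    error: Optional[str] = None


def format_result(result: Result, command_type: str) -> str:
    """Render a result as a string according to the command type."""
    if not result.success:
        return f"Error: {result.error}"
    if command_type == "extract":
        # Extracted data is returned as JSON text.
        return json.dumps(result.data, indent=2)
    if command_type == "observe":
        # One numbered line per observed element description.
        lines = [f"{i}. {item}" for i, item in enumerate(result.data or [], start=1)]
        return "\n".join(lines) or "No elements found."
    return str(result.data)
```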

Usage Examples

Basic Usage

import os
os.environ["BROWSERBASE_API_KEY"] = "your-browserbase-key"
os.environ["BROWSERBASE_PROJECT_ID"] = "your-project-id"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"

from crewai_tools.tools.stagehand_tool.stagehand_tool import StagehandTool

tool = StagehandTool()

# Navigate to a website
result = tool._run(url="https://example.com", command_type="navigate")

# Perform an action
result = tool._run(instruction="Click the search box in the header", command_type="act")

# Extract data from a page
result = tool._run(
    instruction="Extract all product names and prices",
    command_type="extract"
)

# Observe page elements
result = tool._run(
    instruction="Find all navigation menu items",
    command_type="observe"
)

# Multi-step action
result = tool._run(
    instruction="Step 1: Click the login button; Step 2: Type 'user@example.com' in the email field",
    command_type="act"
)

# Release the browser session when finished
tool.close()

# Context manager for resource cleanup
with StagehandTool() as tool:
    tool._run(url="https://example.com", command_type="navigate")
    tool._run(instruction="Click the submit button", command_type="act")
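The cleanup pattern that the context manager relies on (close(), __del__, and __enter__/__exit__) can be sketched with a self-contained class. ManagedSession and its attributes are hypothetical and hold no real browser session; the point is that close() is idempotent, so all three cleanup paths can call it safely.

```python
class ManagedSession:
    """Hypothetical stand-in for a tool that owns a remote browser session."""

    def __init__(self) -> None:
        self.is_open = True
        self.release_count = 0  # How many times the session was actually released.

    def close(self) -> None:
        # Idempotent: a second call must not release the session twice.
        if not self.is_open:
            return
        self.is_open = False
        self.release_count += 1

    def __enter__(self) -> "ManagedSession":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.close()
        return False  # Do not suppress exceptions raised inside the block.

    def __del__(self) -> None:
        # Last-resort cleanup; never let errors escape a finalizer.
        try:
            self.close()
        except Exception:
            pass
```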
