Implementation:CrewAIInc CrewAI Scrape Element Tool

Knowledge Sources	CrewAI
Domains	Tools, Web_Scraping
Last Updated	2026-02-11 00:00 GMT

Overview

ScrapeElementFromWebsiteTool extracts specific HTML elements from web pages using CSS selectors via BeautifulSoup.

Description

ScrapeElementFromWebsiteTool extends BaseTool and uses a dual-schema pattern. ScrapeElementFromWebsiteToolSchema requires both website_url and css_element as inputs, while FixedScrapeElementFromWebsiteToolSchema is an empty schema used when the URL and CSS selector are pre-configured at initialization. The tool sends default browser-like HTTP headers (User-Agent, Accept, etc.) to reduce the chance of requests being blocked. Cookie values are read from environment variables. The _run() method performs an HTTP GET request with the requests library, parses the response HTML with BeautifulSoup, selects the elements matching the CSS selector via parsed.select(), and returns the concatenated text content of all matched elements. BeautifulSoup availability is checked at runtime, and a helpful error message is shown if it is missing.
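The fetch, parse, and select steps described above can be sketched roughly as follows. This is an illustration rather than the tool's source: the helper names fetch_html and extract_elements_text, the 30-second timeout, the html.parser backend, and the newline separator are all assumptions, and the sketch needs the third-party requests and beautifulsoup4 packages.

```python
from __future__ import annotations

from bs4 import BeautifulSoup


def fetch_html(url: str, headers: dict | None = None, cookies: dict | None = None) -> bytes:
    """Fetch a page with an HTTP GET (hypothetical helper, not part of crewai_tools)."""
    import requests  # imported lazily so the parsing helper below also works offline

    response = requests.get(url, headers=headers or {}, cookies=cookies or {}, timeout=30)
    return response.content


def extract_elements_text(html: str | bytes, css_element: str) -> str:
    """Return the text of every element matching css_element, one match per line."""
    parsed = BeautifulSoup(html, "html.parser")
    elements = parsed.select(css_element)
    # Assumption: matches are joined with newlines; the real tool's separator may differ.
    return "\n".join(element.get_text(strip=True) for element in elements)


# Offline demonstration on an inline HTML snippet.
sample = (
    "<html><body><h1 class='title'>Hello</h1>"
    "<p>skip</p><h1 class='title'>World</h1></body></html>"
)
print(extract_elements_text(sample, "h1.title"))
```

Keeping the network call separate from the parsing step makes the selector logic easy to try against saved HTML before pointing it at a live site.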

Usage

Use this tool for targeted element extraction from web pages when you know the CSS selector of the content you need. It complements the full-page ScrapeWebsiteTool by allowing precise selection of specific page elements, which is valuable for structured data extraction from known page layouts.

Code Reference

Source Location

Repository: CrewAI
File: lib/crewai-tools/src/crewai_tools/tools/scrape_element_from_website/scrape_element_from_website.py
Lines: 1-92

Signature

class FixedScrapeElementFromWebsiteToolSchema(BaseModel):
    pass

class ScrapeElementFromWebsiteToolSchema(FixedScrapeElementFromWebsiteToolSchema):
    website_url: str = Field(..., description="Mandatory website url to read the file")
    css_element: str = Field(..., description="Mandatory css reference for element to scrape from the website")

class ScrapeElementFromWebsiteTool(BaseTool):
    name: str = "Read a website content"
    description: str = "A tool that can be used to read a website content."
    args_schema: type[BaseModel] = ScrapeElementFromWebsiteToolSchema
    website_url: str | None = None
    cookies: dict | None = None
    css_element: str | None = None
    headers: dict | None  # default browser-like headers

    def __init__(self, website_url=None, cookies=None, css_element=None, **kwargs): ...
    def _run(self, **kwargs) -> Any: ...

Import

from crewai_tools import ScrapeElementFromWebsiteTool

I/O Contract

Inputs

Name	Type	Required	Description
website_url	str	Yes, unless set at init	URL of the website to scrape
css_element	str	Yes, unless set at init	CSS selector for the element(s) to extract
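The way an input can be supplied either at call time or at initialization can be pictured as a simple fallback. The sketch below is plain Python with a hypothetical resolve_inputs helper; it does not claim to mirror the tool's internals, and the precedence rule (call-time arguments win over init-time values) is an assumption.

```python
from __future__ import annotations


def resolve_inputs(
    init_website_url: str | None,
    init_css_element: str | None,
    **kwargs: str,
) -> tuple[str, str]:
    """Use call-time values when given, otherwise fall back to init-time values."""
    website_url = kwargs.get("website_url", init_website_url)
    css_element = kwargs.get("css_element", init_css_element)
    if website_url is None or css_element is None:
        raise ValueError("website_url and css_element must be set at init or at call time")
    return website_url, css_element


# Init-time URL is kept, call-time selector overrides the init-time one.
print(resolve_inputs("https://example.com", "h1", css_element="p"))
```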

Outputs

Name	Type	Description
_run() returns	str	Concatenated text content of all HTML elements matching the CSS selector

Usage Examples

Basic Usage

from crewai_tools import ScrapeElementFromWebsiteTool

# Dynamic URL and selector
tool = ScrapeElementFromWebsiteTool()
result = tool._run(website_url="https://example.com", css_element="h1.title")

# Pre-configured URL and selector
tool = ScrapeElementFromWebsiteTool(
    website_url="https://example.com",
    css_element="div.article-content"
)
result = tool._run()
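
The Description notes that cookie values come from environment variables. One way such a lookup could work is sketched below. The shape of the cookie spec (a dict whose "name" key is the cookie name and whose "value" key names an environment variable), the helper cookie_from_env, and the variable SITE_SESSION_TOKEN are assumptions for illustration only; check the tool's source before relying on them.

```python
import os


def cookie_from_env(spec: dict) -> dict:
    """Build a cookies dict by reading the cookie's value from an environment variable."""
    env_var = spec["value"]  # assumption: "value" holds the env var name, not the secret
    env_value = os.getenv(env_var)
    if env_value is None:
        raise KeyError(f"environment variable {env_var!r} is not set")
    return {spec["name"]: env_value}


# Demo only: set the variable in-process so the example is self-contained.
os.environ["SITE_SESSION_TOKEN"] = "abc123"
print(cookie_from_env({"name": "session", "value": "SITE_SESSION_TOKEN"}))
```

Reading the value from the environment keeps the secret out of the code that configures the tool.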

Related Pages

Principle:CrewAIInc_CrewAI_Built_In_Tool_Selection
