Implementation:CrewAIInc CrewAI Tavily Extractor Tool

Knowledge Sources	CrewAI
Domains	Tools, Web_Extraction
Last Updated	2026-02-11 00:00 GMT

Overview

TavilyExtractorTool extracts structured content from one or more web pages using the Tavily API.

Description

TavilyExtractorTool extends BaseTool and wraps both the synchronous (TavilyClient) and asynchronous (AsyncTavilyClient) Tavily clients. On initialization it creates both clients from the api_key argument or, when none is given, from the TAVILY_API_KEY environment variable; proxy configuration is optional. If the tavily-python package is not installed, the tool prompts the user interactively (via click.confirm) to install it. The _run and _arun methods accept a single URL or a list of URLs, call the Tavily extract API with the configured extract_depth ("basic" or "advanced"), include_images flag, and timeout, and return the results as a formatted JSON string.
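The flow described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: StubExtractClient, resolve_api_key, and run_extract are hypothetical names, the stub performs no network I/O, and both the single-string-to-list normalization and the ValueError raised when no key is found are assumptions (the source does not say how those cases are handled).

```python
import json
import os


class StubExtractClient:
    """Hypothetical stand-in for a Tavily client; performs no network I/O."""

    def __init__(self, api_key):
        self.api_key = api_key

    def extract(self, urls, extract_depth, include_images, timeout):
        # Echo the request back so the data flow is visible in the output.
        return {
            "results": [{"url": u, "raw_content": f"<content of {u}>"} for u in urls],
            "extract_depth": extract_depth,
            "include_images": include_images,
            "timeout": timeout,
        }


def resolve_api_key(explicit_key=None):
    # Prefer the explicit key, then fall back to the environment variable.
    key = explicit_key or os.getenv("TAVILY_API_KEY")
    if not key:
        # Assumption: the real tool's behavior for a missing key is not documented here.
        raise ValueError("No API key given and TAVILY_API_KEY is not set")
    return key


def run_extract(client, urls, extract_depth="basic", include_images=False, timeout=60):
    # Assumption: a single URL string is normalized to a one-element list.
    url_list = [urls] if isinstance(urls, str) else list(urls)
    response = client.extract(
        urls=url_list,
        extract_depth=extract_depth,
        include_images=include_images,
        timeout=timeout,
    )
    # Return a formatted JSON string, mirroring what _run is described as returning.
    return json.dumps(response, indent=2)


client = StubExtractClient(resolve_api_key("demo-key"))
output = run_extract(client, "https://example.com")
```

The defaults in run_extract mirror the field defaults listed in the class signature (extract_depth="basic", include_images=False, timeout=60).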

Usage

Use this tool when a CrewAI agent needs to extract and structure content from specific, known web pages. It accepts either a single URL or a batch of URLs in one call, which suits web content analysis workflows.

Code Reference

Source Location

Repository: CrewAI
File: lib/crewai-tools/src/crewai_tools/tools/tavily_extractor_tool/tavily_extractor_tool.py
Lines: 1-176

Signature

class TavilyExtractorToolSchema(BaseModel):
    urls: list[str] | str = Field(..., description="The URL(s) to extract data from. ...")

class TavilyExtractorTool(BaseTool):
    name: str = "TavilyExtractorTool"
    description: str = "Extracts content from one or more web pages using the Tavily API. ..."
    args_schema: type[BaseModel] = TavilyExtractorToolSchema
    api_key: str | None = Field(default_factory=lambda: os.getenv("TAVILY_API_KEY"))
    proxies: dict[str, str] | None = None
    include_images: bool = False
    extract_depth: Literal["basic", "advanced"] = "basic"
    timeout: int = 60
    env_vars: list[EnvVar]  # TAVILY_API_KEY

    def _run(self, urls: list[str] | str) -> str:
        ...
    async def _arun(self, urls: list[str] | str) -> str:
        ...

Import

from crewai_tools import TavilyExtractorTool

I/O Contract

Inputs

Name	Type	Required	Description
urls	list[str] or str	Yes	The URL or list of URLs to extract data from

Outputs

Name	Type	Description
_run() / _arun() return	str	Formatted JSON string containing the structured data extracted from the provided URLs
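Because the output is a JSON string rather than a Python object, callers usually need to parse it before use. The sketch below shows one way to do that. The sample payload is illustrative only, and its field names (results, failed_results, url, raw_content, error) are assumptions about the Tavily response shape rather than something this page confirms; summarize is a hypothetical helper.

```python
import json

# Illustrative payload only; the field names are assumptions, not taken from this page.
sample_output = json.dumps(
    {
        "results": [{"url": "https://example.com", "raw_content": "Example Domain"}],
        "failed_results": [{"url": "https://example.com/missing", "error": "not found"}],
    },
    indent=2,
)


def summarize(output: str) -> dict:
    """Parse the tool's JSON string and split URLs into extracted and failed."""
    data = json.loads(output)
    return {
        # .get() with a default keeps the helper tolerant of absent keys.
        "extracted": [item["url"] for item in data.get("results", [])],
        "failed": [item["url"] for item in data.get("failed_results", [])],
    }


summary = summarize(sample_output)
```

Using .get() with an empty-list default means the helper does not fail if one of the assumed keys is missing from a real response.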

Usage Examples

Basic Usage

from crewai_tools import TavilyExtractorTool

tool = TavilyExtractorTool(extract_depth="basic")
result = tool._run(urls="https://example.com")

Multiple URLs

from crewai_tools import TavilyExtractorTool

tool = TavilyExtractorTool(extract_depth="advanced", include_images=True)
result = tool._run(urls=["https://example.com/page1", "https://example.com/page2"])
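The asynchronous path (_arun) takes the same urls argument, so several extractions can be awaited concurrently. The sketch below illustrates that calling pattern with a stand-in object so it runs without an API key or network access. StandInExtractor and extract_many are hypothetical names, and the stand-in's output shape is an assumption; with the real tool, an instance of TavilyExtractorTool would be passed in its place.

```python
import asyncio
import json


class StandInExtractor:
    """Hypothetical stand-in exposing the same `_arun(urls)` coroutine shape."""

    async def _arun(self, urls):
        url_list = [urls] if isinstance(urls, str) else list(urls)
        await asyncio.sleep(0)  # yield control, as a real network call would
        return json.dumps({"results": [{"url": u} for u in url_list]}, indent=2)


async def extract_many(tool, batches):
    # Start one _arun call per batch and gather the JSON strings in input order.
    return await asyncio.gather(*(tool._arun(urls=batch) for batch in batches))


outputs = asyncio.run(
    extract_many(
        StandInExtractor(),
        [
            "https://example.com",
            ["https://example.com/page1", "https://example.com/page2"],
        ],
    )
)
```

asyncio.gather preserves the order of its inputs, so outputs[i] corresponds to batches[i] regardless of which call finishes first.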

Related Pages

Principle:CrewAIInc_CrewAI_Built_In_Tool_Selection
