Implementation:CrewAIInc CrewAI Tavily Extractor Tool
| Knowledge Sources | |
|---|---|
| Domains | Tools, Web_Extraction |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
TavilyExtractorTool extracts structured content from one or more web pages using the Tavily API.
Description
TavilyExtractorTool extends BaseTool and wraps both the synchronous (TavilyClient) and asynchronous (AsyncTavilyClient) Tavily clients. On initialization, it creates the clients using the provided API key or, failing that, the TAVILY_API_KEY environment variable, with optional proxy configuration. If the tavily-python package is not installed, it prompts the user interactively via click.confirm to install it. The _run and _arun methods accept a URL or a list of URLs, call the Tavily extract API with the configured extract_depth ("basic" or "advanced"), include_images flag, and timeout, and return the results as a formatted JSON string.
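The call flow described above can be sketched without network access. This is a minimal illustration, not the tool's actual source: `StubExtractClient` and `run_extract` are hypothetical names standing in for the Tavily client and the tool's `_run` method, and the response shape returned by the stub is an assumption made for the example.

```python
import json
from typing import Any, Union


class StubExtractClient:
    """Hypothetical stand-in for a Tavily client; makes no network calls."""

    def extract(
        self,
        urls: Union[str, list[str]],
        include_images: bool = False,
        extract_depth: str = "basic",
        timeout: int = 60,
    ) -> dict[str, Any]:
        # Accept a single URL or a list of URLs, mirroring the tool's schema.
        url_list = [urls] if isinstance(urls, str) else list(urls)
        # Assumed response shape, for illustration only.
        return {
            "results": [
                {"url": url, "raw_content": f"content of {url}"} for url in url_list
            ],
            "failed_results": [],
        }


def run_extract(
    client: StubExtractClient,
    urls: Union[str, list[str]],
    include_images: bool = False,
    extract_depth: str = "basic",
    timeout: int = 60,
) -> str:
    # Forward the configured options to the client, then serialize the
    # response as an indented JSON string.
    response = client.extract(
        urls=urls,
        include_images=include_images,
        extract_depth=extract_depth,
        timeout=timeout,
    )
    return json.dumps(response, indent=2)
```

Because the sketch returns a string rather than a dictionary, a caller that needs individual fields would decode it again with `json.loads`.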
Usage
Use this tool when a CrewAI agent needs to extract content from specific web pages. It supports both single-URL and batch extraction, which suits web content analysis workflows.
Code Reference
Source Location
- Repository: CrewAI
- File: lib/crewai-tools/src/crewai_tools/tools/tavily_extractor_tool/tavily_extractor_tool.py
- Lines: 1-176
Signature
class TavilyExtractorToolSchema(BaseModel):
    urls: list[str] | str = Field(..., description="The URL(s) to extract data from. ...")

class TavilyExtractorTool(BaseTool):
    name: str = "TavilyExtractorTool"
    description: str = "Extracts content from one or more web pages using the Tavily API. ..."
    args_schema: type[BaseModel] = TavilyExtractorToolSchema
    api_key: str | None = Field(default_factory=lambda: os.getenv("TAVILY_API_KEY"))
    proxies: dict[str, str] | None = None
    include_images: bool = False
    extract_depth: Literal["basic", "advanced"] = "basic"
    timeout: int = 60
    env_vars: list[EnvVar]  # TAVILY_API_KEY

    def _run(self, urls: list[str] | str) -> str:
        ...

    async def _arun(self, urls: list[str] | str) -> str:
        ...
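The `api_key` field in the signature uses a default factory, so the environment variable is read when an instance is created, and an explicitly passed key takes precedence. The sketch below illustrates that behaviour with a standard-library dataclass rather than the tool's own model class. `ExtractorConfig` is a hypothetical name and both key values are placeholders.

```python
import os
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ExtractorConfig:
    """Hypothetical stand-in for the tool's configuration fields."""

    # Read the environment variable at instantiation time, not at import time.
    api_key: Optional[str] = field(
        default_factory=lambda: os.getenv("TAVILY_API_KEY")
    )
    extract_depth: str = "basic"
    timeout: int = 60


# Placeholder value, set here only so the example is self-contained.
os.environ["TAVILY_API_KEY"] = "key-from-env"

from_env = ExtractorConfig()  # falls back to the environment variable
explicit = ExtractorConfig(api_key="key-from-argument")  # argument wins
```

In practice the key would be exported in the shell or loaded from a secrets store rather than assigned in code.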
Import
from crewai_tools import TavilyExtractorTool
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| urls | list[str] or str | Yes | The URL or list of URLs to extract data from |
Outputs
| Name | Type | Description |
|---|---|---|
| _run() / _arun() return value | str | JSON-formatted string containing the content extracted from the provided URLs |
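Since the output is a JSON string, downstream code typically decodes it before use. The following is a minimal sketch of that step. `summarize_extraction` is a hypothetical helper, and the `results`, `url`, and `raw_content` keys in the sample payload are assumptions about the response shape, so check them against a real response before relying on them.

```python
import json


def summarize_extraction(raw_json: str) -> dict[str, int]:
    """Map each extracted URL to the length of its extracted text."""
    payload = json.loads(raw_json)
    # Use .get() throughout so a missing key yields an empty result
    # instead of raising KeyError.
    return {
        item.get("url", ""): len(item.get("raw_content") or "")
        for item in payload.get("results", [])
    }


# Sample payload with an assumed shape, standing in for the tool's output.
sample = json.dumps(
    {
        "results": [{"url": "https://example.com", "raw_content": "Example Domain"}],
        "failed_results": [],
    }
)
summary = summarize_extraction(sample)
```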
Usage Examples
Basic Usage
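from crewai_tools import TavilyExtractorTool

# With no api_key argument, the key is read from the TAVILY_API_KEY environment variable.
tool = TavilyExtractorTool(extract_depth="basic")
# The result is returned as a JSON string.
result = tool._run(urls="https://example.com")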
Multiple URLs
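from crewai_tools import TavilyExtractorTool

# Use advanced extraction and request images alongside the page content.
tool = TavilyExtractorTool(extract_depth="advanced", include_images=True)
# Pass a list of URLs for batch extraction.
result = tool._run(urls=["https://example.com/page1", "https://example.com/page2"])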