Implementation:Langchain ai Langchain ExaSearchRetriever

From Leeroopedia
Knowledge Sources
Domains: Retrieval, Web Search, Exa
Last Updated: 2026-02-11 00:00 GMT

Overview

ExaSearchRetriever is a LangChain retriever that uses the Exa Search API to find and return web documents as LangChain Document objects.

Description

ExaSearchRetriever extends BaseRetriever from langchain_core to integrate the Exa Search API into the LangChain retrieval framework. It supports neural, keyword, and auto search types, with configurable options for domain filtering, date-based filtering, content highlighting, summaries, and live crawling. Results are returned as Document objects with rich metadata including title, URL, score, published date, author, highlights, and summaries.

Usage

Import this class when you need a LangChain-compatible retriever backed by the Exa Search API for web content retrieval in RAG pipelines or agent workflows.
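
In a RAG pipeline, the retrieved documents are usually flattened into a single context string before being handed to a model. The following is a minimal sketch of that step and is not part of langchain-exa: format_context is a hypothetical helper, and SimpleNamespace objects stand in for LangChain Document objects so the snippet runs without an API key. It assumes only the page_content attribute and the title and url metadata keys described on this page.

```python
from types import SimpleNamespace


def format_context(docs, max_chars=500):
    """Join retrieved documents into one numbered context string.

    Hypothetical helper: works on any object exposing `page_content`
    and a `metadata` dict, such as the Documents a retriever returns.
    """
    blocks = []
    for index, doc in enumerate(docs, start=1):
        # Metadata values may be missing or None, so fall back to placeholders.
        title = doc.metadata.get("title") or "Untitled"
        url = doc.metadata.get("url") or "no url"
        snippet = doc.page_content[:max_chars]
        blocks.append(f"[{index}] {title} ({url})\n{snippet}")
    return "\n\n".join(blocks)


# Stand-in documents; real ones would come from retriever.invoke(query).
sample_docs = [
    SimpleNamespace(
        page_content="Sample page text for the first result.",
        metadata={"title": "First page", "url": "https://example.com/first"},
    ),
    SimpleNamespace(page_content="Second result text.", metadata={"title": None}),
]

context = format_context(sample_docs)
print(context)
```

Numbering each block keeps the source of every snippet traceable, which makes it easier to ask the model for citations.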

Code Reference

Source Location

Signature

class ExaSearchRetriever(BaseRetriever):
    k: int = 10
    include_domains: list[str] | None = None
    exclude_domains: list[str] | None = None
    start_crawl_date: str | None = None
    end_crawl_date: str | None = None
    start_published_date: str | None = None
    end_published_date: str | None = None
    use_autoprompt: bool | None = None
    type: str = "neural"
    highlights: HighlightsContentsOptions | bool | None = None
    text_contents_options: TextContentsOptions | dict[str, Any] | Literal[True] = True
    livecrawl: Literal["always", "fallback", "never"] | None = None
    summary: bool | dict[str, str] | None = None
    client: Exa
    exa_api_key: SecretStr
    exa_base_url: str | None = None

Import

from langchain_exa import ExaSearchRetriever

I/O Contract

Inputs

Name Type Required Description
k int No Number of search results to return (1 to 100). Default: 10.
include_domains list[str] | None No Domains to include in the search.
exclude_domains list[str] | None No Domains to exclude from the search.
start_crawl_date str | None No Start date for the crawl (YYYY-MM-DD format).
end_crawl_date str | None No End date for the crawl (YYYY-MM-DD format).
start_published_date str | None No Start date for document publication (YYYY-MM-DD format).
end_published_date str | None No End date for document publication (YYYY-MM-DD format).
use_autoprompt bool | None No Whether to use autoprompt for the search.
type str No Search type: "neural", "keyword", or "auto". Default: "neural".
highlights HighlightsContentsOptions | bool | None No Whether to include highlights in the results.
text_contents_options TextContentsOptions | dict[str, Any] | Literal[True] No Options for the text used as page content. Default: True.
livecrawl Literal["always", "fallback", "never"] | None No Live crawl option for pages not in the index.
summary bool | dict[str, str] | None No Whether to include a content summary.
exa_api_key SecretStr No Exa API key. If not passed explicitly, it is read from the EXA_API_KEY environment variable.
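
Because the date inputs are plain strings and k has a documented range, it can help to validate options before constructing the retriever. The sketch below is one way to do that under stated assumptions: build_retriever_kwargs and normalize_date are hypothetical helpers, not part of langchain-exa, and the checks mirror only the constraints listed in the table above (k from 1 to 100, the three search types, YYYY-MM-DD dates). The start-before-end check is an added assumption.

```python
from __future__ import annotations

from datetime import datetime
from typing import Any

SEARCH_TYPES = ("neural", "keyword", "auto")


def normalize_date(value: str) -> str:
    """Parse a year-month-day string and return it zero-padded as YYYY-MM-DD."""
    # strptime raises ValueError for anything that is not a valid calendar date.
    return datetime.strptime(value, "%Y-%m-%d").date().isoformat()


def build_retriever_kwargs(
    k: int = 10,
    search_type: str = "neural",
    include_domains: list[str] | None = None,
    exclude_domains: list[str] | None = None,
    start_published_date: str | None = None,
    end_published_date: str | None = None,
) -> dict[str, Any]:
    """Hypothetical helper: validate options and drop the unset ones."""
    if not 1 <= k <= 100:
        raise ValueError(f"k must be between 1 and 100, got {k}")
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"type must be one of {SEARCH_TYPES}, got {search_type!r}")

    start = normalize_date(start_published_date) if start_published_date else None
    end = normalize_date(end_published_date) if end_published_date else None
    # Zero-padded ISO dates compare correctly as strings.
    if start and end and start > end:
        raise ValueError("start_published_date must not be after end_published_date")

    options = {
        "k": k,
        "type": search_type,
        "include_domains": include_domains,
        "exclude_domains": exclude_domains,
        "start_published_date": start,
        "end_published_date": end,
    }
    # Leave out unset options so the retriever's own defaults apply.
    return {name: value for name, value in options.items() if value is not None}


kwargs = build_retriever_kwargs(
    k=5,
    include_domains=["arxiv.org"],
    start_published_date="2024-1-5",
)
print(kwargs)
```

The resulting dictionary uses the field names from the signature, so it could be passed on as ExaSearchRetriever(**kwargs).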

Outputs

Name Type Description
documents list[Document] List of LangChain Document objects. page_content holds the result text; metadata holds title, url, id, score, published_date, author, highlights, highlight_scores, and summary.
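
The metadata keys listed above can be used to filter and rank results after retrieval. The following sketch shows one possible approach: rank_results is a hypothetical helper, SimpleNamespace objects stand in for Document objects, and treating a missing or None score as 0.0 is an assumption made here for illustration, as is the expectation that highlights may be absent when the option is not enabled.

```python
from types import SimpleNamespace


def rank_results(docs, min_score=0.0):
    """Return metadata rows for documents scoring at least `min_score`, best first.

    Hypothetical helper; a missing or None score is treated as 0.0.
    """
    rows = []
    for doc in docs:
        meta = doc.metadata
        score = meta.get("score") or 0.0
        if score < min_score:
            continue
        rows.append(
            {
                "title": meta.get("title") or "Untitled",
                "url": meta.get("url"),
                "score": score,
                # Assumed: highlights may be absent or None if not requested.
                "highlights": meta.get("highlights") or [],
            }
        )
    return sorted(rows, key=lambda row: row["score"], reverse=True)


# Stand-in documents; real ones would come from retriever.invoke(query).
sample_docs = [
    SimpleNamespace(
        page_content="low",
        metadata={"title": "Low", "url": "https://example.com/low", "score": 0.12},
    ),
    SimpleNamespace(
        page_content="high",
        metadata={
            "title": "High",
            "url": "https://example.com/high",
            "score": 0.87,
            "highlights": ["key sentence"],
        },
    ),
    SimpleNamespace(
        page_content="none",
        metadata={"title": None, "url": "https://example.com/none", "score": None},
    ),
]

ranked = rank_results(sample_docs, min_score=0.1)
for row in ranked:
    print(row["score"], row["title"], row["url"])
```

Using metadata.get rather than direct indexing keeps the helper from raising KeyError when an optional field is not present.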

Usage Examples

Basic Usage

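from langchain_exa import ExaSearchRetriever

# The API key is read from the EXA_API_KEY environment variable.
retriever = ExaSearchRetriever(
    k=5,  # number of search results to return
    type="neural",
)

# invoke() runs the search and returns a list of Document objects.
docs = retriever.invoke("latest advances in AI")
for doc in docs:
    print(doc.metadata["title"], doc.metadata["url"])
    print(doc.page_content[:200])  # first 200 characters of the result text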

Related Pages

  • Requires langchain-exa and exa-py packages
