Implementation:Langchain ai Langchain ExaSearchRetriever

Knowledge Sources	Langchain_ai_Langchain
Domains	Retrieval, Web Search, Exa
Last Updated	2026-02-11 00:00 GMT

Overview

ExaSearchRetriever is a LangChain retriever that uses the Exa Search API to find and return web documents as LangChain Document objects.

Description

ExaSearchRetriever extends BaseRetriever from langchain_core to integrate the Exa Search API into the LangChain retrieval framework. It supports neural, keyword, and auto search types, with configurable options for domain filtering, date-based filtering, content highlighting, summaries, and live crawling. Results are returned as Document objects with rich metadata including title, URL, score, published date, author, highlights, and summaries.

Usage

Import this class when you need a LangChain-compatible retriever backed by the Exa Search API for web content retrieval in RAG pipelines or agent workflows.

Code Reference

Source Location

Repository: Langchain_ai_Langchain
File: libs/partners/exa/langchain_exa/retrievers.py
Lines: 1-111

Signature

class ExaSearchRetriever(BaseRetriever):
    k: int = 10
    include_domains: list[str] | None = None
    exclude_domains: list[str] | None = None
    start_crawl_date: str | None = None
    end_crawl_date: str | None = None
    start_published_date: str | None = None
    end_published_date: str | None = None
    use_autoprompt: bool | None = None
    type: str = "neural"
    highlights: HighlightsContentsOptions | bool | None = None
    text_contents_options: TextContentsOptions | dict[str, Any] | Literal[True] = True
    livecrawl: Literal["always", "fallback", "never"] | None = None
    summary: bool | dict[str, str] | None = None
    client: Exa
    exa_api_key: SecretStr
    exa_base_url: str | None = None

Import

from langchain_exa import ExaSearchRetriever

I/O Contract

Inputs

Name	Type	Required	Description
k	`int`	No	Number of search results to return (1 to 100). Default: 10.
include_domains	`list[str] | None`	No	Domains to include in the search.
exclude_domains	`list[str] | None`	No	Domains to exclude from the search.
start_crawl_date	`str | None`	No	Start date for the crawl (YYYY-MM-DD format).
end_crawl_date	`str | None`	No	End date for the crawl (YYYY-MM-DD format).
start_published_date	`str | None`	No	Start date for document publication (YYYY-MM-DD format).
end_published_date	`str | None`	No	End date for document publication (YYYY-MM-DD format).
use_autoprompt	`bool | None`	No	Whether to use autoprompt for the search.
type	`str`	No	Search type: `"neural"`, `"keyword"`, or `"auto"`. Default: `"neural"`.
highlights	`HighlightsContentsOptions | bool | None`	No	Whether to include highlights in the results.
text_contents_options	`TextContentsOptions | dict[str, Any] | Literal[True]`	No	How to set page content. Default: `True`.
livecrawl	`Literal["always", "fallback", "never"] | None`	No	Live crawl option for pages not in the index.
summary	`bool | dict[str, str] | None`	No	Whether to include a content summary.
exa_api_key	`SecretStr`	No	Exa API key. Read from the `EXA_API_KEY` environment variable.
exa_base_url	`str | None`	No	Base URL for the Exa API. Default: `None`.
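
The date filters in the table expect `YYYY-MM-DD` strings, and `type` accepts three values. As a minimal sketch, the hypothetical helper `build_exa_kwargs` below (it is not part of `langchain_exa`) checks those constraints with the standard library and assembles keyword arguments using only field names from the signature. `make_retriever` is also hypothetical; it imports the package lazily and assumes `EXA_API_KEY` is set in the environment.

```python
from __future__ import annotations

from datetime import datetime
from typing import Any


def build_exa_kwargs(
    k: int = 10,
    include_domains: list[str] | None = None,
    start_published_date: str | None = None,
    end_published_date: str | None = None,
    search_type: str = "neural",
) -> dict[str, Any]:
    """Hypothetical helper: validate inputs and build constructor kwargs."""
    if not 1 <= k <= 100:
        raise ValueError("k must be between 1 and 100")
    if search_type not in {"neural", "keyword", "auto"}:
        raise ValueError(f"unsupported search type: {search_type!r}")

    parsed = []
    for value in (start_published_date, end_published_date):
        if value is not None:
            # Raises ValueError if the string is not a year-month-day date.
            parsed.append(datetime.strptime(value, "%Y-%m-%d"))
    if len(parsed) == 2 and parsed[0] > parsed[1]:
        raise ValueError("start_published_date is after end_published_date")

    kwargs: dict[str, Any] = {"k": k, "type": search_type}
    optional = {
        "include_domains": include_domains,
        "start_published_date": start_published_date,
        "end_published_date": end_published_date,
    }
    # Only pass the options that were actually set.
    kwargs.update({name: v for name, v in optional.items() if v is not None})
    return kwargs


def make_retriever(**kwargs: Any):
    """Build the retriever; needs langchain-exa installed and EXA_API_KEY set."""
    from langchain_exa import ExaSearchRetriever  # imported lazily

    return ExaSearchRetriever(**kwargs)
```

With the package installed, `make_retriever(**build_exa_kwargs(k=5, include_domains=["arxiv.org"]))` would construct a retriever restricted to one domain. Validating locally surfaces malformed dates before any API call is made.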

Outputs

Name	Type	Description
documents	`list[Document]`	List of LangChain `Document` objects with `page_content` from the result text and metadata containing title, URL, id, score, published_date, author, highlights, highlight_scores, and summary.
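
To make the output shape concrete, the sketch below maps a stand-in search result to the `page_content` and metadata keys listed above. `FakeResult` and `to_document_dict` are hypothetical names used for illustration, and a plain dict stands in for `Document` so the example runs without LangChain. Whether the library omits optional keys such as `highlights` or sets them to `None` is not stated here; this sketch omits them as an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class FakeResult:
    """Stand-in for one search result (attribute names are assumptions)."""

    title: str
    url: str
    id: str
    score: float | None = None
    published_date: str | None = None
    author: str | None = None
    text: str = ""
    highlights: list[str] | None = None
    highlight_scores: list[float] | None = None
    summary: str | None = None


def to_document_dict(result: FakeResult) -> dict[str, Any]:
    """Build a Document-shaped dict: result text plus metadata."""
    metadata: dict[str, Any] = {
        "title": result.title,
        "url": result.url,
        "id": result.id,
        "score": result.score,
        "published_date": result.published_date,
        "author": result.author,
    }
    # Assumption: optional content fields are added only when present.
    for key in ("highlights", "highlight_scores", "summary"):
        value = getattr(result, key)
        if value is not None:
            metadata[key] = value
    return {"page_content": result.text, "metadata": metadata}


result = FakeResult(
    title="Example page",
    url="https://example.com/page",
    id="doc-1",
    score=0.42,
    text="Body text of the page.",
    highlights=["Body text"],
    highlight_scores=[0.9],
)
doc = to_document_dict(result)
print(sorted(doc["metadata"]))
```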

Usage Examples

Basic Usage

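from langchain_exa import ExaSearchRetriever

# The API key is read from the EXA_API_KEY environment variable.
retriever = ExaSearchRetriever(
    k=5,  # number of results to return
    type="neural",
)

# invoke() runs the search and returns a list of Document objects.
docs = retriever.invoke("latest advances in AI")
for doc in docs:
    print(doc.metadata["title"], doc.metadata["url"])
    print(doc.page_content[:200])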
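
In a RAG pipeline, the retrieved documents are usually flattened into a single context string before being passed to a model. The following is a minimal sketch of that step. `format_docs` is a hypothetical helper, not part of `langchain_exa`, and `SimpleNamespace` objects stand in for `Document` so the example runs without an API key; it relies only on the `page_content` and `metadata` attributes described under Outputs.

```python
from types import SimpleNamespace


def format_docs(docs, max_chars: int = 500) -> str:
    """Join documents into a numbered context block with source URLs."""
    blocks = []
    for i, doc in enumerate(docs, start=1):
        # Fall back to a placeholder when the title is missing or None.
        title = doc.metadata.get("title") or "untitled"
        url = doc.metadata.get("url", "")
        snippet = doc.page_content[:max_chars]
        blocks.append(f"[{i}] {title} ({url})\n{snippet}")
    return "\n\n".join(blocks)


# Stand-ins exposing the same attributes as LangChain Document objects.
docs = [
    SimpleNamespace(
        page_content="alpha",
        metadata={"title": "A", "url": "https://example.com/a"},
    ),
    SimpleNamespace(
        page_content="beta",
        metadata={"title": None, "url": "https://example.com/b"},
    ),
]

context = format_docs(docs)
print(context)
```

The same function should work unchanged on the list returned by `retriever.invoke(...)`, since it touches nothing beyond those two attributes.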

Related Pages

Requires langchain-exa and exa-py packages
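
A quick preflight check can confirm these requirements before a retriever is constructed. The sketch below is a hypothetical helper, not part of either package. It assumes the import names are `langchain_exa` and `exa_py`, and that the API key is supplied through `EXA_API_KEY` as described under Inputs.

```python
import importlib.util
import os


def missing_requirements(env=None) -> list[str]:
    """Return a list of human-readable problems; empty means ready."""
    env = os.environ if env is None else env
    problems = []
    # Assumed import names for the langchain-exa and exa-py distributions.
    for module in ("langchain_exa", "exa_py"):
        if importlib.util.find_spec(module) is None:
            problems.append(f"module not installed: {module}")
    if not env.get("EXA_API_KEY"):
        problems.append("environment variable not set: EXA_API_KEY")
    return problems


for problem in missing_requirements():
    print(problem)
```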
