Implementation:Langchain ai Langchain ExaSearchRetriever
| Knowledge Sources | |
|---|---|
| Domains | Retrieval, Web Search, Exa |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
ExaSearchRetriever is a LangChain retriever that uses the Exa Search API to find and return web documents as LangChain Document objects.
Description
ExaSearchRetriever extends BaseRetriever from langchain_core to integrate the Exa Search API into the LangChain retrieval framework. It supports neural, keyword, and auto search types, with configurable options for domain filtering, date-based filtering, content highlighting, summaries, and live crawling. Results are returned as Document objects with rich metadata including title, URL, score, published date, author, highlights, and summaries.
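The filtering options described above can be assembled as plain keyword arguments before the retriever is constructed. The sketch below is illustrative only: `build_exa_filters` is a hypothetical helper, not part of langchain_exa, and it covers just a subset of the fields. It checks the search type against the three documented values and renders dates as YYYY-MM-DD strings, the format this page documents for the date fields.

```python
from __future__ import annotations

from datetime import date
from typing import Any


def build_exa_filters(
    include_domains: list[str] | None = None,
    start_published_date: date | None = None,
    end_published_date: date | None = None,
    search_type: str = "neural",
) -> dict[str, Any]:
    """Hypothetical helper: build keyword arguments for ExaSearchRetriever."""
    if search_type not in {"neural", "keyword", "auto"}:
        raise ValueError(f"unsupported search type: {search_type!r}")
    if (
        start_published_date is not None
        and end_published_date is not None
        and start_published_date > end_published_date
    ):
        raise ValueError("start_published_date must not be after end_published_date")

    kwargs: dict[str, Any] = {"type": search_type}
    if include_domains:
        kwargs["include_domains"] = list(include_domains)
    # date.isoformat() yields YYYY-MM-DD strings.
    if start_published_date is not None:
        kwargs["start_published_date"] = start_published_date.isoformat()
    if end_published_date is not None:
        kwargs["end_published_date"] = end_published_date.isoformat()
    return kwargs


filters = build_exa_filters(
    include_domains=["arxiv.org"],
    start_published_date=date(2024, 1, 1),
    end_published_date=date(2024, 12, 31),
)
print(filters)
```

Assuming the package is installed and an API key is configured, the resulting dictionary could then be passed as `ExaSearchRetriever(k=5, **filters)`.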
Usage
Import this class when you need a LangChain-compatible retriever backed by the Exa Search API for web content retrieval in RAG pipelines or agent workflows.
Code Reference
Source Location
- Repository: Langchain_ai_Langchain
- File: libs/partners/exa/langchain_exa/retrievers.py
- Lines: 1-111
Signature
class ExaSearchRetriever(BaseRetriever):
    k: int = 10
    include_domains: list[str] | None = None
    exclude_domains: list[str] | None = None
    start_crawl_date: str | None = None
    end_crawl_date: str | None = None
    start_published_date: str | None = None
    end_published_date: str | None = None
    use_autoprompt: bool | None = None
    type: str = "neural"
    highlights: HighlightsContentsOptions | bool | None = None
    text_contents_options: TextContentsOptions | dict[str, Any] | Literal[True] = True
    livecrawl: Literal["always", "fallback", "never"] | None = None
    summary: bool | dict[str, str] | None = None
    client: Exa
    exa_api_key: SecretStr
    exa_base_url: str | None = None
Import
from langchain_exa import ExaSearchRetriever
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| k | int | No | Number of search results to return (1 to 100). Default: 10. |
| include_domains | list[str] \| None | No | Domains to include in the search. |
| exclude_domains | list[str] \| None | No | Domains to exclude from the search. |
| start_crawl_date | str \| None | No | Start date for the crawl (YYYY-MM-DD format). |
| end_crawl_date | str \| None | No | End date for the crawl (YYYY-MM-DD format). |
| start_published_date | str \| None | No | Start date for document publication (YYYY-MM-DD format). |
| end_published_date | str \| None | No | End date for document publication (YYYY-MM-DD format). |
| use_autoprompt | bool \| None | No | Whether to use autoprompt for the search. |
| type | str | No | Search type: "neural", "keyword", or "auto". Default: "neural". |
| highlights | HighlightsContentsOptions \| bool \| None | No | Whether to include highlights in the results. |
| text_contents_options | TextContentsOptions \| dict[str, Any] \| Literal[True] | No | How to set page content. Default: True. |
| livecrawl | Literal["always", "fallback", "never"] \| None | No | Live crawl option for pages not in the index. |
| summary | bool \| dict[str, str] \| None | No | Whether to include a content summary. |
| exa_api_key | SecretStr | No | Exa API key. Read from the EXA_API_KEY environment variable. |
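Two of the inputs above lend themselves to a quick pre-flight check: the API key, which is read from the EXA_API_KEY environment variable, and k, which is documented as ranging from 1 to 100. The following is a minimal sketch under those assumptions. Both `resolve_exa_api_key` and `clamp_k` are hypothetical helpers written for this page, and the retriever's own validation may behave differently.

```python
from __future__ import annotations

import os


def resolve_exa_api_key(explicit_key: str | None = None) -> str:
    """Hypothetical helper: prefer an explicit key, else fall back to EXA_API_KEY."""
    key = explicit_key or os.environ.get("EXA_API_KEY", "")
    if not key:
        raise RuntimeError(
            "Set EXA_API_KEY or pass an API key explicitly before creating the retriever."
        )
    return key


def clamp_k(k: int) -> int:
    """Hypothetical helper: keep k inside the documented 1 to 100 range."""
    return max(1, min(100, k))


print(clamp_k(250))  # -> 100
```

Failing early with a clear message tends to be easier to debug than an authentication error surfacing from inside a retrieval call.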
Outputs
| Name | Type | Description |
|---|---|---|
| documents | list[Document] | List of LangChain Document objects with page_content from the result text and metadata containing title, URL, id, score, published_date, author, highlights, highlight_scores, and summary. |
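Because each result carries its title, URL, and score in metadata, the returned list is easy to turn into a numbered source list for a prompt or a log. The sketch below shows one way to do that. `FakeDocument` is a stand-in for the real Document class so the example runs without network access or the LangChain packages, `format_sources` is a hypothetical helper, and the sample data is made up. The metadata keys `title`, `url`, and `score` are assumed to match the names listed in the outputs table.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeDocument:
    """Stand-in for a LangChain Document so this sketch runs offline."""

    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def format_sources(docs, min_score: float = 0.0, max_chars: int = 200) -> str:
    """Render documents as a numbered source list, skipping low-scoring results."""
    lines = []
    for doc in docs:
        score = doc.metadata.get("score")
        # A missing score is treated as "keep", since a result may not carry one.
        if score is not None and score < min_score:
            continue
        title = doc.metadata.get("title") or "(untitled)"
        url = doc.metadata.get("url", "")
        snippet = doc.page_content[:max_chars].strip()
        lines.append(f"[{len(lines) + 1}] {title} <{url}>\n{snippet}")
    return "\n\n".join(lines)


# Made-up sample data, for illustration only.
sample = [
    FakeDocument(
        "First page text.",
        {"title": "First", "url": "https://example.com/a", "score": 0.9},
    ),
    FakeDocument(
        "Second page text.",
        {"title": None, "url": "https://example.com/b", "score": 0.1},
    ),
]
print(format_sources(sample, min_score=0.5))
```

The same function should work on real retriever output, since it only relies on the `page_content` and `metadata` attributes.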
Usage Examples
Basic Usage
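from langchain_exa import ExaSearchRetriever

# The API key is read from the EXA_API_KEY environment variable.
retriever = ExaSearchRetriever(
    k=5,
    type="neural",
)

docs = retriever.invoke("latest advances in AI")

# Print the title, URL, and first 200 characters of each result.
for doc in docs:
    print(doc.metadata["title"], doc.metadata["url"])
    print(doc.page_content[:200])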
Related Pages
- Requires the langchain-exa and exa-py packages