Implementation:Run llama Llama index BaseRetriever
| Knowledge Sources | |
|---|---|
| Domains | LLM Framework, Retrieval |
| Last Updated | 2026-02-11 19:00 GMT |
Overview
BaseRetriever is the foundational abstract base class for all retrievers in LlamaIndex, providing the core retrieval interface with callback management, instrumentation, recursive object retrieval, and deduplication.
Description
The BaseRetriever class inherits from PromptMixin and DispatcherSpanMixin. It is the most widely subclassed retriever base in LlamaIndex and provides the following:
Constructor parameters:
- callback_manager -- Optional CallbackManager for tracing; defaults to an empty manager.
- object_map -- Optional dictionary mapping index IDs to retrievable objects.
- objects -- Optional list of IndexNode objects; if provided, these are converted into object_map entries.
- verbose -- Boolean flag to enable verbose output during retrieval.
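How the two optional arguments collapse into a single mapping can be sketched in plain Python. This is a simplified stand-in rather than the library's code: FakeIndexNode and build_object_map are hypothetical names, and the precedence shown (entries built from objects replace any object_map passed alongside) is an assumption to verify against the installed version.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class FakeIndexNode:
    """Hypothetical stand-in for IndexNode: an ID plus the object it points to."""

    index_id: str
    obj: Any


def build_object_map(
    object_map: Optional[Dict[str, Any]] = None,
    objects: Optional[List[FakeIndexNode]] = None,
) -> Dict[str, Any]:
    """Resolve the two optional constructor arguments into one dictionary."""
    if objects is not None:
        # Assumption: a list of objects takes precedence over object_map.
        object_map = {node.index_id: node.obj for node in objects}
    # Fall back to an empty mapping when neither argument is given.
    return object_map or {}


mapping = build_object_map(objects=[FakeIndexNode("sub-1", "a sub-retriever")])
print(mapping)
```

Under this sketch, passing neither argument yields an empty dictionary, so later lookups by index ID simply find nothing.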
Public methods:
- retrieve(str_or_query_bundle) -- The main synchronous entry point. Converts strings to QueryBundle, fires RetrievalStartEvent and RetrievalEndEvent, wraps execution in callback traces, calls the abstract _retrieve method, and then handles recursive retrieval via _handle_recursive_retrieval.
- aretrieve(str_or_query_bundle) -- Async counterpart of retrieve.
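The split between the public retrieve() entry point and the _retrieve hook follows the template-method pattern. The sketch below illustrates that control flow with the standard library only; FakeQueryBundle, SketchRetriever, EchoRetriever, and _post_process are hypothetical names, and the events, callback traces, and recursive resolution of the real class are reduced to comments.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Union


@dataclass
class FakeQueryBundle:
    """Hypothetical stand-in for QueryBundle."""

    query_str: str


class SketchRetriever(ABC):
    """Hypothetical base class mirroring the retrieve() / _retrieve() split."""

    def retrieve(self, str_or_query_bundle: Union[str, FakeQueryBundle]) -> List[str]:
        # Step 1: normalise plain strings into a query bundle.
        if isinstance(str_or_query_bundle, str):
            query_bundle = FakeQueryBundle(str_or_query_bundle)
        else:
            query_bundle = str_or_query_bundle
        # Step 2: delegate to the subclass hook. In the real class this call
        # is wrapped in start/end events and callback traces.
        nodes = self._retrieve(query_bundle)
        # Step 3: post-process. The real class resolves IndexNode references here.
        return self._post_process(nodes)

    @abstractmethod
    def _retrieve(self, query_bundle: FakeQueryBundle) -> List[str]:
        """Core retrieval logic supplied by each subclass."""

    def _post_process(self, nodes: List[str]) -> List[str]:
        # Placeholder: returns the nodes unchanged in this sketch.
        return nodes


class EchoRetriever(SketchRetriever):
    def _retrieve(self, query_bundle: FakeQueryBundle) -> List[str]:
        return [f"result for {query_bundle.query_str}"]


print(EchoRetriever().retrieve("hello"))
print(EchoRetriever().retrieve(FakeQueryBundle("hello")))
```

Because normalisation happens in the public method, subclasses only ever see a query bundle and never have to handle raw strings.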
Recursive retrieval:
- _handle_recursive_retrieval and _ahandle_recursive_retrieval iterate over the retrieved nodes. For any IndexNode that references an object in the object_map, they retrieve from that object recursively. Supported object types include NodeWithScore, BaseNode, BaseQueryEngine, and BaseRetriever.
- After recursive resolution, duplicates are removed based on node hash (sync) or hash + ref_doc_id (async).
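The practical difference between the two deduplication keys can be seen in a small order-preserving filter. This is an illustrative sketch, not the library's implementation: FakeNode and dedupe are hypothetical, and the two key functions simply mirror the hash-only and hash-plus-ref_doc_id variants described above.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, List, Optional


@dataclass
class FakeNode:
    """Hypothetical stand-in for a retrieved node."""

    hash: str
    ref_doc_id: Optional[str] = None


def dedupe(nodes: List[FakeNode], key: Callable[[FakeNode], Hashable]) -> List[FakeNode]:
    """Keep the first node seen for each key, preserving the original order."""
    seen = set()
    unique = []
    for node in nodes:
        k = key(node)
        if k not in seen:
            seen.add(k)
            unique.append(node)
    return unique


nodes = [
    FakeNode(hash="h1", ref_doc_id="doc-a"),
    FakeNode(hash="h1", ref_doc_id="doc-b"),  # same content, different source document
    FakeNode(hash="h2", ref_doc_id="doc-a"),
]

by_hash = dedupe(nodes, key=lambda n: n.hash)
by_hash_and_doc = dedupe(nodes, key=lambda n: (n.hash, n.ref_doc_id))
print(len(by_hash), len(by_hash_and_doc))
```

With a hash-only key, identical content from two source documents collapses into one result; adding ref_doc_id to the key keeps both.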
Object retrieval helpers:
- _retrieve_from_object and _aretrieve_from_object handle dispatching retrieval to different object types (nodes, query engines, sub-retrievers).
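Dispatching on the type of the referenced object might look roughly like the following. It is a sketch under stated assumptions: ScoredText, FakeQueryEngine, FakeSubRetriever, and retrieve_from_object are hypothetical stand-ins, with a tuple playing the role of NodeWithScore and a bare string playing the role of BaseNode.

```python
from typing import Any, List, Tuple

ScoredText = Tuple[str, float]  # hypothetical stand-in for NodeWithScore


class FakeQueryEngine:
    """Hypothetical query engine that answers with a string."""

    def query(self, query: str) -> str:
        return f"answer to {query}"


class FakeSubRetriever:
    """Hypothetical sub-retriever that returns its own scored results."""

    def retrieve(self, query: str) -> List[ScoredText]:
        return [(f"hit for {query}", 0.5)]


def retrieve_from_object(obj: Any, query: str, score: float) -> List[ScoredText]:
    """Turn whatever an IndexNode points at into a list of scored results."""
    if isinstance(obj, tuple):
        # Already a scored node: return it unchanged.
        return [obj]
    if isinstance(obj, str):
        # A bare node: attach the score of the IndexNode that referenced it.
        return [(obj, score)]
    if isinstance(obj, FakeQueryEngine):
        # A query engine: wrap its response text in a new scored node.
        return [(obj.query(query), score)]
    if isinstance(obj, FakeSubRetriever):
        # A sub-retriever: delegate and keep the scores it assigns.
        return obj.retrieve(query)
    raise ValueError(f"Unsupported object type: {type(obj).__name__}")


print(retrieve_from_object(FakeSubRetriever(), "llamas", score=0.9))
print(retrieve_from_object(FakeQueryEngine(), "llamas", score=0.9))
```

Raising on unknown types, as this sketch does, surfaces a misconfigured object map immediately instead of silently dropping results.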
Abstract method (must be implemented by subclasses):
- _retrieve(query_bundle) -- The core synchronous retrieval logic.
Virtual method (can be overridden):
- _aretrieve(query_bundle) -- Async retrieval; defaults to calling _retrieve synchronously.
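The fallback from the async hook to the sync one has a practical consequence: a retriever that implements only _retrieve still works when awaited, but the synchronous call runs inline and blocks the event loop while it executes. The sketch below shows the pattern with asyncio alone; SyncOnlyRetriever and NativeAsyncRetriever are hypothetical classes, not part of LlamaIndex.

```python
import asyncio
from typing import List


class SyncOnlyRetriever:
    """Hypothetical retriever that implements only the synchronous hook."""

    def _retrieve(self, query: str) -> List[str]:
        return [f"sync result for {query}"]

    async def _aretrieve(self, query: str) -> List[str]:
        # Default behaviour: call the sync hook directly. Nothing is awaited,
        # so the event loop is blocked for the duration of the call.
        return self._retrieve(query)


class NativeAsyncRetriever(SyncOnlyRetriever):
    """Hypothetical retriever that overrides the async hook with real async work."""

    async def _aretrieve(self, query: str) -> List[str]:
        await asyncio.sleep(0)  # placeholder for non-blocking I/O
        return [f"async result for {query}"]


print(asyncio.run(SyncOnlyRetriever()._aretrieve("q")))
print(asyncio.run(NativeAsyncRetriever()._aretrieve("q")))
```

Overriding the async hook is worthwhile mainly when the underlying store or API offers a non-blocking client.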
Usage
Subclass BaseRetriever to create any custom retriever. All built-in LlamaIndex retrievers (vector, keyword, knowledge graph, etc.) extend this class. Use the public retrieve() method as the primary interface. The recursive retrieval feature enables composable retriever architectures where IndexNode objects can reference sub-retrievers or query engines.
Code Reference
Source Location
- Repository: Run_llama_Llama_index
- File: llama-index-core/llama_index/core/base/base_retriever.py
- Lines: 1-274
Signature
class BaseRetriever(PromptMixin, DispatcherSpanMixin):
    """Base retriever."""

    def __init__(
        self,
        callback_manager: Optional[CallbackManager] = None,
        object_map: Optional[Dict] = None,
        objects: Optional[List[IndexNode]] = None,
        verbose: bool = False,
    ) -> None: ...

    def retrieve(self, str_or_query_bundle: QueryType) -> List[NodeWithScore]: ...

    async def aretrieve(self, str_or_query_bundle: QueryType) -> List[NodeWithScore]: ...

    @abstractmethod
    def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]: ...

    async def _aretrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]: ...
Import
from llama_index.core.base.base_retriever import BaseRetriever
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| callback_manager | Optional[CallbackManager] | No | Callback manager for event tracing. Defaults to an empty CallbackManager. |
| object_map | Optional[Dict] | No | Mapping from index IDs to retrievable objects (BaseRetriever, BaseQueryEngine, BaseNode, etc.). |
| objects | Optional[List[IndexNode]] | No | List of IndexNode objects from which to build the object_map. Can be passed instead of object_map. |
| verbose | bool | No | If True, prints retrieval progress information. Defaults to False. |
| str_or_query_bundle | QueryType (str or QueryBundle) | Yes | The query to retrieve against. Passed to retrieve() or aretrieve(). |
Outputs
| Name | Type | Description |
|---|---|---|
| return | List[NodeWithScore] | A list of retrieved nodes with associated relevance scores, deduplicated after recursive retrieval. |
Usage Examples
Basic Usage
from typing import List

from llama_index.core.base.base_retriever import BaseRetriever
from llama_index.core.schema import NodeWithScore, QueryBundle, TextNode


class SimpleRetriever(BaseRetriever):
    def __init__(self, nodes: List[TextNode]):
        super().__init__()
        self._nodes = nodes

    def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
        # Simple keyword matching: keep nodes whose text contains the query string
        results = []
        for node in self._nodes:
            if query_bundle.query_str.lower() in node.text.lower():
                results.append(NodeWithScore(node=node, score=1.0))
        return results


# Usage
nodes = [TextNode(text="LlamaIndex is a data framework for LLM applications.")]
retriever = SimpleRetriever(nodes=nodes)
results = retriever.retrieve("LlamaIndex")
for r in results:
    print(r.node.text, r.score)
Recursive Retrieval with Object Map
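from llama_index.core.schema import IndexNode, TextNode

# Build a sub-retriever (reuses SimpleRetriever from the basic example)
sub_nodes = [TextNode(text="LlamaIndex retrievers return scored nodes.")]
sub_retriever = SimpleRetriever(nodes=sub_nodes)

# The parent holds an IndexNode that points at the sub-retriever by index_id
index_node = IndexNode(text="Sub-index covering LlamaIndex retrievers", index_id="sub-1")
parent_retriever = SimpleRetriever(nodes=[index_node])

# Register the sub-retriever under that index_id in the object map
parent_retriever.object_map = {"sub-1": sub_retriever}

# The query has to match the IndexNode text for the parent to return it.
# When it does, the parent looks up "sub-1" and delegates to sub_retriever.
results = parent_retriever.retrieve("LlamaIndex")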