Implementation:Run llama Llama index RecursiveRetriever
Overview
The RecursiveRetriever implements a graph-based recursive retrieval strategy. Starting from a root retriever, it follows IndexNode links to other retrievers, query engines, or plain nodes, recursively expanding the retrieval graph. This allows building multi-level retrieval pipelines where results from one retriever can reference other retrievers or query engines.
Source File: llama-index-core/llama_index/core/retrievers/recursive_retriever.py (221 lines)
Module: llama_index.core.retrievers.recursive_retriever
Class Definition
class RecursiveRetriever(BaseRetriever):
    """
    Recursive retriever.
    This retriever will recursively explore links from nodes to other
    retrievers/query engines.
    For any retrieved nodes, if any of the nodes are IndexNodes,
    then it will explore the linked retriever/query engine, and query that.
    """
Type Alias
RQN_TYPE = Union[BaseRetriever, BaseQueryEngine, BaseNode]
This union type represents the three kinds of objects that can be referenced in the recursive graph: retrievers, query engines, and plain nodes.
Dependencies
| Module | Import |
|---|---|
| llama_index.core.base.base_query_engine | BaseQueryEngine |
| llama_index.core.base.base_retriever | BaseRetriever |
| llama_index.core.callbacks.base | CallbackManager |
| llama_index.core.callbacks.schema | CBEventType, EventPayload |
| llama_index.core.schema | BaseNode, IndexNode, NodeWithScore, QueryBundle, TextNode |
| llama_index.core.utils | print_text |
Constants
DEFAULT_QUERY_RESPONSE_TMPL = "Query: {query_str}\nResponse: {response}"
Template used to format query engine responses as text nodes.
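As a minimal sketch of how such a template can be filled, the snippet below applies Python's str.format to the constant quoted above. The query and response strings are hypothetical, and the assumption that the retriever fills the template this way is not confirmed by this page.

```python
# Template string as quoted above.
DEFAULT_QUERY_RESPONSE_TMPL = "Query: {query_str}\nResponse: {response}"

# Hypothetical query and response text, for illustration only.
formatted = DEFAULT_QUERY_RESPONSE_TMPL.format(
    query_str="What is the revenue?",
    response="Revenue was 10M.",
)
print(formatted)
# Query: What is the revenue?
# Response: Revenue was 10M.
```

A custom query_response_tmpl would need to use the same two placeholder names for this style of formatting to work.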
Constructor
def __init__(
    self,
    root_id: str,
    retriever_dict: Dict[str, BaseRetriever],
    query_engine_dict: Optional[Dict[str, BaseQueryEngine]] = None,
    node_dict: Optional[Dict[str, BaseNode]] = None,
    callback_manager: Optional[CallbackManager] = None,
    query_response_tmpl: Optional[str] = None,
    verbose: bool = False,
) -> None
| Parameter | Type | Default | Description |
|---|---|---|---|
| root_id | str | required | The ID of the root retriever in retriever_dict that serves as the entry point |
| retriever_dict | Dict[str, BaseRetriever] | required | Mapping of string IDs to retriever instances |
| query_engine_dict | Optional[Dict[str, BaseQueryEngine]] | None | Mapping of string IDs to query engine instances |
| node_dict | Optional[Dict[str, BaseNode]] | None | Mapping of string IDs to plain node instances |
| callback_manager | Optional[CallbackManager] | None | Optional callback manager |
| query_response_tmpl | Optional[str] | None | Template for formatting query engine responses |
| verbose | bool | False | Whether to print verbose retrieval info |
Validation:
- The root_id must exist in retriever_dict (raises ValueError otherwise).
- Keys in retriever_dict and query_engine_dict must not overlap (raises ValueError otherwise).
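The two checks can be sketched as a standalone function. This is a simplified stand-in, not the library's code: validate_ids is a hypothetical helper, the dictionary values are arbitrary placeholders, and the error messages are invented for illustration.

```python
from typing import Any, Dict, Optional


def validate_ids(
    root_id: str,
    retriever_dict: Dict[str, Any],
    query_engine_dict: Optional[Dict[str, Any]] = None,
) -> None:
    """Hypothetical stand-in for the constructor checks described above."""
    query_engine_dict = query_engine_dict or {}
    # Check 1: the entry point must be a registered retriever.
    if root_id not in retriever_dict:
        raise ValueError(f"Root id {root_id!r} not in retriever_dict.")
    # Check 2: an ID may not name both a retriever and a query engine.
    overlap = set(retriever_dict) & set(query_engine_dict)
    if overlap:
        raise ValueError(f"IDs used by both dictionaries: {sorted(overlap)}")


# Passes: the root exists and the key sets are disjoint.
validate_ids("root", {"root": object()}, {"table_qe": object()})
```

Rejecting overlapping keys keeps each ID unambiguous when an IndexNode link is later resolved by ID alone.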
Core Methods
_retrieve
def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]
Implements the retrieval hook that the public retrieve() method inherited from BaseRetriever invokes. Calls _retrieve_rec with no initial query ID (which defaults to the root), returning only the primary retrieved nodes and discarding the additional source nodes.
retrieve_all
def retrieve_all(
    self, query_bundle: QueryBundle
) -> Tuple[List[NodeWithScore], List[NodeWithScore]]
Extended retrieval that returns both the primary retrieved nodes and additional source nodes (e.g., source nodes from sub-query engine responses).
_retrieve_rec
def _retrieve_rec(
    self,
    query_bundle: QueryBundle,
    query_id: Optional[str] = None,
    cur_similarity: Optional[float] = None,
) -> Tuple[List[NodeWithScore], List[NodeWithScore]]
The core recursive method. Behavior depends on the type of object resolved by query_id:
| Object Type | Behavior |
|---|---|
| BaseNode | Returns the node directly with the current similarity score |
| BaseRetriever | Calls retrieve(), then recursively processes results via _query_retrieved_nodes() |
| BaseQueryEngine | Calls query(), formats the response as a TextNode using the query response template, and collects source nodes |
_query_retrieved_nodes
def _query_retrieved_nodes(
    self, query_bundle: QueryBundle, nodes_with_score: List[NodeWithScore]
) -> Tuple[List[NodeWithScore], List[NodeWithScore]]
Processes retrieved nodes. For each node:
- If it is an IndexNode, follows the link by recursively calling _retrieve_rec with the node's index_id.
- If it is a TextNode, keeps it as-is.
Deduplicates IndexNode references to the same index_id before recursing. After recursion, deduplicates all results by node ID.
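The first deduplication step (collapsing repeated links before recursing) might look like the sketch below. Node, IndexRef, and Scored are hypothetical stand-ins for the library's node classes, and drop_repeated_links is an illustrative name, so this shows the idea rather than the actual method body.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Node:
    """Stand-in for a plain text node."""
    id_: str


@dataclass
class IndexRef(Node):
    """Stand-in for an IndexNode: a node that points at another object by ID."""
    index_id: str = ""


@dataclass
class Scored:
    """Stand-in for NodeWithScore."""
    node: Node
    score: Optional[float] = None


def drop_repeated_links(results: List[Scored]) -> List[Scored]:
    """Keep only the first link to each index_id; plain nodes pass through."""
    seen: Set[str] = set()
    kept: List[Scored] = []
    for item in results:
        if isinstance(item.node, IndexRef):
            if item.node.index_id in seen:
                continue  # this target is already scheduled for expansion
            seen.add(item.node.index_id)
        kept.append(item)
    return kept


results = [
    Scored(IndexRef("a", index_id="table_1"), 0.9),
    Scored(Node("b"), 0.8),
    Scored(IndexRef("c", index_id="table_1"), 0.7),
]
print([item.node.id_ for item in drop_repeated_links(results)])  # ['a', 'b']
```

Collapsing the links first means each linked retriever or query engine is queried at most once per expansion step, even when several retrieved nodes point at it.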
_get_object
def _get_object(self, query_id: str) -> RQN_TYPE
Resolves a query_id to its corresponding object by searching in order: node_dict, retriever_dict, query_engine_dict. Raises ValueError if not found.
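The lookup order can be illustrated with plain dictionaries. This is a sketch under stated assumptions: get_object is a hypothetical free function, the registry values are placeholder strings, and the error message is invented.

```python
from typing import Any, Dict


def get_object(
    query_id: str,
    node_dict: Dict[str, Any],
    retriever_dict: Dict[str, Any],
    query_engine_dict: Dict[str, Any],
) -> Any:
    """Return the first match, checking nodes, then retrievers, then query engines."""
    for registry in (node_dict, retriever_dict, query_engine_dict):
        if query_id in registry:
            return registry[query_id]
    raise ValueError(f"Query id {query_id} not found.")


nodes = {"shared": "plain node"}
retrievers = {"shared": "retriever", "root": "root retriever"}
engines = {"table_qe": "query engine"}

print(get_object("root", nodes, retrievers, engines))      # root retriever
print(get_object("table_qe", nodes, retrievers, engines))  # query engine
print(get_object("shared", nodes, retrievers, engines))    # plain node
```

Because the validation described earlier only covers retriever_dict and query_engine_dict, an ID that also appears in node_dict would resolve to the node under this ordering, as the last call shows.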
_deduplicate_nodes
def _deduplicate_nodes(
    self, nodes_with_score: List[NodeWithScore]
) -> List[NodeWithScore]
Removes duplicate nodes by node.id_, keeping the first occurrence of each ID. When the input is sorted by descending score, the kept copy is therefore the highest-scoring one.
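A first-occurrence deduplication of this kind can be sketched with (node_id, score) tuples standing in for NodeWithScore objects. The function name and data are hypothetical.

```python
from typing import List, Set, Tuple

# Each entry is a (node_id, score) pair standing in for a NodeWithScore.
ScoredId = Tuple[str, float]


def deduplicate(scored: List[ScoredId]) -> List[ScoredId]:
    """Keep the first entry seen for each node ID, preserving input order."""
    seen: Set[str] = set()
    unique: List[ScoredId] = []
    for node_id, score in scored:
        if node_id not in seen:
            seen.add(node_id)
            unique.append((node_id, score))
    return unique


ranked = [("n1", 0.9), ("n2", 0.8), ("n1", 0.5)]
print(deduplicate(ranked))  # [('n1', 0.9), ('n2', 0.8)]

# With unsorted input the surviving copy is simply the earliest one.
unsorted_input = [("n1", 0.5), ("n1", 0.9)]
print(deduplicate(unsorted_input))  # [('n1', 0.5)]
```

The second call shows why ordering matters: the pass never compares scores, so a lower-scored copy survives if it comes first.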
Recursion Flow
_retrieve(query)
  -> _retrieve_rec(query, root_id)
    -> retriever.retrieve(query)
    -> for each result node:
         if IndexNode:
           -> _retrieve_rec(query, node.index_id)
              -> (may call another retriever, query engine, or return a node)
         if TextNode:
           -> return node directly
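The flow above can be traced end to end with a toy model. Everything here is a hypothetical stand-in: Node, IndexRef, Retriever, and QueryEngine mimic only the behavior this page describes, a single registry dict replaces the three dictionaries, and the score of 1.0 given to query engine answers is an assumption of the sketch. Deduplication, callbacks, and additional source nodes are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

TMPL = "Query: {query_str}\nResponse: {response}"


@dataclass
class Node:
    id_: str
    text: str = ""


@dataclass
class IndexRef(Node):
    index_id: str = ""  # ID of the object this node links to


@dataclass
class Retriever:
    results: List[Tuple[Node, float]]  # canned (node, score) pairs

    def retrieve(self, query: str) -> List[Tuple[Node, float]]:
        return list(self.results)


@dataclass
class QueryEngine:
    answer: str  # canned response text

    def query(self, query: str) -> str:
        return self.answer


def retrieve_rec(
    query: str,
    query_id: str,
    registry: Dict[str, object],
    cur_score: Optional[float] = None,
) -> List[Tuple[Node, Optional[float]]]:
    obj = registry[query_id]
    if isinstance(obj, Retriever):
        out: List[Tuple[Node, Optional[float]]] = []
        for node, score in obj.retrieve(query):
            if isinstance(node, IndexRef):
                # Follow the link, carrying the link's score downward.
                out.extend(retrieve_rec(query, node.index_id, registry, score))
            else:
                out.append((node, score))
        return out
    if isinstance(obj, QueryEngine):
        text = TMPL.format(query_str=query, response=obj.query(query))
        # Assumed placeholder score for answer nodes.
        return [(Node(id_=f"answer:{query_id}", text=text), 1.0)]
    # Plain node: return it with the score of the link that led here.
    return [(obj, cur_score)]


registry: Dict[str, object] = {
    "root": Retriever([
        (Node("intro", "Intro text"), 0.9),
        (IndexRef("link_tbl", index_id="table_qe"), 0.8),
        (IndexRef("link_sub", index_id="sub"), 0.7),
    ]),
    "sub": Retriever([
        (Node("detail", "Detail text"), 0.6),
        (IndexRef("link_note", index_id="note"), 0.5),
    ]),
    "table_qe": QueryEngine("42 rows"),
    "note": Node("note", "A note"),
}

found = retrieve_rec("How many rows?", "root", registry)
print([node.id_ for node, _ in found])
# ['intro', 'answer:table_qe', 'detail', 'note']
```

The link nodes themselves never appear in the output. Each is replaced by whatever its target yields, which is how a flat ID registry produces a multi-level retrieval graph.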
Design Notes
- The three dictionaries (retriever_dict, query_engine_dict, node_dict) form a flat namespace. The recursive graph structure emerges from IndexNode references in retrieval results.
- Query engine responses are converted to TextNode objects so they can be uniformly handled alongside retrieved text nodes.
- The callback manager fires CBEventType.RETRIEVE events when entering a sub-retriever, enabling observability into the recursive chain.
- Verbose mode uses colored terminal output (pink for node entering, blue for query IDs, green for query engine responses).
See Also
- AutoMergingRetriever -- Hierarchical retrieval via parent-child merging
- RouterRetriever -- Selector-based retriever routing