Implementation:Run llama Llama index RecursiveRetriever

From Leeroopedia

Overview

The RecursiveRetriever implements a graph-based recursive retrieval strategy. Starting from a root retriever, it follows IndexNode links to other retrievers, query engines, or plain nodes, recursively expanding the retrieval graph. This allows building multi-level retrieval pipelines where results from one retriever can reference other retrievers or query engines.

Source File: llama-index-core/llama_index/core/retrievers/recursive_retriever.py (221 lines)

Module: llama_index.core.retrievers.recursive_retriever

Class Definition

class RecursiveRetriever(BaseRetriever):
    """
    Recursive retriever.

    This retriever will recursively explore links from nodes to other
    retrievers/query engines.

    For any retrieved nodes, if any of the nodes are IndexNodes,
    then it will explore the linked retriever/query engine, and query that.
    """

Type Alias

RQN_TYPE = Union[BaseRetriever, BaseQueryEngine, BaseNode]

This union type represents the three kinds of objects that can be referenced in the recursive graph: retrievers, query engines, and plain nodes.

Dependencies

Module | Import
--- | ---
llama_index.core.base.base_query_engine | BaseQueryEngine
llama_index.core.base.base_retriever | BaseRetriever
llama_index.core.callbacks.base | CallbackManager
llama_index.core.callbacks.schema | CBEventType, EventPayload
llama_index.core.schema | BaseNode, IndexNode, NodeWithScore, QueryBundle, TextNode
llama_index.core.utils | print_text

Constants

DEFAULT_QUERY_RESPONSE_TMPL = "Query: {query_str}\nResponse: {response}"

Template used to format query engine responses as text nodes.
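
As a quick illustration of how this template expands, the snippet below fills it with `str.format`. The query and response strings are made-up values chosen for the example; per the description above, the formatted text is what ends up as the content of a text node.

```python
# Template copied from the constant documented above.
DEFAULT_QUERY_RESPONSE_TMPL = "Query: {query_str}\nResponse: {response}"

# Hypothetical query and response, for illustration only.
formatted = DEFAULT_QUERY_RESPONSE_TMPL.format(
    query_str="What is the revenue?",
    response="Revenue was flat.",
)
print(formatted)
```

A custom template passed as query_response_tmpl would presumably need the same two placeholders, `{query_str}` and `{response}`.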

Constructor

def __init__(
    self,
    root_id: str,
    retriever_dict: Dict[str, BaseRetriever],
    query_engine_dict: Optional[Dict[str, BaseQueryEngine]] = None,
    node_dict: Optional[Dict[str, BaseNode]] = None,
    callback_manager: Optional[CallbackManager] = None,
    query_response_tmpl: Optional[str] = None,
    verbose: bool = False,
) -> None

Parameter | Type | Default | Description
--- | --- | --- | ---
root_id | str | required | The ID of the root retriever in retriever_dict that serves as the entry point
retriever_dict | Dict[str, BaseRetriever] | required | Mapping of string IDs to retriever instances
query_engine_dict | Optional[Dict[str, BaseQueryEngine]] | None | Mapping of string IDs to query engine instances
node_dict | Optional[Dict[str, BaseNode]] | None | Mapping of string IDs to plain node instances
callback_manager | Optional[CallbackManager] | None | Optional callback manager
query_response_tmpl | Optional[str] | None | Template for formatting query engine responses
verbose | bool | False | Whether to print verbose retrieval info

Validation:

  • The root_id must exist in retriever_dict (raises ValueError otherwise).
  • Keys in retriever_dict and query_engine_dict must not overlap (raises ValueError otherwise).
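
The two checks above can be sketched as a standalone function. This is a minimal illustration, not the library's code: `validate_ids` is a hypothetical helper, the dictionary values are placeholders, and the error messages are invented for the example.

```python
from typing import Dict, Optional


def validate_ids(
    root_id: str,
    retriever_dict: Dict[str, object],
    query_engine_dict: Optional[Dict[str, object]] = None,
) -> None:
    """Hypothetical helper mirroring the two documented constructor checks."""
    # Check 1: the entry point must be a known retriever.
    if root_id not in retriever_dict:
        raise ValueError(f"Root id {root_id} not in retriever_dict")
    # Check 2: retriever IDs and query engine IDs must be disjoint.
    overlap = set(retriever_dict) & set(query_engine_dict or {})
    if overlap:
        raise ValueError(f"Retriever and query engine ids overlap: {sorted(overlap)}")


# Passes: the root exists and the two key sets are disjoint.
validate_ids("root", {"root": object()}, {"qe": object()})
```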

Core Methods

_retrieve

def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]

Implements the BaseRetriever retrieval hook, which the public retrieve() method invokes. Calls _retrieve_rec with no initial query ID (so it defaults to the root retriever) and returns only the primary retrieved nodes, discarding the additional source nodes.

retrieve_all

def retrieve_all(
    self, query_bundle: QueryBundle
) -> Tuple[List[NodeWithScore], List[NodeWithScore]]

Extended retrieval that returns both the primary retrieved nodes and additional source nodes (e.g., source nodes from sub-query engine responses).

_retrieve_rec

def _retrieve_rec(
    self,
    query_bundle: QueryBundle,
    query_id: Optional[str] = None,
    cur_similarity: Optional[float] = None,
) -> Tuple[List[NodeWithScore], List[NodeWithScore]]

The core recursive method. Behavior depends on the type of object resolved by query_id:

Object Type | Behavior
--- | ---
BaseNode | Returns the node directly with the current similarity score
BaseRetriever | Calls retrieve(), then recursively processes results via _query_retrieved_nodes()
BaseQueryEngine | Calls query(), formats the response as a TextNode using the query response template, and collects source nodes

_query_retrieved_nodes

def _query_retrieved_nodes(
    self, query_bundle: QueryBundle, nodes_with_score: List[NodeWithScore]
) -> Tuple[List[NodeWithScore], List[NodeWithScore]]

Processes retrieved nodes. For each node:

  • If it is an IndexNode, follows the link by recursively calling _retrieve_rec with the node's index_id.
  • If it is a TextNode, keeps it as-is.

Deduplicates IndexNode references to the same index_id before recursing. After recursion, deduplicates all results by node ID.
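
The first deduplication pass, dropping repeated links to the same index_id before recursing, can be sketched as follows. `Ref` and `drop_repeated_links` are hypothetical stand-ins written for this example; they only model the "first link per index_id wins" rule and are not the library's classes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Ref:
    """Hypothetical stand-in for a retrieved node; index_id is set only for link nodes."""
    node_id: str
    index_id: Optional[str] = None


def drop_repeated_links(nodes: List[Ref]) -> List[Ref]:
    """Keep every plain node, but only the first link to each index_id."""
    seen = set()
    kept = []
    for node in nodes:
        if node.index_id is None:
            kept.append(node)  # plain node: always kept
        elif node.index_id not in seen:
            seen.add(node.index_id)  # first link to this target
            kept.append(node)
    return kept


nodes = [Ref("a", "table_1"), Ref("b"), Ref("c", "table_1"), Ref("d", "table_2")]
kept_ids = [n.node_id for n in drop_repeated_links(nodes)]
print(kept_ids)  # ['a', 'b', 'd']
```

Dropping the second link to table_1 means the linked retriever or query engine is queried once rather than once per reference.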

_get_object

def _get_object(self, query_id: str) -> RQN_TYPE

Resolves a query_id to its corresponding object by searching in order: node_dict, retriever_dict, query_engine_dict. Raises ValueError if not found.
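
A minimal sketch of this lookup order, written as a free function over three plain dictionaries. `get_object` and its error message are assumptions made for illustration; the real method reads the dictionaries stored on the instance.

```python
from typing import Any, Dict


def get_object(
    query_id: str,
    node_dict: Dict[str, Any],
    retriever_dict: Dict[str, Any],
    query_engine_dict: Dict[str, Any],
) -> Any:
    """Hypothetical free-function version of the documented lookup order."""
    # Search order matters: nodes first, then retrievers, then query engines.
    for mapping in (node_dict, retriever_dict, query_engine_dict):
        if query_id in mapping:
            return mapping[query_id]
    raise ValueError(f"Query id {query_id} not found in any dictionary")


# Placeholder strings stand in for real node/retriever/engine objects.
resolved = get_object("shared", {"shared": "node"}, {"shared": "retriever"}, {})
print(resolved)  # node
```

Under this order, an ID present in both node_dict and retriever_dict resolves to the node, so keeping IDs unique across all three dictionaries avoids surprises.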

_deduplicate_nodes

def _deduplicate_nodes(
    self, nodes_with_score: List[NodeWithScore]
) -> List[NodeWithScore]

Removes duplicate nodes by node.id_, keeping the first occurrence of each ID. The kept occurrence is the highest-scoring one only if the input list is already sorted by descending score.
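
The keep-first rule can be sketched with plain (node ID, score) tuples in place of NodeWithScore objects. The `deduplicate` function and the sample data are hypothetical and exist only for this example.

```python
from typing import List, Tuple


def deduplicate(scored: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Keep the first (node_id, score) pair seen for each node_id, preserving order."""
    seen = set()
    unique = []
    for node_id, score in scored:
        if node_id not in seen:
            seen.add(node_id)
            unique.append((node_id, score))
    return unique


results = [("n1", 0.9), ("n2", 0.8), ("n1", 0.4)]
unique_results = deduplicate(results)
print(unique_results)  # [('n1', 0.9), ('n2', 0.8)]
```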

Recursion Flow

_retrieve(query)
  -> _retrieve_rec(query, root_id)
       -> retriever.retrieve(query)
            -> for each result node:
                 if IndexNode:
                   -> _retrieve_rec(query, node.index_id)
                        -> (may call another retriever, query engine, or return a node)
                 if TextNode:
                   -> return node directly
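
The flow above can be mimicked with a toy model in pure Python. Everything in this sketch is a hypothetical stand-in: `Text` and `Link` play the roles of TextNode and IndexNode, retrievers and query engines are plain callables, and scores, additional source nodes, deduplication, and callbacks are left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class Text:
    """Hypothetical stand-in for a TextNode."""
    text: str


@dataclass
class Link:
    """Hypothetical stand-in for an IndexNode pointing at another ID."""
    index_id: str


# A toy retriever maps a query string to nodes; a toy query engine maps it to a response.
Retriever = Callable[[str], List[Union[Text, Link]]]
QueryEngine = Callable[[str], str]

TMPL = "Query: {query_str}\nResponse: {response}"


def retrieve_rec(
    query: str,
    query_id: str,
    retrievers: Dict[str, Retriever],
    engines: Dict[str, QueryEngine],
    nodes: Dict[str, Text],
) -> List[Text]:
    # Resolve the ID in the documented order: nodes, retrievers, query engines.
    if query_id in nodes:
        return [nodes[query_id]]
    if query_id in retrievers:
        out: List[Text] = []
        for node in retrievers[query_id](query):
            if isinstance(node, Link):
                # Follow the link by recursing on the referenced ID.
                out.extend(retrieve_rec(query, node.index_id, retrievers, engines, nodes))
            else:
                out.append(node)  # plain text node: returned directly
        return out
    if query_id in engines:
        # Wrap the engine's response in a text node using the template.
        response = engines[query_id](query)
        return [Text(TMPL.format(query_str=query, response=response))]
    raise ValueError(f"Unknown query id: {query_id}")


retrievers = {
    "root": lambda q: [Text("intro"), Link("sub"), Link("qe")],
    "sub": lambda q: [Text("detail")],
}
engines = {"qe": lambda q: "42"}
found = retrieve_rec("answer?", "root", retrievers, engines, {})
print([t.text for t in found])
```

Note that this sketch has no guard against cyclic links: two toy retrievers that point at each other would recurse until Python's recursion limit is hit.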

Design Notes

  • The three dictionaries (retriever_dict, query_engine_dict, node_dict) form a flat namespace. The recursive graph structure emerges from IndexNode references in retrieval results.
  • Query engine responses are converted to TextNode objects so they can be uniformly handled alongside retrieved text nodes.
  • The callback manager fires CBEventType.RETRIEVE events when entering a sub-retriever, enabling observability into the recursive chain.
  • Verbose mode prints colored terminal output: pink when entering a linked node, blue for the query ID being retrieved, and green for query engine responses.
