Implementation:Run llama Llama index Node Postprocessors

From Leeroopedia
Knowledge Sources
Domains: Retrieval, Postprocessing, Reranking
Last Updated: 2026-02-11 19:00 GMT

Overview

Provides core node postprocessor classes that filter, reorder, and augment retrieved nodes based on similarity scores, keywords, document relationships, and context positioning strategies.

Description

The node.py module defines several BaseNodePostprocessor subclasses used to refine retrieval results after an initial retrieval step:

  • KeywordNodePostprocessor filters nodes based on required and excluded keywords. It uses spaCy's PhraseMatcher to match keywords against node content. A node is removed if it matches none of the required keywords or if it contains any excluded keyword. The lang parameter selects the spaCy language used for tokenization.
  • SimilarityPostprocessor filters nodes by a similarity_cutoff threshold. Nodes with a score below the cutoff (or with no score at all) are excluded from the results.
  • PrevNextNodePostprocessor expands the retrieved node set by traversing document relationships (NEXT and PREVIOUS) in the document store. It supports three modes: "next" (forward traversal only), "previous" (backward traversal only), and "both" (both directions). After expansion, nodes are sorted by their relationship ordering.
  • AutoPrevNextNodePostprocessor automatically infers whether to fetch previous or next context by using an LLM to predict the appropriate traversal direction. It employs a response synthesizer with configurable prompt templates (infer_prev_next_tmpl and refine_prev_next_tmpl) to determine whether the answer lies in prior context, future context, or neither.
  • LongContextReorder reorders nodes based on the research finding (from Liu et al., 2023) that LLMs perform best when important information is at the beginning or end of the context. It sorts nodes by ascending score, then inserts even-indexed nodes at the front of the result and appends odd-indexed nodes at the back, so the highest-scoring nodes end up at both ends and the lowest-scoring nodes in the middle.
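
The reordering described for LongContextReorder can be illustrated with a small, library-free sketch. ScoredNode and reorder_for_long_context are hypothetical stand-ins rather than the library's classes, and treating a missing score as 0 is an assumption made for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScoredNode:
    """Hypothetical stand-in for NodeWithScore: a text plus an optional score."""
    text: str
    score: Optional[float] = None


def reorder_for_long_context(nodes: List[ScoredNode]) -> List[ScoredNode]:
    """Place high-scoring nodes at both ends and low-scoring nodes in the middle."""
    # Sort by ascending score; treating a missing score as 0 is an assumption.
    ordered = sorted(nodes, key=lambda n: n.score if n.score is not None else 0.0)
    reordered: List[ScoredNode] = []
    for i, node in enumerate(ordered):
        if i % 2 == 0:
            reordered.insert(0, node)  # even index: push to the front
        else:
            reordered.append(node)  # odd index: push to the back
    return reordered


nodes = [ScoredNode(f"n{s}", float(s)) for s in (3, 1, 5, 2, 4)]
result = [n.score for n in reorder_for_long_context(nodes)]
print(result)  # [5.0, 3.0, 1.0, 2.0, 4.0]
```

With five nodes scored 1 through 5, the two best nodes (5 and 4) land at the two ends and the weakest node (1) lands in the middle.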

The module also provides two helper functions, get_forward_nodes and get_backward_nodes, which perform iterative traversal of node relationships in a BaseDocumentStore.
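
The iterative traversal performed by these helpers can be sketched without the library. In the sketch below, LinkedNode, walk, and the dict used as a document store are hypothetical stand-ins for the real node, helper, and BaseDocumentStore types; including the starting node in the result is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LinkedNode:
    """Hypothetical stand-in for a stored node with PREVIOUS/NEXT links."""
    node_id: str
    prev_id: Optional[str] = None
    next_id: Optional[str] = None


def walk(start: LinkedNode, num_nodes: int, store: Dict[str, LinkedNode],
         forward: bool = True) -> List[str]:
    """Collect the start node plus up to num_nodes neighbors in one direction."""
    collected = [start.node_id]
    node = start
    for _ in range(num_nodes):
        neighbor_id = node.next_id if forward else node.prev_id
        if neighbor_id is None:  # reached the end of the chain
            break
        node = store[neighbor_id]
        collected.append(node.node_id)
    return collected


# A four-node chain: a <-> b <-> c <-> d
store = {
    "a": LinkedNode("a", None, "b"),
    "b": LinkedNode("b", "a", "c"),
    "c": LinkedNode("c", "b", "d"),
    "d": LinkedNode("d", "c", None),
}
forward_ids = walk(store["b"], 2, store, forward=True)
backward_ids = walk(store["b"], 2, store, forward=False)
print(forward_ids, backward_ids)
```

Starting from "b" with num_nodes=2, the forward walk reaches "c" and "d", while the backward walk stops early at "a" because that node has no previous link.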

Usage

Use these postprocessors in a retrieval pipeline to improve the quality and relevance of retrieved context before it is passed to a response synthesis step. SimilarityPostprocessor covers basic score filtering. KeywordNodePostprocessor is useful for enforcing domain-specific term requirements. PrevNextNodePostprocessor and AutoPrevNextNodePostprocessor are valuable when documents have sequential structure, because neighboring nodes often carry the missing context. LongContextReorder helps when many nodes are passed to the LLM at once, by keeping the most relevant nodes away from the middle of the context.
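
Because every postprocessor takes a list of nodes and returns a list of nodes, several can be chained. The following is a minimal, library-free sketch of that idea, assuming the steps are applied in list order; run_postprocessors, drop_low_scores, and require_term are hypothetical names, and (text, score) tuples stand in for NodeWithScore.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for NodeWithScore: (text, score) pairs.
Scored = Tuple[str, float]
Postprocessor = Callable[[List[Scored]], List[Scored]]


def run_postprocessors(nodes: List[Scored], steps: List[Postprocessor]) -> List[Scored]:
    """Apply each step to the output of the previous one."""
    for step in steps:
        nodes = step(nodes)
    return nodes


def drop_low_scores(nodes: List[Scored]) -> List[Scored]:
    """Keep nodes scoring at least 0.5 (illustrative threshold)."""
    return [n for n in nodes if n[1] >= 0.5]


def require_term(nodes: List[Scored]) -> List[Scored]:
    """Keep nodes whose text mentions the word 'index' (illustrative term)."""
    return [n for n in nodes if "index" in n[0]]


retrieved = [("vector index basics", 0.9), ("unrelated note", 0.8), ("index tuning", 0.3)]
kept = run_postprocessors(retrieved, [drop_low_scores, require_term])
print(kept)  # [('vector index basics', 0.9)]
```

Order matters in such a chain: filters placed first reduce the work done by later steps, and a reordering step is usually most useful last, after the final set of nodes is known.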

Code Reference

Source Location

  • Repository: Run_llama_Llama_index
  • File: llama-index-core/llama_index/core/postprocessor/node.py
  • Lines: 1-396

Signature

class KeywordNodePostprocessor(BaseNodePostprocessor):
    required_keywords: List[str] = Field(default_factory=list)
    exclude_keywords: List[str] = Field(default_factory=list)
    lang: str = Field(default="en")
    def _postprocess_nodes(
        self, nodes: List[NodeWithScore], query_bundle: Optional[QueryBundle] = None,
    ) -> List[NodeWithScore]: ...

class SimilarityPostprocessor(BaseNodePostprocessor):
    similarity_cutoff: float = Field(default=0.0)
    def _postprocess_nodes(
        self, nodes: List[NodeWithScore], query_bundle: Optional[QueryBundle] = None,
    ) -> List[NodeWithScore]: ...

class PrevNextNodePostprocessor(BaseNodePostprocessor):
    docstore: BaseDocumentStore
    num_nodes: int = Field(default=1)
    mode: str = Field(default="next")
    def _postprocess_nodes(
        self, nodes: List[NodeWithScore], query_bundle: Optional[QueryBundle] = None,
    ) -> List[NodeWithScore]: ...

class AutoPrevNextNodePostprocessor(BaseNodePostprocessor):
    docstore: BaseDocumentStore
    llm: Optional[SerializeAsAny[LLM]] = None
    num_nodes: int = Field(default=1)
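    infer_prev_next_tmpl: str  # optional; a default prompt template is used if omitted
    refine_prev_next_tmpl: str  # optional; a default prompt template is used if omitted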
    verbose: bool = Field(default=False)
    response_mode: ResponseMode = Field(default=ResponseMode.COMPACT)
    def _postprocess_nodes(
        self, nodes: List[NodeWithScore], query_bundle: Optional[QueryBundle] = None,
    ) -> List[NodeWithScore]: ...

class LongContextReorder(BaseNodePostprocessor):
    def _postprocess_nodes(
        self, nodes: List[NodeWithScore], query_bundle: Optional[QueryBundle] = None,
    ) -> List[NodeWithScore]: ...

Import

from llama_index.core.postprocessor.node import SimilarityPostprocessor
from llama_index.core.postprocessor.node import KeywordNodePostprocessor
from llama_index.core.postprocessor.node import PrevNextNodePostprocessor
from llama_index.core.postprocessor.node import AutoPrevNextNodePostprocessor
from llama_index.core.postprocessor.node import LongContextReorder

I/O Contract

Inputs (SimilarityPostprocessor)

Name | Type | Required | Description
similarity_cutoff | float | No | Minimum similarity score threshold (default: 0.0); nodes scoring below this are filtered out
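
The cutoff behavior can be sketched in a few lines of plain Python. This is an illustration of the filtering rule described on this page, not the library's code: filter_by_similarity is a hypothetical name, and (text, score) tuples stand in for NodeWithScore.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for NodeWithScore: (text, optional score) pairs.
Scored = Tuple[str, Optional[float]]


def filter_by_similarity(nodes: List[Scored], similarity_cutoff: float = 0.0) -> List[Scored]:
    """Keep nodes whose score is present and not below the cutoff."""
    kept = []
    for text, score in nodes:
        if score is None:  # unscored nodes are dropped
            continue
        if score < similarity_cutoff:  # strictly below the cutoff: dropped
            continue
        kept.append((text, score))
    return kept


nodes = [("a", 0.82), ("b", 0.7), ("c", 0.55), ("d", None)]
kept = filter_by_similarity(nodes, similarity_cutoff=0.7)
print(kept)  # [('a', 0.82), ('b', 0.7)]
```

A node scoring exactly at the cutoff is kept in this sketch, since only scores strictly below the threshold are removed.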

Inputs (KeywordNodePostprocessor)

Name | Type | Required | Description
required_keywords | List[str] | No | Keywords that must appear in the node content for it to be kept
exclude_keywords | List[str] | No | Keywords that cause a node to be excluded if present
lang | str | No | spaCy language code for tokenization (default: "en")
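
The keep/drop decision can be sketched as follows. This is a simplification under stated assumptions: a regex tokenizer replaces spaCy's tokenizer and PhraseMatcher, matching is exact and case-sensitive, and a node passes the required check when at least one required keyword matches. The names tokenize, contains_phrase, and keep_node are hypothetical.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Crude regex tokenizer; a stand-in for spaCy tokenization."""
    return re.findall(r"\w+", text)


def contains_phrase(tokens: List[str], phrase: str) -> bool:
    """True if the phrase's tokens occur as a contiguous run in tokens."""
    target = tokenize(phrase)
    if not target:
        return False
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))


def keep_node(text: str, required: List[str], excluded: List[str]) -> bool:
    """Decide whether a node's text survives the keyword filter."""
    tokens = tokenize(text)
    if required and not any(contains_phrase(tokens, kw) for kw in required):
        return False  # none of the required keywords matched
    if excluded and any(contains_phrase(tokens, kw) for kw in excluded):
        return False  # an excluded keyword matched
    return True


texts = [
    "An intro to machine learning pipelines",
    "A deprecated machine learning API",
    "Notes on learning a machine shop trade",
]
kept = [t for t in texts if keep_node(t, ["machine learning"], ["deprecated"])]
print(kept)  # ['An intro to machine learning pipelines']
```

Matching on token sequences rather than raw substrings means the third text is rejected: it contains both words, but not as the contiguous phrase "machine learning".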

Inputs (PrevNextNodePostprocessor)

Name | Type | Required | Description
docstore | BaseDocumentStore | Yes | Document store for retrieving related nodes
num_nodes | int | No | Number of adjacent nodes to fetch in each direction (default: 1)
mode | str | No | Direction of traversal: "next", "previous", or "both" (default: "next")

Inputs (AutoPrevNextNodePostprocessor)

Name | Type | Required | Description
docstore | BaseDocumentStore | Yes | Document store for retrieving related nodes
llm | Optional[LLM] | No | LLM to use for direction inference; defaults to Settings.llm
num_nodes | int | No | Number of adjacent nodes to fetch (default: 1)
infer_prev_next_tmpl | str | No | Prompt template for inferring traversal direction
refine_prev_next_tmpl | str | No | Prompt template for refining the inference
verbose | bool | No | Whether to print debug information (default: False)
response_mode | ResponseMode | No | Response synthesis mode (default: COMPACT)

Outputs

Name | Type | Description
_postprocess_nodes() | List[NodeWithScore] | Filtered, expanded, or reordered list of nodes with scores

Usage Examples

Basic Usage with SimilarityPostprocessor

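from llama_index.core import Document, VectorStoreIndex
from llama_index.core.postprocessor.node import SimilarityPostprocessor

# Placeholder corpus; replace with your own documents.
documents = [Document(text="LlamaIndex is a framework for building LLM applications.")]

# Drop retrieved nodes whose similarity score is below 0.7.
postprocessor = SimilarityPostprocessor(similarity_cutoff=0.7)

# Use in a query engine pipeline (requires a configured embedding model and LLM).
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(
    node_postprocessors=[postprocessor]
)
response = query_engine.query("What is the topic?")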

KeywordNodePostprocessor

from llama_index.core.postprocessor.node import KeywordNodePostprocessor

postprocessor = KeywordNodePostprocessor(
    required_keywords=["machine learning"],
    exclude_keywords=["deprecated"],
)

LongContextReorder

from llama_index.core.postprocessor.node import LongContextReorder

reorder = LongContextReorder()
# Reorders nodes so highest-relevance items are at the beginning and end
