Implementation:Run llama Llama index DocumentContextExtractor

Knowledge Sources	Run_llama_Llama_index
Domains	Metadata Extraction, RAG, Contextual Retrieval
Last Updated	2026-02-11 19:00 GMT

Overview

An LLM-based context extractor that generates contextual metadata for document chunks by analyzing each chunk within the context of its parent document, implementing the Anthropic "Contextual Retrieval" approach to enhance RAG accuracy.

Description

DocumentContextExtractor extends BaseExtractor to enrich document nodes with contextual metadata. For each node, it retrieves the parent document from a docstore, sends both the full document and the individual chunk to an LLM, and stores the generated context as node metadata.

The extractor implements several key design decisions:

Prompt strategies: Two built-in prompt constants are provided as class variables:
- ORIGINAL_CONTEXT_PROMPT -- Generates a short succinct context to situate a chunk within the overall document (from the Anthropic cookbook)
- SUCCINCT_CONTEXT_PROMPT -- Generates keyword-laden descriptions for better search matching
Rate limit handling: Exponential backoff retry logic with 5 retries starting at 60-second base delay, with jitter
Oversized document strategies: Configurable handling via OversizeStrategy literal type ("warn", "error", or "ignore")
Token counting: Uses @lru_cache(maxsize=1000) for cached token counting to avoid redundant computation on repeated documents
Prompt caching: Sends the document text with cache_control: ephemeral headers to leverage Anthropic API prompt caching
Sorting optimization: Sorts nodes by source document ID before processing to maximize prompt cache hits, which can save significant API costs
Skip logic: Nodes that already have the metadata key set are skipped to avoid reprocessing

Usage

Use this extractor when you want to improve retrieval accuracy for RAG pipelines where individual chunks may lack standalone semantic meaning. It is particularly effective for documents where chunk boundaries split related content across nodes. Requires an LLM with async chat support (the achat method) and a configured docstore containing the parent documents.

Code Reference

Source Location

Repository: Run_llama_Llama_index
File: llama-index-core/llama_index/core/extractors/document_context.py
Lines: 1-351

Signature

class DocumentContextExtractor(BaseExtractor):
    # Pydantic fields
    llm: LLM
    docstore: BaseDocumentStore
    key: str
    prompt: str
    doc_ids: Set[str]
    max_context_length: int
    max_output_tokens: int
    oversized_document_strategy: OversizeStrategy
    num_workers: int = DEFAULT_NUM_WORKERS

    def __init__(
        self,
        docstore: BaseDocumentStore,
        llm: Optional[LLM] = None,
        max_context_length: int = 1000,
        key: str = DEFAULT_KEY,
        prompt: str = ORIGINAL_CONTEXT_PROMPT,
        num_workers: int = DEFAULT_NUM_WORKERS,
        max_output_tokens: int = 512,
        oversized_document_strategy: OversizeStrategy = "warn",
        **kwargs: Any,
    ) -> None

Import

from llama_index.core.extractors.document_context import DocumentContextExtractor

I/O Contract

Inputs

Name	Type	Required	Description
docstore	BaseDocumentStore	Yes	Storage for parent documents; used to retrieve full document text for each node
llm	Optional[LLM]	No	Language model instance with `achat` method; defaults to `Settings.llm`
max_context_length	int	No	Maximum allowed document context length in tokens (default: 1000)
key	str	No	Metadata key for storing extracted context (default: `"context"`)
prompt	str	No	Prompt template for context generation (default: ORIGINAL_CONTEXT_PROMPT)
num_workers	int	No	Number of parallel workers for async processing (default: DEFAULT_NUM_WORKERS)
max_output_tokens	int	No	Maximum tokens in generated context (default: 512)
oversized_document_strategy	OversizeStrategy	No	Strategy for handling documents exceeding max_context_length: `"warn"`, `"error"`, or `"ignore"` (default: `"warn"`)

Outputs

Name	Type	Description
metadata_list	List[Dict]	List of metadata dictionaries, one per input node, containing the generated context under the configured key

Key Methods

aextract

async def aextract(self, nodes: Sequence[BaseNode]) -> List[Dict]

Main entry point. Sorts nodes by source document ID for prompt cache optimization, retrieves parent documents, and dispatches parallel context generation jobs via run_jobs.

_agenerate_node_context

async def _agenerate_node_context(
    self,
    node: Union[Node, TextNode],
    metadata: Dict,
    document: Union[Node, TextNode],
    prompt: str,
    key: str,
) -> Dict

Generates context for a single node by sending the parent document and chunk to the LLM. Implements exponential backoff retry (5 retries, 60s base delay) for rate limit handling. Uses Anthropic prompt caching headers.

_get_document

async def _get_document(self, doc_id: str) -> Optional[Union[Node, TextNode]]

Retrieves a document from the docstore by ID. Validates that the document is a text node and applies the oversized document strategy if the token count exceeds max_context_length.

_count_tokens

@staticmethod
@lru_cache(maxsize=1000)
def _count_tokens(text: str) -> int

Cached token counting using Settings.tokenizer. The LRU cache avoids redundant tokenization on repeated documents.

Helper Functions

is_text_node

def is_text_node(node: BaseNode) -> TypeGuard[Union[Node, TextNode]]

Module-level type guard function that checks whether a node is an instance of Node or TextNode. Used throughout the extractor for type narrowing.

Constants

Name	Description
`ORIGINAL_CONTEXT_PROMPT`	Prompts the LLM to generate a short succinct context situating a chunk within its document, designed for improving search retrieval
`SUCCINCT_CONTEXT_PROMPT`	Generates keyword-laden phrases describing main topics, entities, and actions; replaces pronouns with specific referents for better search matching
`OversizeStrategy`	Literal type alias for `"warn"`, `"error"`, or `"ignore"`

Usage Examples

Basic Usage

from llama_index.core.extractors.document_context import DocumentContextExtractor
from llama_index.core.storage.docstore import SimpleDocumentStore

docstore = SimpleDocumentStore()
# Add documents to docstore...

extractor = DocumentContextExtractor(
    docstore=docstore,
    llm=my_llm,
    max_context_length=64000,
    max_output_tokens=256,
)

# Extract context for nodes asynchronously
metadata_list = await extractor.aextract(nodes)

Using Succinct Prompt

extractor = DocumentContextExtractor(
    docstore=docstore,
    llm=my_llm,
    prompt=DocumentContextExtractor.SUCCINCT_CONTEXT_PROMPT,
    max_context_length=128000,
    oversized_document_strategy="ignore",
)

Related Pages

Environment:Run_llama_Llama_index_Python_LlamaIndex_Core

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment