
Implementation:Deepset ai Haystack ExtractiveReader

From Leeroopedia

Overview

ExtractiveReader is a Haystack pipeline component that performs extractive question answering by locating and extracting answer spans from Documents. It scores every candidate answer span independently of all other spans, so scores remain comparable across documents, unlike implementations that normalize scores per document.

Source Location

  • File: haystack/components/readers/extractive.py, lines 25-540+
  • Class: ExtractiveReader
  • Decorator: @component

Import

from haystack.components.readers import ExtractiveReader

Dependencies

  • transformers (Hugging Face Transformers library)
  • torch (PyTorch)
  • accelerate (Hugging Face Accelerate for device management)
  • tokenizers (Hugging Face Tokenizers for encoding handling)
  • sentencepiece (tokenizer backend for some models)

Constructor

ExtractiveReader(
    model: str = "deepset/roberta-base-squad2-distilled",
    device: ComponentDevice | None = None,
    token: Secret | None = Secret.from_env_var(["HF_API_TOKEN", "HF_TOKEN"], strict=False),
    top_k: int = 20,
    score_threshold: float | None = None,
    max_seq_length: int = 384,
    stride: int = 128,
    max_batch_size: int | None = None,
    answers_per_seq: int | None = None,
    no_answer: bool = True,
    calibration_factor: float = 0.1,
    overlap_threshold: float | None = 0.01,
    model_kwargs: dict[str, Any] | None = None,
)

Constructor Parameters

  • model (str, default "deepset/roberta-base-squad2-distilled"): Hugging Face QA model identifier or local path.
  • device (ComponentDevice | None, default None): Device on which the model is loaded; auto-selected if None.
  • token (Secret | None, default Secret.from_env_var(["HF_API_TOKEN", "HF_TOKEN"], strict=False)): API token for downloading private models from Hugging Face.
  • top_k (int, default 20): Number of answers to return per query.
  • score_threshold (float | None, default None): Minimum probability score for returned answers.
  • max_seq_length (int, default 384): Maximum number of tokens per sequence; longer documents are split into overlapping windows.
  • stride (int, default 128): Number of overlapping tokens between adjacent windows when splitting sequences.
  • max_batch_size (int | None, default None): Maximum number of samples fed through the model at once.
  • answers_per_seq (int | None, default None): Number of answer candidates per sequence (relevant when documents are split).
  • no_answer (bool, default True): Whether to include a "no answer" entry with its confidence score.
  • calibration_factor (float, default 0.1): Factor applied when calibrating probability scores via sigmoid.
  • overlap_threshold (float | None, default 0.01): Maximum overlap fraction allowed before deduplication removes the lower-scoring answer; None keeps all overlapping answers.
  • model_kwargs (dict[str, Any] | None, default None): Additional kwargs passed to AutoModelForQuestionAnswering.from_pretrained.
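The interplay of max_seq_length and stride can be illustrated with a small sketch (plain Python, independent of Haystack; this shows only the window arithmetic, not the actual tokenizer logic):

```python
def sliding_windows(tokens, max_seq_length, stride):
    """Split a token sequence into overlapping windows.

    Each window holds at most `max_seq_length` tokens, and consecutive
    windows share `stride` tokens, so an answer span that straddles a
    window boundary is still fully contained in some window.
    """
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_seq_length])
        if start + max_seq_length >= len(tokens):
            break
        start += max_seq_length - stride
    return windows

# A 10-token "document" split into windows of 6 tokens with 2 overlapping.
tokens = list(range(10))
print(sliding_windows(tokens, max_seq_length=6, stride=2))
# → [[0, 1, 2, 3, 4, 5], [4, 5, 6, 7, 8, 9]]
```

With the defaults (max_seq_length=384, stride=128), each new window advances by 256 tokens, so any span up to 128 tokens long is guaranteed to appear whole in at least one window.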

Run Method

@component.output_types(answers=list[ExtractedAnswer])
def run(
    self,
    query: str,
    documents: list[Document],
    top_k: int | None = None,
    score_threshold: float | None = None,
    max_seq_length: int | None = None,
    stride: int | None = None,
    max_batch_size: int | None = None,
    answers_per_seq: int | None = None,
    no_answer: bool | None = None,
    overlap_threshold: float | None = None,
) -> dict[str, list[ExtractedAnswer]]:

Run Parameters

  • query (str): The question to answer.
  • documents (list[Document]): List of Documents to search for answers.
  • top_k (int | None): Override the instance-level top_k for this call.
  • score_threshold (float | None): Override the instance-level score_threshold.
  • max_seq_length (int | None): Override the instance-level max_seq_length.
  • stride (int | None): Override the instance-level stride.
  • max_batch_size (int | None): Override the instance-level max_batch_size.
  • answers_per_seq (int | None): Override the instance-level answers_per_seq.
  • no_answer (bool | None): Override the instance-level no_answer setting.
  • overlap_threshold (float | None): Override the instance-level overlap_threshold.
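All of these per-call parameters share the same fallback rule: a value passed to run() wins, and None falls back to the value given to the constructor. A minimal sketch of that resolution pattern (hypothetical helper, not Haystack code):

```python
def resolve(call_value, instance_value):
    """Per-call override: use the run() argument unless it is None."""
    return instance_value if call_value is None else call_value

# The instance was built with top_k=20; run() is called with top_k=5.
print(resolve(5, 20))     # the run() argument takes precedence → 5
print(resolve(None, 20))  # falls back to the constructor value → 20
```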

Output

Returns a dictionary with a single key:

  • "answers": list[ExtractedAnswer] -- Ranked list of extracted answers with scores, document references, and character offsets.

Key Methods

warm_up()

Loads the model and tokenizer from Hugging Face. Must be called before run(). Uses AutoModelForQuestionAnswering.from_pretrained() and AutoTokenizer.from_pretrained().

to_dict() / from_dict()

Serialization and deserialization for pipeline YAML/JSON export. Handles HF model kwargs and token serialization.

deduplicate_by_overlap(answers, overlap_threshold)

Removes overlapping answer spans from the same document. Calculates character-level overlap between answer pairs and removes lower-scoring duplicates that exceed the threshold.
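The overlap test can be sketched in plain Python (character-interval arithmetic only; a simplification of the actual method, which also restricts the comparison to answers from the same document):

```python
def overlap_fraction(span_a, span_b):
    """Fraction of the shorter (start, end) interval covered by the
    intersection of the two intervals."""
    start = max(span_a[0], span_b[0])
    end = min(span_a[1], span_b[1])
    intersection = max(0, end - start)
    shorter = min(span_a[1] - span_a[0], span_b[1] - span_b[0])
    return intersection / shorter if shorter else 0.0

def deduplicate(answers, overlap_threshold):
    """Keep higher-scoring answers first; drop any answer whose overlap
    with an already kept answer exceeds the threshold.

    `answers` is a list of (score, (start, end)) tuples.
    """
    kept = []
    for score, span in sorted(answers, key=lambda a: a[0], reverse=True):
        if all(overlap_fraction(span, k[1]) <= overlap_threshold for k in kept):
            kept.append((score, span))
    return kept

answers = [(0.9, (10, 20)), (0.6, (12, 22)), (0.5, (40, 50))]
# (12, 22) overlaps (10, 20) by 80% of its length, so it is removed.
print(deduplicate(answers, overlap_threshold=0.01))
# → [(0.9, (10, 20)), (0.5, (40, 50))]
```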

Internal Processing Pipeline

  1. _flatten_documents: Flattens query-document pairs into a single batch axis.
  2. _preprocess: Tokenizes queries and documents with sliding window support. Maps tokens back to query and document IDs.
  3. Model inference: Runs forward pass through AutoModelForQuestionAnswering to get start and end logits.
  4. _postprocess: Converts logits to probabilities using sigmoid with calibration factor. Extracts top-k answer candidates per sequence. Maps token positions back to character offsets.
  5. _nest_answers: Reconstructs nested answer structure. Applies deduplication, top-k filtering, score thresholding, and computes no-answer scores.
  6. _add_answer_page_number: Calculates page number for each answer based on form-feed characters in document content.
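Step 4's score calibration can be sketched numerically. The exact formula below is an assumption based on the parameter description (each logit squashed with a sigmoid after scaling by calibration_factor, span score as the product), not the verbatim Haystack implementation:

```python
import math

def calibrated_probability(start_logit, end_logit, calibration_factor=0.1):
    """Turn raw start/end logits into a probability-like span score.

    Each logit is squashed independently with a sigmoid after scaling
    by the calibration factor, and the two results are multiplied.
    """
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    return sigmoid(calibration_factor * start_logit) * sigmoid(calibration_factor * end_logit)

# Because each span is scored independently (no per-document softmax),
# scores from different documents live on the same [0, 1] scale.
print(calibrated_probability(8.0, 6.0))
```

A small calibration_factor flattens the sigmoid, which keeps extreme logits from saturating the score at exactly 0 or 1 and makes the score_threshold parameter meaningful.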

Usage Example

from haystack import Document
from haystack.components.readers import ExtractiveReader

docs = [
    Document(content="Python is a popular programming language"),
    Document(content="python ist eine beliebte Programmiersprache"),
]

reader = ExtractiveReader()
reader.warm_up()

question = "What is a popular programming language?"
result = reader.run(query=question, documents=docs)

for answer in result["answers"]:
    if answer.data is not None:
        print(f"Answer: {answer.data} (score: {answer.score:.4f})")
    else:
        print(f"No answer (score: {answer.score:.4f})")

Related Pages

Principle:Deepset_ai_Haystack_Extractive_Question_Answering
