Workflow:Deepset ai Haystack Extractive QA Pipeline

Knowledge Sources	Haystack Haystack Docs
Domains	NLP, Question_Answering, Information_Extraction
Last Updated	2026-02-11 20:00 GMT

Overview

End-to-end process for extracting precise answer spans from documents using a retriever and a Transformer-based extractive reader model.

Description

This workflow implements an extractive question answering pipeline that finds exact answer spans within documents rather than generating new text. It uses a BM25 retriever to find candidate documents and an ExtractiveReader (e.g., deepset/tinyroberta-squad2) to identify and score answer spans within those documents. Each answer includes the extracted text, a confidence score, document offsets, and a reference to the source document. The pipeline also returns a special "no answer" candidate with its own score, enabling the system to abstain when no confident answer exists.

Usage

Execute this workflow when you need precise, traceable answers extracted directly from source text rather than LLM-generated paraphrases. This is ideal for compliance-sensitive applications, legal document analysis, technical documentation search, or any use case where answer provenance and exact source attribution are critical.

Execution Steps

Step 1: Initialize Document Store

Create an InMemoryDocumentStore and populate it with the documents that will serve as the answer source. Documents are written directly to the store (no embedding required for BM25 retrieval).

Key considerations:

BM25 retrieval operates on raw text content
No embedding generation needed for this pipeline variant

Step 2: Configure BM25 Retriever

Instantiate an InMemoryBM25Retriever connected to the document store. The retriever fetches the most relevant documents based on keyword matching against the query.

Key considerations:

top_k controls how many candidate documents are passed to the reader
Higher top_k increases recall but slows reader inference

Step 3: Configure Extractive Reader

Instantiate an ExtractiveReader with a pre-trained question answering model (e.g., deepset/tinyroberta-squad2). The reader examines each retrieved document and extracts candidate answer spans with confidence scores.

Key considerations:

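Model warm-up (warm_up()) is required before first inference when the reader is run standalone; Pipeline.run() warms up its components automatically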
Returns multiple candidate answers ranked by confidence score
With no_answer enabled (the default), includes a "no answer" candidate as the last result
Supports model_kwargs for HuggingFace model configuration

Step 4: Connect and Run Pipeline

Connect the retriever output to the reader input and execute the pipeline. The query must be provided to both the retriever (for document selection) and the reader (for span extraction).

Pseudocode:

Connect retriever.documents -> reader.documents
Run with the same query string passed to both the retriever and the reader

Step 5: Process Extracted Answers

Interpret the pipeline output which contains a list of ExtractedAnswer objects. Each answer includes the extracted text span, confidence score, character offsets within the source document, and a reference to the source document for traceability.

Key considerations:

Answers are ranked by confidence score (highest first)
The last answer is always the "no answer" candidate (data is None)
Compare the top answer's score against the "no answer" candidate's score to decide whether to abstain
Each answer includes document_offset (start and end character positions) for exact position tracking
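
The thresholding logic described above can be sketched in plain Python. The function select_answer and its margin parameter are hypothetical names introduced for illustration. The stand-in objects only mimic the data, score, document, and document_offset fields of ExtractedAnswer, and their scores are made-up values, not model output. The sketch also assumes offsets are half-open character indices into the document content. It locates the "no answer" candidate by data being None rather than by list position, so it does not depend on ordering.

```python
from types import SimpleNamespace


def select_answer(answers, margin=0.0):
    """Return the best extracted span, or None to abstain.

    Abstains when there is no span at all, or when the best span does not
    beat the "no answer" candidate's score by at least `margin`.
    """
    spans = [a for a in answers if a.data is not None]
    no_answer = next((a for a in answers if a.data is None), None)
    if not spans:
        return None
    best = max(spans, key=lambda a: a.score)
    if no_answer is not None and best.score < no_answer.score + margin:
        return None
    return best


# Stand-ins that mimic ExtractedAnswer fields (illustrative values only).
doc = SimpleNamespace(id="doc-1",
                      content="The Seine is the river that flows through Paris.")
answers = [
    SimpleNamespace(data="The Seine", score=0.91, document=doc,
                    document_offset=SimpleNamespace(start=0, end=9)),
    SimpleNamespace(data="Paris", score=0.12, document=doc,
                    document_offset=SimpleNamespace(start=42, end=47)),
    SimpleNamespace(data=None, score=0.05, document=None, document_offset=None),
]

best = select_answer(answers)
if best is None:
    print("No confident answer")
else:
    span = best.document_offset
    # Slice the source document with the offsets to verify provenance.
    quoted = best.document.content[span.start:span.end]
    print(f"{best.data!r} (score {best.score:.2f}) from {best.document.id}: {quoted!r}")
```

Raising margin makes the system abstain more readily, which suits compliance-sensitive settings where a wrong answer costs more than a missing one.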
