Workflow:Deepset ai Haystack Extractive QA Pipeline
| Knowledge Sources | |
|---|---|
| Domains | NLP, Question_Answering, Information_Extraction |
| Last Updated | 2026-02-11 20:00 GMT |
Overview
End-to-end process for extracting precise answer spans from documents using a retriever and a Transformer-based extractive reader model.
Description
This workflow implements an extractive question answering pipeline that finds exact answer spans within documents rather than generating new text. It uses a BM25 retriever to find candidate documents and an ExtractiveReader (e.g., deepset/tinyroberta-squad2) to identify and score answer spans within those documents. Each answer includes the extracted text, a confidence score, document offsets, and a reference to the source document. The pipeline also returns a special "no answer" candidate with its own score, enabling the system to abstain when no confident answer exists.
Usage
Execute this workflow when you need precise, traceable answers extracted directly from source text rather than LLM-generated paraphrases. This is ideal for compliance-sensitive applications, legal document analysis, technical documentation search, or any use case where answer provenance and exact source attribution are critical.
Execution Steps
Step 1: Initialize Document Store
Create an InMemoryDocumentStore and populate it with the documents that will serve as the answer source. Documents are written directly to the store (no embedding required for BM25 retrieval).
Key considerations:
- BM25 retrieval operates on raw text content
- No embedding generation needed for this pipeline variant
Step 2: Configure BM25 Retriever
Instantiate an InMemoryBM25Retriever connected to the document store. The retriever fetches the most relevant documents based on keyword matching against the query.
Key considerations:
- top_k controls how many candidate documents are passed to the reader
- Higher top_k increases recall but slows reader inference
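As a rough illustration of Steps 1 and 2, the sketch below writes a few plain-text documents to an InMemoryDocumentStore and queries them with an InMemoryBM25Retriever. It assumes Haystack 2.x (the haystack-ai package); the helper name retrieve_candidates and the default top_k value are illustrative choices, not part of the workflow itself.

```python
from typing import Any


def retrieve_candidates(texts: list[str], query: str, top_k: int = 3) -> list[Any]:
    """Return up to top_k BM25-ranked Haystack Documents for the query.

    Hypothetical helper; assumes Haystack 2.x is installed (pip install haystack-ai).
    """
    # Imported inside the function so this module loads even where Haystack is absent.
    from haystack import Document
    from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
    from haystack.document_stores.in_memory import InMemoryDocumentStore

    # Step 1: write raw text to the store; BM25 needs no embeddings.
    store = InMemoryDocumentStore()
    store.write_documents([Document(content=text) for text in texts])

    # Step 2: keyword-based retrieval; top_k bounds what a reader would later see.
    retriever = InMemoryBM25Retriever(document_store=store, top_k=top_k)
    return retriever.run(query=query)["documents"]
```

Each returned Document should carry the BM25 relevance score assigned by the retriever. Note that the sketch assumes the input texts are distinct, since the store may reject documents with identical content under its default duplicate policy.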
Step 3: Configure Extractive Reader
Instantiate an ExtractiveReader with a pre-trained question answering model (e.g., deepset/tinyroberta-squad2). The reader examines each retrieved document and extracts candidate answer spans with confidence scores.
Key considerations:
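- The model must be warmed up before first inference: call warm_up() when using the reader standalone; Pipeline.run() triggers warm-up for its components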
- Returns multiple candidate answers ranked by confidence score
- Includes a "no answer" candidate when no_answer is enabled (the default); it is typically listed last, but is most reliably identified by data being None
- Supports model_kwargs for HuggingFace model configuration
Step 4: Connect and Run Pipeline
Connect the retriever output to the reader input and execute the pipeline. The query must be provided to both the retriever (for document selection) and the reader (for span extraction).
Pseudocode:
Connect retriever -> reader
Run with query for both retriever and reader components
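The pseudocode above might translate into Haystack 2.x roughly as follows. This is a sketch under stated assumptions, not the workflow's reference implementation: the function names build_pipeline, pipeline_inputs, and ask, the component names "retriever" and "reader", and the top_k defaults are all chosen for illustration. Running it downloads deepset/tinyroberta-squad2 from the Hugging Face Hub on first use.

```python
from typing import Any


def pipeline_inputs(query: str) -> dict[str, dict[str, str]]:
    """Build run() inputs: the same query goes to both components."""
    return {"retriever": {"query": query}, "reader": {"query": query}}


def build_pipeline(texts: list[str], retriever_top_k: int = 5, reader_top_k: int = 3) -> Any:
    """Assemble a BM25 retriever -> extractive reader pipeline (assumes Haystack 2.x)."""
    # Imported lazily so the module loads without Haystack and its ML dependencies.
    from haystack import Document, Pipeline
    from haystack.components.readers import ExtractiveReader
    from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
    from haystack.document_stores.in_memory import InMemoryDocumentStore

    store = InMemoryDocumentStore()
    store.write_documents([Document(content=text) for text in texts])

    pipeline = Pipeline()
    pipeline.add_component(
        "retriever", InMemoryBM25Retriever(document_store=store, top_k=retriever_top_k)
    )
    pipeline.add_component(
        "reader", ExtractiveReader(model="deepset/tinyroberta-squad2", top_k=reader_top_k)
    )
    # Feed the retrieved documents into the reader.
    pipeline.connect("retriever.documents", "reader.documents")
    return pipeline


def ask(pipeline: Any, query: str) -> list[Any]:
    """Run the pipeline and return the reader's list of ExtractedAnswer objects."""
    result = pipeline.run(data=pipeline_inputs(query))
    return result["reader"]["answers"]
```

A typical call would be ask(build_pipeline(texts), "your question"). Keeping pipeline_inputs separate makes the requirement explicit: the retriever uses the query to select documents, and the reader needs the same query to locate spans within them.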
Step 5: Process Extracted Answers
Interpret the pipeline output, which contains a list of ExtractedAnswer objects. Each answer includes the extracted text span, a confidence score, character offsets within the source document, and a reference to the source document for traceability.
Key considerations:
- Answers are ranked by confidence score (highest first)
- The "no answer" candidate has data set to None and is typically the last entry; checking data is more robust than relying on list position
- Compare top answer score against no_answer score for confidence thresholding
- Each answer includes document_offset for exact position tracking
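One way to apply the thresholding described above is sketched below. It relies only on the data and score attributes of each answer, and it locates the "no answer" candidate by data being None instead of by list position. The function name select_answer and the margin parameter are hypothetical, and the SimpleNamespace objects with their scores are made-up stand-ins for real reader output.

```python
from types import SimpleNamespace
from typing import Any, Optional, Sequence


def select_answer(answers: Sequence[Any], margin: float = 0.0) -> Optional[Any]:
    """Return the best extracted span, or None if the pipeline should abstain."""
    spans = [a for a in answers if a.data is not None]
    if not spans:
        return None
    best = max(spans, key=lambda a: a.score)
    # Score of the "no answer" candidate; 0.0 if the reader did not return one.
    no_answer_score = max((a.score for a in answers if a.data is None), default=0.0)
    # Abstain unless the best span beats "no answer" by at least `margin`.
    return best if best.score - no_answer_score >= margin else None


# Stand-in objects mimicking the `data` and `score` fields of ExtractedAnswer.
confident = [
    SimpleNamespace(data="Paris", score=0.92),
    SimpleNamespace(data="France", score=0.11),
    SimpleNamespace(data=None, score=0.07),
]
uncertain = [
    SimpleNamespace(data="Lyon", score=0.20),
    SimpleNamespace(data=None, score=0.80),
]

print(select_answer(confident).data)  # Paris
print(select_answer(uncertain))       # None
```

With real reader output, the selected answer's document and document_offset fields can then be used to cite the source document and the exact character range of the span. A margin above zero makes the system abstain more readily, which may suit compliance-sensitive uses.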