Implementation:Deepset ai Haystack DocumentMRREvaluator
Overview
DocumentMRREvaluator is a Haystack evaluator component that calculates the Mean Reciprocal Rank (MRR) of retrieved documents. It measures how high the first relevant document is ranked in the retrieved results for each query.
Implements Principle
Principle:Deepset_ai_Haystack_Retrieval_MRR_Evaluation
Source Location
haystack/components/evaluators/document_mrr.py (Lines 11-84)
Import
from haystack.components.evaluators import DocumentMRREvaluator
Component Registration
DocumentMRREvaluator is decorated with @component, making it a standard Haystack pipeline component.
API
Constructor
DocumentMRREvaluator requires no initialization parameters:
evaluator = DocumentMRREvaluator()
run()
def run(
    self,
    ground_truth_documents: list[list[Document]],
    retrieved_documents: list[list[Document]],
) -> dict[str, Any]:
Parameters:
- ground_truth_documents (list[list[Document]]) -- A list of expected documents for each question. Each inner list contains the ground truth documents for one query.
- retrieved_documents (list[list[Document]]) -- A list of retrieved documents for each question. Each inner list contains the documents returned by the retriever for one query.
Returns: A dictionary with the following keys:
- score (float) -- The average MRR score across all queries.
- individual_scores (list[float]) -- A list of reciprocal rank scores (0.0 to 1.0), one per query.
Raises:
- ValueError -- If ground_truth_documents and retrieved_documents have different lengths.
Algorithm
For each query:
- Extract the content strings from all ground truth documents (skipping those with None content).
- Iterate through the retrieved documents in order (by rank).
- When the first retrieved document with content matching a ground truth document is found, compute the reciprocal rank as 1 / (rank + 1) (rank is 0-indexed).
- If no match is found, the reciprocal rank is 0.0.
The final MRR score is the average of all individual reciprocal ranks.
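The steps above can be sketched in plain Python. This is a minimal illustration that operates on content strings rather than Document objects; it is not the library's source, and the function names reciprocal_rank and mean_reciprocal_rank are hypothetical. The handling of an empty input is an assumption of this sketch.

```python
from typing import Optional


def reciprocal_rank(
    ground_truth: list[Optional[str]], retrieved: list[Optional[str]]
) -> float:
    """Reciprocal rank of the first retrieved content that matches a ground truth content."""
    # Ground truth entries without content are skipped.
    expected = {content for content in ground_truth if content is not None}
    for rank, content in enumerate(retrieved):  # rank is 0-indexed
        if content is not None and content in expected:
            return 1 / (rank + 1)
    return 0.0  # no relevant document was retrieved


def mean_reciprocal_rank(
    ground_truths: list[list[Optional[str]]],
    retrieved_lists: list[list[Optional[str]]],
) -> dict:
    """Average the per-query reciprocal ranks (hypothetical helper)."""
    if len(ground_truths) != len(retrieved_lists):
        raise ValueError("ground truth and retrieved lists must have the same length")
    individual_scores = [
        reciprocal_rank(truth, retrieved)
        for truth, retrieved in zip(ground_truths, retrieved_lists)
    ]
    # Assumption of this sketch: an empty input yields a score of 0.0.
    score = sum(individual_scores) / len(individual_scores) if individual_scores else 0.0
    return {"score": score, "individual_scores": individual_scores}


result = mean_reciprocal_rank(
    [["France"], ["9th century", "9th"], ["Paris"]],
    [["France"], ["10th century", "9th"], ["Berlin", "Rome"]],
)
print(result["individual_scores"])  # [1.0, 0.5, 0.0]
print(result["score"])  # 0.5
```

In this example the second query finds its first match at position 2, giving 1/2, and the third query finds no match, giving 0.0, so the mean is (1.0 + 0.5 + 0.0) / 3 = 0.5.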
Usage Example
from haystack import Document
from haystack.components.evaluators import DocumentMRREvaluator

evaluator = DocumentMRREvaluator()
result = evaluator.run(
    ground_truth_documents=[
        [Document(content="France")],
        [Document(content="9th century"), Document(content="9th")],
    ],
    retrieved_documents=[
        [Document(content="France")],
        [Document(content="9th century"), Document(content="10th century"), Document(content="9th")],
    ],
)
print(result["individual_scores"])
# [1.0, 1.0]
print(result["score"])
# 1.0
Integration in Evaluation Pipelines
DocumentMRREvaluator can be used as a standalone component or within an evaluation pipeline alongside other evaluators:
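from haystack import Document, Pipeline
from haystack.components.evaluators import DocumentMAPEvaluator, DocumentMRREvaluator

# Illustrative inputs: one inner list of documents per query.
ground_truths = [[Document(content="France")]]
retrieved_docs = [[Document(content="France"), Document(content="Germany")]]

eval_pipeline = Pipeline()
eval_pipeline.add_component("mrr_evaluator", DocumentMRREvaluator())
eval_pipeline.add_component("map_evaluator", DocumentMAPEvaluator())

# Both evaluators receive the same ground truth and retrieved documents.
results = eval_pipeline.run({
    "mrr_evaluator": {
        "ground_truth_documents": ground_truths,
        "retrieved_documents": retrieved_docs,
    },
    "map_evaluator": {
        "ground_truth_documents": ground_truths,
        "retrieved_documents": retrieved_docs,
    },
})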
Important Notes
- No normalization: DocumentMRREvaluator does not normalize its inputs. Use the DocumentCleaner component to clean and normalize documents before passing them to this evaluator.
- Content-based matching: Matching is performed on the content attribute of Document objects. Documents with None content are skipped.
- Deterministic: The evaluator is fully deterministic and requires no external services or models.
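To illustrate why normalization matters under exact content matching, the following sketch compares scores for raw and cleaned strings. It uses plain strings and a hypothetical normalize helper as a stand-in for a cleaning step; it does not call DocumentCleaner and makes no claim about that component's exact behavior.

```python
def first_match_reciprocal_rank(ground_truth: list[str], retrieved: list[str]) -> float:
    """Exact-match reciprocal rank on content strings (illustrative stand-in)."""
    expected = set(ground_truth)
    for rank, content in enumerate(retrieved):  # rank is 0-indexed
        if content in expected:
            return 1 / (rank + 1)
    return 0.0


def normalize(text: str) -> str:
    """Hypothetical cleanup: collapse whitespace runs and strip both ends."""
    return " ".join(text.split())


ground_truth = ["9th century"]
retrieved = ["10th century", "  9th   century\n"]

raw_score = first_match_reciprocal_rank(ground_truth, retrieved)
clean_score = first_match_reciprocal_rank(
    [normalize(text) for text in ground_truth],
    [normalize(text) for text in retrieved],
)
print(raw_score)    # 0.0: stray whitespace prevents an exact match
print(clean_score)  # 0.5: the match is found at the second position
```

Because matching is an exact string comparison, even differences invisible to a reader, such as extra spaces or a trailing newline, turn a relevant document into a miss.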
Dependencies
- haystack core library (Document, component decorator)
- No external dependencies required.