Implementation:Deepset ai Haystack DocumentMAPEvaluator
Overview
DocumentMAPEvaluator is a Haystack evaluator component that calculates the Mean Average Precision (MAP) of retrieved documents. For each query, it scores how high the relevant documents appear in the retrieved ranking, then averages those per-query scores across all queries.
Implements Principle
Principle:Deepset_ai_Haystack_Retrieval_MAP_Evaluation
Source Location
haystack/components/evaluators/document_map.py (Lines 11-90)
Import
from haystack.components.evaluators import DocumentMAPEvaluator
Component Registration
DocumentMAPEvaluator is decorated with @component, making it a standard Haystack pipeline component.
API
Constructor
DocumentMAPEvaluator requires no initialization parameters:
evaluator = DocumentMAPEvaluator()
run()
def run(
    self,
    ground_truth_documents: list[list[Document]],
    retrieved_documents: list[list[Document]]
) -> dict[str, Any]:
Parameters:
- ground_truth_documents (list[list[Document]]) -- A list of expected documents for each question. Each inner list contains the ground truth documents for one query.
- retrieved_documents (list[list[Document]]) -- A list of retrieved documents for each question. Each inner list contains the documents returned by the retriever for one query.
Returns: A dictionary with the following keys:
- score (float) -- The average MAP score across all queries.
- individual_scores (list[float]) -- A list of Average Precision scores (0.0 to 1.0) for each query.
Raises:
- ValueError -- If ground_truth_documents and retrieved_documents have different lengths.
Algorithm
For each query:
- Extract the content strings from all ground truth documents (skipping those with None content).
- Initialize a running count of relevant documents found (relevant_documents) and a numerator accumulator (average_precision_numerator).
- Iterate through retrieved documents in rank order, where rank is the zero-based position in the list. For each retrieved document whose content matches a ground truth document:
  - Increment relevant_documents.
  - Add relevant_documents / (rank + 1) to the numerator (this is precision at the current rank).
- Compute Average Precision as average_precision_numerator / relevant_documents (or 0.0 if no relevant documents were found).
The final MAP score is the average of all individual Average Precision values.
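The steps above can be sketched in plain Python. This is a minimal illustration that works on content strings rather than Document objects; it is not the library source. The function names average_precision and mean_average_precision are hypothetical, and returning 0.0 for an empty list of queries is an assumption made only for this sketch.

```python
from typing import Optional, Sequence


def average_precision(
    ground_truth: Sequence[Optional[str]],
    retrieved: Sequence[Optional[str]],
) -> float:
    """Average Precision for one query, following the steps listed above."""
    # Step 1: keep only ground truth contents that are not None.
    relevant_contents = {c for c in ground_truth if c is not None}

    # Step 2: running count of hits and the numerator accumulator.
    relevant_documents = 0
    average_precision_numerator = 0.0

    # Step 3: walk the retrieved contents in rank order (rank is zero-based).
    for rank, content in enumerate(retrieved):
        if content is not None and content in relevant_contents:
            relevant_documents += 1
            # Precision at this position: hits so far / documents seen so far.
            average_precision_numerator += relevant_documents / (rank + 1)

    # Step 4: divide by the number of hits; 0.0 when nothing relevant was found.
    if relevant_documents == 0:
        return 0.0
    return average_precision_numerator / relevant_documents


def mean_average_precision(
    ground_truths: Sequence[Sequence[Optional[str]]],
    retrieved_lists: Sequence[Sequence[Optional[str]]],
) -> tuple[float, list[float]]:
    """Return (MAP score, per-query Average Precision values)."""
    if len(ground_truths) != len(retrieved_lists):
        raise ValueError("Both inputs must contain one list per query.")
    scores = [average_precision(g, r) for g, r in zip(ground_truths, retrieved_lists)]
    # Assumption for this sketch only: an empty input scores 0.0.
    score = sum(scores) / len(scores) if scores else 0.0
    return score, scores
```

One consequence of dividing by the number of relevant documents found, rather than by the total number of ground truth documents, is that a query which retrieves only one of two expected documents, at the top rank, still receives an Average Precision of 1.0 under this procedure.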
Usage Example
from haystack import Document
from haystack.components.evaluators import DocumentMAPEvaluator
evaluator = DocumentMAPEvaluator()
result = evaluator.run(
    ground_truth_documents=[
        [Document(content="France")],
        [Document(content="9th century"), Document(content="9th")],
    ],
    retrieved_documents=[
        [Document(content="France")],
        [Document(content="9th century"), Document(content="10th century"), Document(content="9th")],
    ],
)
print(result["individual_scores"])
# [1.0, 0.8333333333333333]
print(result["score"])
# 0.9166666666666666
In this example, the second query has two relevant documents at ranks 1 and 3 (with an irrelevant document at rank 2), yielding an AP of (1/2) * (1/1 + 2/3) = 0.833.
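The same arithmetic can be written as a general formula. The symbols are introduced here for illustration and do not come from the library: rel(k) is 1 if the document at 1-based rank k is relevant and 0 otherwise, P(k) is the precision over the first k retrieved documents, R is the number of relevant documents that were retrieved, n is the number of retrieved documents, and Q is the number of queries.

```latex
\mathrm{AP} = \frac{1}{R} \sum_{k=1}^{n} P(k)\,\mathrm{rel}(k),
\qquad
\mathrm{MAP} = \frac{1}{Q} \sum_{q=1}^{Q} \mathrm{AP}_q

% Second query of the example: hits at ranks 1 and 3, so R = 2
\mathrm{AP}_2 = \frac{1}{2}\left(\frac{1}{1} + \frac{2}{3}\right) = \frac{5}{6} \approx 0.833,
\qquad
\mathrm{MAP} = \frac{1}{2}\left(1 + \frac{5}{6}\right) = \frac{11}{12} \approx 0.917
```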
Integration in Evaluation Pipelines
DocumentMAPEvaluator can be combined with other evaluators in a pipeline:
from haystack import Pipeline
from haystack.components.evaluators import DocumentMAPEvaluator, DocumentMRREvaluator
eval_pipeline = Pipeline()
eval_pipeline.add_component("map_evaluator", DocumentMAPEvaluator())
eval_pipeline.add_component("mrr_evaluator", DocumentMRREvaluator())
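# ground_truths and retrieved_docs are assumed to be list[list[Document]] values defined earlier.
results = eval_pipeline.run({
    "map_evaluator": {
        "ground_truth_documents": ground_truths,
        "retrieved_documents": retrieved_docs,
    },
    "mrr_evaluator": {
        "ground_truth_documents": ground_truths,
        "retrieved_documents": retrieved_docs,
    },
})
# Outputs are keyed by component name, e.g. results["map_evaluator"]["score"].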
Important Notes
- No normalization: DocumentMAPEvaluator does not normalize its inputs. Use the DocumentCleaner component to clean and normalize documents before passing them to this evaluator.
- Content-based matching: Matching is performed on the content attribute of Document objects. Documents with None content are skipped.
- Deterministic: The evaluator is fully deterministic and requires no external services or models.
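To see why the lack of normalization matters, the following standalone sketch compares exact content matching with matching after a cleanup step. The normalize helper is hypothetical and exists only for this illustration; it is not part of Haystack, and its rules (trimming, whitespace collapsing, lowercasing) are assumptions rather than a description of what DocumentCleaner does.

```python
def normalize(text: str) -> str:
    """Hypothetical normalizer: trim, collapse whitespace runs, lowercase."""
    return " ".join(text.split()).lower()


ground_truth = ["France"]
retrieved = ["  france\n"]

# Exact string comparison: stray whitespace or casing prevents a match.
exact_match = retrieved[0] in ground_truth

# Comparison after normalizing both sides.
normalized_match = normalize(retrieved[0]) in {normalize(t) for t in ground_truth}

print(exact_match, normalized_match)  # False True
```

With exact matching, a retrieved document that differs from the ground truth only in whitespace or casing counts as irrelevant, which lowers the reported score even though the retriever found the right content.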
Dependencies
- haystack core library (Document, component decorator)
- No external dependencies required.