Principle:Deepset ai Haystack Retrieval MRR Evaluation

From Leeroopedia

Overview

Mean Reciprocal Rank (MRR) measures retrieval quality by the position of the first relevant document in a ranked list of results. It is one of the most widely used metrics in information retrieval evaluation, rewarding systems that place relevant documents early in the result set.

Domains

  • Evaluation
  • Information_Retrieval

Theoretical Foundation

MRR is defined as the average of the reciprocal ranks across all queries:

MRR = (1/Q) * sum(1/rank_i) for i = 1..Q

Where:

  • Q is the total number of queries evaluated.
  • rank_i is the position (1-indexed) of the first relevant document in the retrieved results for query i.
  • If no relevant document is found in the retrieved results for a given query, the reciprocal rank for that query is 0.0.
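The definition above can be sketched in a few lines of Python. This is a minimal illustration, not code from any library: the function name `mean_reciprocal_rank` and the convention of passing `None` for a query with no relevant result are assumptions made for the example.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_relevant_ranks: Sequence[Optional[int]]) -> float:
    """Average the reciprocal ranks over all queries.

    Each entry is the 1-indexed rank of the first relevant document for one
    query, or None when no relevant document was retrieved (hypothetical
    convention chosen for this sketch).
    """
    if not first_relevant_ranks:
        raise ValueError("at least one query is required")
    total = 0.0
    for rank in first_relevant_ranks:
        if rank is None:
            continue  # no relevant document: reciprocal rank is 0.0
        if rank < 1:
            raise ValueError("ranks are 1-indexed")
        total += 1.0 / rank
    # Divide by Q, the total number of queries, including those with no hit.
    return total / len(first_relevant_ranks)


# Three queries: hit at rank 1, hit at rank 2, no hit.
print(mean_reciprocal_rank([1, 2, None]))  # (1.0 + 0.5 + 0.0) / 3 = 0.5
```

Note that queries without a relevant result still count toward Q, so they pull the average down rather than being skipped.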

Key Properties

  • Focus on first relevant result: MRR only considers the rank of the first relevant document. It does not account for the positions of additional relevant documents in the result set.
  • Score range: MRR scores range from 0.0 to 1.0. A score of 1.0 means every query had its first relevant document at rank 1.
  • Sensitivity to top positions: The metric is heavily weighted toward top positions. A relevant document at rank 1 contributes 1.0, at rank 2 contributes 0.5, at rank 3 contributes about 0.333, and so on, so each step down the ranking matters less than the one before it.
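To make the weighting toward top positions concrete, the following sketch prints the contribution of a few ranks and compares the gain from moving one position up near the top with the same move further down. The helper `reciprocal_rank_gain` is a hypothetical name introduced only for this illustration.

```python
def reciprocal_rank_gain(old_rank: int, new_rank: int) -> float:
    """Change in one query's reciprocal rank when its first relevant
    document moves from old_rank to new_rank (both 1-indexed)."""
    return 1.0 / new_rank - 1.0 / old_rank


# Per-query contribution at a few ranks.
for rank in (1, 2, 3, 10):
    print(f"rank {rank:>2}: contributes {1.0 / rank:.3f}")

# Moving up one position is worth far more near the top of the list.
print(f"rank 2 -> 1 : +{reciprocal_rank_gain(2, 1):.3f}")
print(f"rank 10 -> 9: +{reciprocal_rank_gain(10, 9):.3f}")
```

Moving the first relevant document from rank 2 to rank 1 adds 0.5 to that query's score, while moving it from rank 10 to rank 9 adds only about 0.011.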

When to Use MRR

MRR is ideal for scenarios where the user is primarily interested in finding one correct result as quickly as possible:

  • Question answering: Where a single correct passage suffices.
  • Navigational queries: Where the user seeks one specific document.
  • Retrieval-Augmented Generation (RAG): Where finding at least one relevant context document early is critical for answer quality.

Limitations

  • Does not reward finding multiple relevant documents. For that, use MAP or Recall.
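  • Ignores where the second, third, or later relevant documents appear. Only the position of the first relevant document affects the score.
  • Sensitive to the definition of "relevance" -- relevance is decided by content-based matching, so ground truth and retrieved documents must be normalized consistently or matching documents will be scored as misses.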
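The first two limitations can be seen with a small sketch. The two rankings below are made-up data for a single query assumed to have three relevant documents; `reciprocal_rank` and `recall` are illustrative helpers, not library functions.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """1 / rank of the first True flag, or 0.0 if there is none."""
    for position, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / position
    return 0.0


def recall(relevance: Sequence[bool], total_relevant: int) -> float:
    """Fraction of all relevant documents that appear in the ranking."""
    return sum(relevance) / total_relevant


# Hypothetical rankings: True marks a relevant document at that position.
ranking_a = [True, False, False, False]  # finds 1 of 3 relevant documents
ranking_b = [True, True, True, False]    # finds all 3 relevant documents

# Both rankings get the same reciprocal rank ...
print(reciprocal_rank(ranking_a), reciprocal_rank(ranking_b))
# ... although their recall differs considerably.
print(recall(ranking_a, 3), recall(ranking_b, 3))
```

Both rankings score a reciprocal rank of 1.0, so MRR alone cannot tell them apart; a metric such as Recall or MAP is needed to capture the difference.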

Relationship to Implementation

In the Haystack framework, this principle is realized by the DocumentMRREvaluator component, which:

  • Accepts lists of ground truth documents and retrieved documents per query.
  • Computes the reciprocal rank for each query based on content matching.
  • Returns both individual per-query scores and the aggregated MRR score.
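The three steps above can be approximated in plain Python. This sketch does not call Haystack: `evaluate_mrr` is a hypothetical stand-in, documents are represented as plain content strings, exact string equality is an assumption about what "content matching" means, and the result keys `individual_scores` and `score` are chosen to mirror the per-query and aggregated outputs described above.

```python
from typing import Dict, List, Sequence, Union


def evaluate_mrr(
    ground_truth_documents: Sequence[Sequence[str]],
    retrieved_documents: Sequence[Sequence[str]],
) -> Dict[str, Union[float, List[float]]]:
    """Compute per-query reciprocal ranks and their mean.

    Both arguments hold one list of document contents per query.
    A retrieved document counts as relevant when its content is exactly
    equal to the content of a ground truth document (assumption).
    """
    if len(ground_truth_documents) != len(retrieved_documents):
        raise ValueError("both inputs need one entry per query")
    if not ground_truth_documents:
        raise ValueError("at least one query is required")

    individual_scores: List[float] = []
    for truth, retrieved in zip(ground_truth_documents, retrieved_documents):
        truth_contents = set(truth)
        score = 0.0  # stays 0.0 when nothing relevant was retrieved
        for rank, content in enumerate(retrieved, start=1):
            if content in truth_contents:
                score = 1.0 / rank
                break  # only the first relevant document matters
        individual_scores.append(score)

    return {
        "individual_scores": individual_scores,
        "score": sum(individual_scores) / len(individual_scores),
    }


result = evaluate_mrr(
    ground_truth_documents=[
        ["Berlin is the capital of Germany."],
        ["Paris is the capital of France."],
        ["Rome is the capital of Italy."],
    ],
    retrieved_documents=[
        ["Berlin is the capital of Germany.", "Munich is in Bavaria."],
        ["Lyon is in France.", "Paris is the capital of France."],
        ["rome is the capital of italy."],  # casing differs, so no match
    ],
)
print(result["individual_scores"])  # [1.0, 0.5, 0.0]
print(result["score"])              # 0.5
```

The third query shows why consistent normalization matters under exact content matching: a document that differs only in casing is scored as a miss.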

Related Principles

  • Retrieval MAP Evaluation -- considers the full ranking of all relevant documents.
  • Retrieval Recall Evaluation -- measures the proportion of relevant documents retrieved.
