Implementation:Vibrantlabsai Ragas QuotedSpans

From Leeroopedia
Knowledge Sources
Domains Evaluation, Metrics
Last Updated 2026-02-12 00:00 GMT

Overview

The quoted spans alignment metric measures citation accuracy by computing the fraction of quoted text spans in model-generated answers that can be found verbatim in the provided source passages.

Description

This module provides a functional (non-class-based) metric for evaluating citation alignment in model-generated answers. The core idea is that when a model quotes text within quotation marks, those quoted spans should be traceable back to the source documents.

The algorithm works as follows:

  1. Extract Quoted Spans -- A regular expression (_QUOTE_RE) matches text enclosed in straight quotes ("), curly quotes, or other common quotation mark characters. Spans shorter than a configurable minimum word count (min_len, default 3 words) are discarded to avoid spurious matches.
  2. Normalize Text -- Both the quoted spans and source passages undergo light normalization: whitespace is collapsed and text is lowercased (when casefold is True, the default).
  3. Substring Matching -- For each answer, all source passages are joined into a single string. Each extracted quoted span is then checked for substring membership in the normalized source text.
  4. Compute Score -- The final score is the fraction of matched spans over the total number of extracted spans: matched / total. If no quoted spans are found across the entire dataset, the score defaults to 0.0.
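
The four steps can be sketched in a few lines of standalone Python. This is an illustrative reimplementation, not the module's source: the names QUOTE_RE, normalize, extract_quoted_spans, and sketch_alignment are hypothetical, the quote pattern covers only straight and curly double quotes, and collapsing whitespace even when casefold is False is an assumption.

```python
import re
from typing import Dict, List, Sequence

# Assumed pattern: straight and curly double quotes only. The module's
# _QUOTE_RE may accept a different set of quotation characters.
QUOTE_RE = re.compile(r'["“”]([^"“”]+)["“”]')


def normalize(text: str, casefold: bool = True) -> str:
    """Collapse runs of whitespace and, if requested, lowercase the text."""
    collapsed = re.sub(r"\s+", " ", text).strip()
    return collapsed.lower() if casefold else collapsed


def extract_quoted_spans(answer: str, min_len: int = 3) -> List[str]:
    """Return quoted spans that contain at least min_len words."""
    spans = (match.strip() for match in QUOTE_RE.findall(answer))
    return [span for span in spans if len(span.split()) >= min_len]


def sketch_alignment(
    answers: Sequence[str],
    sources: Sequence[Sequence[str]],
    *,
    casefold: bool = True,
    min_len: int = 3,
) -> Dict[str, float]:
    matched = 0
    total = 0
    for answer, passages in zip(answers, sources):
        # Join all passages for this answer into one searchable string.
        haystack = normalize(" ".join(passages), casefold)
        for span in extract_quoted_spans(answer, min_len):
            total += 1
            if normalize(span, casefold) in haystack:
                matched += 1
    # No quoted spans anywhere in the batch: the score falls back to 0.0.
    score = matched / total if total else 0.0
    return {
        "citation_alignment_quoted_spans": score,
        "matched": float(matched),
        "total": float(total),
    }
```

Because matching is plain substring membership after normalization, a quote that differs from the source by a single word counts as unmatched.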

The function processes batches of answers and their corresponding source lists, making it suitable for evaluation pipelines.

Usage

Use this metric when evaluating models that produce answers with direct quotations or citations from source documents. It is particularly useful for retrieval-augmented generation (RAG) systems where the model is expected to cite passages verbatim. A high score indicates that the model accurately quotes from its sources, while a low score suggests hallucinated or inaccurate citations.
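
To make the interpretation concrete, here is a small worked example using made-up data: one quoted span is verbatim and one alters a figure. It applies only the substring check to spans that are assumed to be already extracted and lowercased, so it does not call the library.

```python
# Hypothetical source passage and quoted spans, already lowercased.
source_text = "revenue grew 12 percent in the third quarter, driven by cloud services."
quoted_spans = [
    "revenue grew 12 percent in the third quarter",  # verbatim quote
    "revenue grew 15 percent in the third quarter",  # altered figure
]

# A span counts as matched only if it is an exact substring of the source.
flags = [span in source_text for span in quoted_spans]
score = sum(flags) / len(flags)

print(flags)  # [True, False]
print(score)  # 0.5
```

A score of 0.5 here means half of the quotations could not be traced to the source, even though the altered quote differs by only one token.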

Code Reference

Source Location

Signature

def quoted_spans_alignment(
    answers: Sequence[str],
    sources: Sequence[Sequence[str]],
    *,
    casefold: bool = True,
    min_len: int = 3,
) -> Dict[str, float]:

Import

from ragas.metrics.quoted_spans import quoted_spans_alignment

I/O Contract

Inputs

Name | Type | Required | Description
answers | Sequence[str] | Yes | List of model-generated answers (length N) potentially containing quoted spans
sources | Sequence[Sequence[str]] | Yes | List of lists (length N) of source passages corresponding to each answer
casefold | bool | No | Whether to normalize text by lowercasing before matching (default: True)
min_len | int | No | Minimum number of words in a quoted span for it to be considered (default: 3)
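
Since answers and sources must line up one-to-one, a caller may want to check shapes before scoring. The helper below is a hypothetical pre-check (check_batch_shapes is not part of the library, and whether the library performs its own validation is not stated here). It also guards against passing a bare string where a list of passages is expected: a str satisfies Sequence[str] in Python, so a type checker would not flag it, yet joining it with spaces would separate every character.

```python
from typing import Sequence


def check_batch_shapes(
    answers: Sequence[str], sources: Sequence[Sequence[str]]
) -> None:
    """Raise if the two inputs cannot be paired answer-by-answer."""
    if len(answers) != len(sources):
        raise ValueError(
            f"answers has {len(answers)} items but sources has {len(sources)}"
        )
    for i, passages in enumerate(sources):
        # " ".join("abc") yields "a b c", which would break substring matching.
        if isinstance(passages, str):
            raise TypeError(f"sources[{i}] is a bare string; wrap it in a list")


# Passes silently: one answer paired with a list of two passages.
check_batch_shapes(["some answer"], [["passage one", "passage two"]])
```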

Outputs

Name | Type | Description
citation_alignment_quoted_spans | float | Fraction of quoted spans found verbatim in the sources (0.0 to 1.0)
matched | float | Number of quoted spans that were matched in the sources
total | float | Total number of quoted spans extracted from the answers
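
Because the result carries the raw matched and total counts alongside the ratio, results from several batches can be pooled by summing the counts instead of averaging the scores. The helper below is a hypothetical sketch (combine_results is not a library function) that assumes each input dict has the three keys listed above.

```python
from typing import Dict, Iterable


def combine_results(results: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Pool several result dicts by summing their matched and total counts."""
    matched = 0.0
    total = 0.0
    for result in results:
        matched += result["matched"]
        total += result["total"]
    score = matched / total if total else 0.0
    return {
        "citation_alignment_quoted_spans": score,
        "matched": matched,
        "total": total,
    }


# Made-up results for two batches of different sizes.
batch_a = {"citation_alignment_quoted_spans": 1.0, "matched": 1.0, "total": 1.0}
batch_b = {"citation_alignment_quoted_spans": 0.5, "matched": 2.0, "total": 4.0}

pooled = combine_results([batch_a, batch_b])
print(pooled["citation_alignment_quoted_spans"])  # 0.6
```

Averaging the two batch scores directly would give 0.75, which overweights the single-span batch; pooling the counts weights every quoted span equally.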

Internal Helper Functions

Function | Purpose
_normalize(text) | Collapses whitespace and lowercases text for consistent matching
_extract_quoted_spans(answer, min_len=3) | Extracts quoted text spans from an answer using regex, filtering by minimum word count

Usage Examples

Basic Usage

from ragas.metrics.quoted_spans import quoted_spans_alignment

answers = [
    'The report states "climate change is accelerating rapidly" according to the study.',
    'The author noted "economic growth remained steady throughout the quarter" in the analysis.',
]

sources = [
    ["Climate change is accelerating rapidly, with temperatures rising each year."],
    ["Economic growth remained steady throughout the quarter, exceeding expectations."],
]

result = quoted_spans_alignment(answers, sources)
print(result)
# {
#     "citation_alignment_quoted_spans": 1.0,
#     "matched": 2.0,
#     "total": 2.0,
# }

Case-Sensitive Matching

from ragas.metrics.quoted_spans import quoted_spans_alignment

answers = ['He said "The Quick Brown Fox jumped over the lazy dog" in his speech.']
sources = [["the quick brown fox jumped over the lazy dog"]]

# With casefold (default): matches
result_casefold = quoted_spans_alignment(answers, sources, casefold=True)
# citation_alignment_quoted_spans: 1.0

# Without casefold: no match, because the quoted span and the source differ in case
result_exact = quoted_spans_alignment(answers, sources, casefold=False)
# citation_alignment_quoted_spans: 0.0
