Implementation:Explodinggradients Ragas SummarizationScore Metric
| Field | Value |
|---|---|
| source | Repo |
| domains | Metrics, Evaluation |
| last_updated | 2026-02-10 |
Overview
SummarizationScore evaluates the quality of text summaries by extracting keyphrases from the source, generating closed-ended questions, and scoring how well the summary answers those questions.
Description
The SummarizationScore class uses a three-step LLM pipeline to evaluate summaries:
- Keyphrase Extraction -- Extracts named entities and keyphrases (persons, organizations, locations, dates, monetary values, percentages) from the source text using the ExtractKeyphrasePrompt.
- Question Generation -- Generates closed-ended (yes/no) questions based on the source text and extracted keyphrases using the GenerateQuestionsPrompt.
- Answer Generation -- Evaluates whether the summary contains sufficient information to answer each question using the GenerateAnswersPrompt.
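The data flow between the three steps can be sketched as follows. This is a minimal illustration, not the library's code: run_pipeline and the toy_* functions are hypothetical stand-ins for the LLM-backed prompts, and verdicts are represented as booleans for simplicity.

```python
from typing import Callable, List

# Hypothetical stand-ins for the three LLM-backed prompts; in the library these
# steps are driven by ExtractKeyphrasePrompt, GenerateQuestionsPrompt and
# GenerateAnswersPrompt.
ExtractFn = Callable[[str], List[str]]
QuestionFn = Callable[[str, List[str]], List[str]]
AnswerFn = Callable[[str, List[str]], List[bool]]

QUESTION_PREFIX = "Does the text mention "


def run_pipeline(
    source: str,
    summary: str,
    extract_keyphrases: ExtractFn,
    generate_questions: QuestionFn,
    generate_answers: AnswerFn,
) -> List[bool]:
    """Return one verdict per question: True if the summary can answer it."""
    keyphrases = extract_keyphrases(source)             # step 1 reads the source
    questions = generate_questions(source, keyphrases)  # step 2 reads source + keyphrases
    return generate_answers(summary, questions)         # step 3 reads the summary


# Toy, rule-based substitutes for the LLM calls (assumptions for illustration).
def toy_extract(source: str) -> List[str]:
    # Treat capitalized words and numbers as "keyphrases".
    return [w.strip(".,") for w in source.split() if w[0].isupper() or w[0].isdigit()]


def toy_questions(source: str, keyphrases: List[str]) -> List[str]:
    return [f"{QUESTION_PREFIX}{k}?" for k in keyphrases]


def toy_answers(summary: str, questions: List[str]) -> List[bool]:
    # A question counts as answerable if its keyphrase appears in the summary.
    return [q[len(QUESTION_PREFIX):-1] in summary for q in questions]


verdicts = run_pipeline(
    source="Apple was founded in 1976 in Cupertino.",
    summary="Apple was founded in 1976.",
    extract_keyphrases=toy_extract,
    generate_questions=toy_questions,
    generate_answers=toy_answers,
)
print(verdicts)  # [True, True, False]
```

The summary is only consulted in the last step, so the questions depend on the source alone and the same question set could be reused to compare several candidate summaries.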
The final score combines a QA score (the ratio of generated questions that the summary can answer) with an optional conciseness penalty based on summary length relative to the source text. The class inherits from MetricWithLLM and SingleTurnMetric.
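A minimal sketch of how the two components might be combined. It assumes a linear weighting in which coeff is the weight of the conciseness term and 1 - coeff the weight of the QA term, and it assumes length is measured in characters. The function names are illustrative, not the library's, and only the case with the length penalty enabled is covered.

```python
from typing import List


def qa_score(verdicts: List[bool]) -> float:
    """Fraction of generated questions that the summary can answer."""
    if not verdicts:
        raise ValueError("at least one question is required")
    return sum(verdicts) / len(verdicts)


def conciseness_score(source: str, summary: str) -> float:
    """Closer to 1.0 for short summaries; 0.0 when the summary is as long as the source."""
    if not source:
        raise ValueError("source text must be non-empty")
    return 1.0 - min(len(summary), len(source)) / len(source)


def combined_score(qa: float, conciseness: float, coeff: float = 0.5) -> float:
    """Weighted combination (assumed form): coeff weights the conciseness term."""
    if not 0.0 <= coeff <= 1.0:
        raise ValueError("coeff must lie in [0, 1]")
    return (1.0 - coeff) * qa + coeff * conciseness


# Worked example: half the questions are answerable and the summary is a
# quarter of the source length.
qa = qa_score([True, False])                      # 0.5
conc = conciseness_score("x" * 100, "x" * 25)     # 0.75
print(combined_score(qa, conc, coeff=0.5))        # 0.625
```

Under this weighting, a summary that copies the source verbatim answers every question but earns no conciseness credit, so its score is capped at 1 - coeff.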
Key attributes:
- length_penalty -- Whether to apply a conciseness penalty (default True).
- coeff -- Weight for the conciseness score in the final combination (default 0.5).
- extract_keyphrases_prompt -- Prompt for keyphrase extraction.
- question_generation_prompt -- Prompt for question generation.
- answer_generation_prompt -- Prompt for answer evaluation.
Usage
The metric requires reference_contexts (the source text as a list of context strings) and response (the summary). An LLM must be configured.
Code Reference
| Property | Value |
|---|---|
| Source Location | src/ragas/metrics/_summarization.py L143-241 |
| Class Signature | class SummarizationScore(MetricWithLLM, SingleTurnMetric) |
| Import | from ragas.metrics import SummarizationScore |
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| reference_contexts | List[str] | Yes | The source text passages to be summarized |
| response | str | Yes | The generated summary to evaluate |
Outputs
| Output | Type | Description |
|---|---|---|
| score | float | Combined QA score and optional conciseness score (0.0 to 1.0) |
Usage Examples
from ragas.metrics import SummarizationScore
from ragas.dataset_schema import SingleTurnSample
metric = SummarizationScore(length_penalty=True, coeff=0.5)
# metric.llm = ... # Set your LLM
sample = SingleTurnSample(
    reference_contexts=[
        "Apple Inc. is a technology company based in Cupertino, California. Founded by Steve Jobs in 1976, it reached a market capitalization of $3 trillion in 2023."
    ],
    response="Apple Inc., founded in 1976, is a major tech company based in California.",
)
# score = await metric.single_turn_ascore(sample)
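Because single_turn_ascore is a coroutine, it has to be awaited inside an event loop. The sketch below shows one way to drive it from a plain script with asyncio.run. StubMetric is a hypothetical stand-in that returns a fixed dummy value so the example runs without an LLM; with a real, configured SummarizationScore the same score_sample helper would apply.

```python
import asyncio


class StubMetric:
    """Hypothetical stand-in for a configured SummarizationScore instance."""

    async def single_turn_ascore(self, sample) -> float:
        await asyncio.sleep(0)  # placeholder for the LLM round trips
        return 0.5              # fixed dummy value, not a real score


async def score_sample(metric, sample) -> float:
    # Await the metric's async scoring method for a single sample.
    return await metric.single_turn_ascore(sample)


# asyncio.run creates an event loop, runs the coroutine, and closes the loop.
score = asyncio.run(score_sample(StubMetric(), sample=None))
print(score)  # 0.5
```

In a notebook or any other environment that already runs an event loop, awaiting the coroutine directly is the usual alternative to asyncio.run.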
A pre-configured instance is available:
from ragas.metrics._summarization import summarization_score
Related Pages
- Explodinggradients_Ragas_Faithfulness_Metric -- Statement-level faithfulness evaluation
- Explodinggradients_Ragas_ContextRecall_Metric -- Context recall based on statement attribution
- Explodinggradients_Ragas_FactualCorrectness_Metric -- Claim decomposition-based factual evaluation