| Field | Value |
|---|---|
| source | Repo |
| domains | Metrics, NLP |
| last_updated | 2026-02-10 00:00 GMT |
Overview
RougeScore is a v2 class-based metric that calculates the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) score between reference and response texts using the rouge_score library.
Description
RougeScore extends BaseMetric to provide an async-first ROUGE score implementation. ROUGE is a recall-oriented metric family widely used in summarization and text generation evaluation. This metric supports two ROUGE variants -- rouge1 (unigram overlap) and rougeL (longest common subsequence) -- and three scoring modes: fmeasure, precision, and recall. It uses the rouge_score.rouge_scorer.RougeScorer with stemming enabled. No LLM or embedding components are required.
Usage
Instantiate with optional rouge_type and mode parameters. Call ascore(reference, response) for single evaluations or abatch_score(inputs) for batch evaluations. Requires the rouge_score package.
Code Reference
| Property | Value |
|---|---|
| Source Location | src/ragas/metrics/collections/_rouge_score.py L1--86 |
| Signature | class RougeScore(BaseMetric) |
| Import | from ragas.metrics.collections import RougeScore |
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| reference | str | Yes | The reference / ground truth text |
| response | str | Yes | The response text to evaluate |
Constructor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | "rouge_score" | Metric name |
| rouge_type | Literal["rouge1", "rougeL"] | "rougeL" | ROUGE variant to compute |
| mode | Literal["fmeasure", "precision", "recall"] | "fmeasure" | Scoring mode |
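As a worked illustration of how the three `mode` values differ, the sketch below computes precision, recall, and F-measure from raw match counts and then picks one by name. The helpers `score_from_counts` and `select_mode` and the `Score` tuple are hypothetical, written for this example only; they are not part of RougeScore or rouge_score.

```python
from collections import namedtuple

# Hypothetical container mirroring the three scoring modes.
Score = namedtuple("Score", ["precision", "recall", "fmeasure"])


def score_from_counts(matches, response_len, reference_len):
    # precision = matches / response tokens, recall = matches / reference tokens
    precision = matches / response_len if response_len else 0.0
    recall = matches / reference_len if reference_len else 0.0
    total = precision + recall
    fmeasure = 2 * precision * recall / total if total else 0.0
    return Score(precision, recall, fmeasure)


def select_mode(score, mode="fmeasure"):
    # Reject anything other than the three documented mode names.
    if mode not in Score._fields:
        raise ValueError(f"mode must be one of {Score._fields}, got {mode!r}")
    return getattr(score, mode)


# 3 matching tokens, a 4-token response, a 6-token reference.
score = score_from_counts(matches=3, response_len=4, reference_len=6)
for mode in Score._fields:
    print(mode, round(select_mode(score, mode), 3))
# precision 0.75
# recall 0.5
# fmeasure 0.6
```

A short response that copies a few reference tokens scores high on precision and low on recall, which is why the harmonic mean (fmeasure) is a sensible default when neither length is controlled.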
Outputs
| Field | Type | Description |
|---|---|---|
| MetricResult.value | float | ROUGE score in range 0.0--1.0 |
Usage Examples
import asyncio

from ragas.metrics.collections import RougeScore


async def main():
    # Default: rougeL with fmeasure
    metric = RougeScore()
    result = await metric.ascore(
        reference="The capital of France is Paris.",
        response="Paris is the capital of France.",
    )
    print(f"ROUGE-L F1: {result.value}")

    # ROUGE-1 with recall mode
    metric_r1 = RougeScore(rouge_type="rouge1", mode="recall")
    result = await metric_r1.ascore(
        reference="The quick brown fox jumps over the lazy dog.",
        response="A quick brown fox jumped over the lazy dog.",
    )
    print(f"ROUGE-1 Recall: {result.value}")

    # Batch evaluation: one dict per (reference, response) pair
    results = await metric.abatch_score([
        {"reference": "Text one.", "response": "Response one."},
        {"reference": "Text two.", "response": "Response two."},
    ])
    print([r.value for r in results])


# Top-level await only works in notebooks; a script needs an event loop.
asyncio.run(main())