Implementation:Explodinggradients Ragas BleuScore Metric

From Leeroopedia


Field         Value
source        Repo
domains       Metrics, NLP
last_updated  2026-02-10

Overview

BleuScore computes the BLEU (Bilingual Evaluation Understudy) n-gram precision score between a reference and a generated response using the sacrebleu library.
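To make "n-gram precision" concrete, the following is a minimal pure-Python sketch of clipped n-gram precision, the building block that BLEU combines across n-gram orders. It is not the sacrebleu implementation: the helper names `ngrams` and `clipped_precision` are hypothetical, tokenization is a plain whitespace split, and the brevity penalty and smoothing that a full BLEU score applies are left out.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(reference, response, n):
    """Fraction of response n-grams that also occur in the reference.

    Each n-gram is credited at most as often as it appears in the
    reference ("clipping"). Whitespace tokenization is an assumption of
    this sketch and is simpler than the tokenizers sacrebleu provides.
    """
    ref_counts = ngrams(reference.split(), n)
    resp_counts = ngrams(response.split(), n)
    total = sum(resp_counts.values())
    if total == 0:
        # The response is too short to contain any n-gram of this order.
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in resp_counts.items())
    return matched / total


reference = "The cat sat on the mat."
response = "The cat is sitting on the mat."
unigram = clipped_precision(reference, response, 1)  # 5 of 7 unigrams match
bigram = clipped_precision(reference, response, 2)   # 3 of 6 bigrams match
```

Clipping is what stops a response from scoring well by repeating one matching word: a response of "the the the" against the reference "the cat" is credited for a single "the", giving a unigram precision of 1/3.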

Description

The BleuScore class evaluates text generation quality by computing the BLEU score, which measures n-gram overlap between a reference and a response. The implementation splits both the reference and the response into sentences on the ". " delimiter, then delegates to sacrebleu.corpus_bleu. Because sacrebleu reports scores on a 0-100 scale, the result is divided by 100 to normalize it to the 0-1 range. This metric does not require an LLM and inherits only from SingleTurnMetric.
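The two steps that surround the sacrebleu call, sentence splitting and rescaling, can be sketched in isolation. This is an illustration under stated assumptions rather than the Ragas source: `split_sentences` and `normalize` are hypothetical helper names, and the call to sacrebleu.corpus_bleu itself is omitted so that the snippet runs with the standard library alone.

```python
def split_sentences(text):
    """Split text on the literal ". " delimiter, as the description states."""
    return text.split(". ")


def normalize(raw_score):
    """Rescale a score reported on a 0-100 scale to the 0-1 range."""
    return raw_score / 100


sentences = split_sentences("The cat sat on the mat. It purred. Then it slept.")
# The delimiter is consumed by the split, so only the last sentence
# keeps its trailing period.

abbreviation = split_sentences("Dr. Smith left.")
# A plain string split cannot tell an abbreviation from a sentence end,
# so "Dr. " is treated as a boundary as well.

normalized = normalize(42.5)  # 42.5 stands in for a raw 0-100 score
```

Splitting on a fixed string is simple and has no extra dependencies, but it is not a real sentence tokenizer: sentences that end in "?" or "!" are not separated, and abbreviations followed by a space are.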

Key attributes:

  • kwargs -- Additional keyword arguments passed to the underlying corpus_bleu function.

Dependency: Requires the sacrebleu package (pip install sacrebleu).

Usage

The metric requires reference and response columns.

Code Reference

Property         Value
Source Location  src/ragas/metrics/_bleu_score.py L11-49
Class Signature  class BleuScore(SingleTurnMetric)
Import           from ragas.metrics import BleuScore

I/O Contract

Inputs

Parameter  Type  Required  Description
reference  str   Yes       The ground truth reference text
response   str   Yes       The generated response to evaluate

Outputs

Output  Type   Description
score   float  BLEU score normalized to the 0.0-1.0 range

Usage Examples

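import asyncio

from ragas.dataset_schema import SingleTurnSample
from ragas.metrics import BleuScore

# BleuScore needs no LLM, so it can be constructed without arguments.
metric = BleuScore()

sample = SingleTurnSample(
    reference="The cat sat on the mat.",
    response="The cat is sitting on the mat.",
)

# single_turn_ascore is a coroutine, so it must run inside an event loop.
score = asyncio.run(metric.single_turn_ascore(sample))
print(score)  # a float in the 0.0-1.0 range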
