Implementation:Vibrantlabsai Ragas ChrfScoreV2

From Leeroopedia
Domains: Evaluation, Metrics
Last Updated: 2026-02-12 00:00 GMT

Overview

Calculates the CHRF (Character F-score) between a reference and response text, providing a character n-gram based evaluation metric that correlates well with human judgments for text quality.

Description

The CHRFScore metric computes the Character F-score (CHRF) between a reference text and a response text. Unlike BLEU, which operates on word-level n-grams, CHRF operates on character-level n-grams. This makes it more robust to morphological variation and better suited to morphologically rich languages.
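To make the character n-gram idea concrete, here is a minimal pure-Python sketch. It is a simplified illustration, not sacrebleu's exact algorithm: the names char_ngrams and simple_chrf are hypothetical, whitespace is stripped before counting, and precision and recall are averaged over n-gram orders before being combined into an F-beta score.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of length n, ignoring whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(reference: str, response: str,
                max_order: int = 6, beta: float = 2.0) -> float:
    """Simplified character F-score in [0, 1] (illustrative only)."""
    precisions, recalls = [], []
    for n in range(1, max_order + 1):
        ref, hyp = char_ngrams(reference, n), char_ngrams(response, n)
        if not ref or not hyp:
            continue  # this order is longer than one of the texts
        overlap = sum((ref & hyp).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision.
    return (1 + beta**2) * p * r / (beta**2 * p + r)


# "walked" and "walking" share the stem "walk", so they get partial credit
# here, whereas an exact word-level match would give none.
print(simple_chrf("walked home", "walking home"))
```

Because matches are counted on characters, inflected forms of the same word still overlap, which is the robustness to morphological variation described above.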

The implementation delegates to the sacrebleu library's corpus_chrf function for consistent and reproducible scoring. The raw sacrebleu score (on a 0-100 scale) is normalized to the 0.0 to 1.0 range by dividing by 100.

The metric handles several edge cases gracefully:

  • If either input is not a string, it returns 0.0 with an explanatory reason.
  • If either input is empty or contains only whitespace, it returns 0.0 with an explanatory reason.
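The edge-case handling listed above can be sketched as a small guard function. This is an assumption about the shape of the checks, not the library's code: validate_inputs is a hypothetical helper and the reason strings are illustrative.

```python
from typing import Optional, Tuple


def validate_inputs(reference: object, response: object) -> Optional[Tuple[float, str]]:
    """Return (0.0, reason) when the inputs cannot be scored, else None.

    Hypothetical helper; the real metric's reason messages may differ.
    """
    if not isinstance(reference, str) or not isinstance(response, str):
        return 0.0, "reference and response must both be strings"
    if not reference.strip() or not response.strip():
        return 0.0, "reference and response must both be non-empty"
    return None


print(validate_inputs(None, "Paris"))      # non-string input
print(validate_inputs("   ", "Paris"))     # whitespace-only input
print(validate_inputs("Paris", "Paris"))   # valid pair, returns None
```

Returning a zero score with a reason, rather than raising, keeps a batch evaluation running when a single record is malformed.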

Additional sacrebleu parameters (such as char_order, word_order, beta, eps_smoothing) can be passed via the kwargs constructor parameter for fine-grained control over the scoring behavior.
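The delegation, normalization, and kwargs forwarding described in this section might look roughly like the sketch below. The function normalized_chrf and its scorer argument are hypothetical and exist only so the example can be run with a stand-in; when no scorer is passed it falls back to sacrebleu.corpus_chrf, which takes a list of hypotheses and a list of reference lists and returns an object with a score attribute.

```python
from typing import Any, Callable, Dict, Optional


def normalized_chrf(
    reference: str,
    response: str,
    kwargs: Optional[Dict[str, Any]] = None,
    scorer: Optional[Callable[..., Any]] = None,
) -> float:
    """Score one pair and rescale the 0-100 result to 0.0-1.0 (sketch)."""
    if scorer is None:
        # Imported lazily so sacrebleu is only needed for real scoring.
        import sacrebleu

        scorer = sacrebleu.corpus_chrf
    # One hypothesis, one reference stream; extra options pass straight through.
    result = scorer([response], [[reference]], **(kwargs or {}))
    return result.score / 100
```

With sacrebleu installed, a call such as normalized_chrf("Hello world.", "Hi world.", {"word_order": 2}) would forward word_order to corpus_chrf and return a value in the 0.0 to 1.0 range.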

This metric does not require an LLM or embedding model, making it fast and deterministic.

Usage

Use CHRFScore when you need a character-level evaluation metric for comparing generated text against reference text. It is particularly useful for machine translation evaluation, text summarization, and any scenario where morphological variation matters. It provides a more fine-grained comparison than word-level metrics like BLEU.

This is the V2 collections version, providing automatic validation and a consistent async API via the BaseMetric base class.

Code Reference

Source Location

Signature

class CHRFScore(BaseMetric):
    def __init__(
        self,
        name: str = "chrf_score",
        kwargs: t.Optional[t.Dict[str, t.Any]] = None,
        **base_kwargs,
    ): ...

    async def ascore(self, reference: str, response: str) -> MetricResult: ...

Import

from ragas.metrics.collections import CHRFScore

I/O Contract

Constructor Parameters

Name | Type | Required | Description
name | str | No | Metric name (default: "chrf_score")
kwargs | Dict[str, Any] or None | No | Additional arguments passed to sacrebleu.corpus_chrf (e.g., char_order, word_order, beta, eps_smoothing)

Inputs

Name | Type | Required | Description
reference | str | Yes | The reference/ground truth text
response | str | Yes | The response/hypothesis text to evaluate

Outputs

Name | Type | Description
score | MetricResult (float value) | CHRF score between 0.0 and 1.0. Higher indicates greater character n-gram overlap. May include a reason string if input validation fails.

Usage Examples

Basic Usage

from ragas.metrics.collections import CHRFScore

metric = CHRFScore()

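# Top-level await works in a notebook or async REPL; in a script,
# call this from an async function started with asyncio.run().
result = await metric.ascore(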
    reference="The capital of France is Paris.",
    response="Paris is the capital of France."
)
print(f"CHRF Score: {result.value}")

With Custom sacrebleu Parameters

from ragas.metrics.collections import CHRFScore

# word_order=2 adds word bigrams (the chrF++ variant);
# char_order=6 and beta=2 match sacrebleu's defaults
metric = CHRFScore(kwargs={"char_order": 6, "word_order": 2, "beta": 2})

result = await metric.ascore(
    reference="Albert Einstein was born in 1879.",
    response="Einstein was born in the year 1879."
)
print(f"CHRF Score: {result.value}")

Batch Scoring

from ragas.metrics.collections import CHRFScore

metric = CHRFScore()

results = await metric.abatch_score([
    {"reference": "The cat sat on the mat.", "response": "A cat was sitting on a mat."},
    {"reference": "Hello world.", "response": "Hi world."},
])
for r in results:
    print(f"Score: {r.value}")
