Implementation:Explodinggradients Ragas Metric Get Correlation
Metric Get Correlation
Metric Get Correlation implements the Metric Baseline Correlation principle in the Ragas evaluation toolkit. The get_correlation() method is defined on both DiscreteMetric and NumericMetric, each using a statistical measure appropriate to its output type.
Source Locations
- DiscreteMetric.get_correlation: src/ragas/metrics/discrete.py, Lines 75-89
- NumericMetric.get_correlation: src/ragas/metrics/numeric.py, Lines 73-93
Import
from ragas.metrics import DiscreteMetric
from ragas.metrics.numeric import NumericMetric
DiscreteMetric.get_correlation
Signature (Lines 75-89)
def get_correlation(
    self,
    gold_labels: List[str],
    predictions: List[str],
) -> float
Parameters
| Parameter | Type | Description |
|---|---|---|
| gold_labels | List[str] | Human-annotated ground-truth labels (categorical strings). |
| predictions | List[str] | Metric-predicted labels (categorical strings). |
Return Value
Returns a float representing Cohen's Kappa score, ranging from -1 (complete disagreement) to 1 (perfect agreement), with 0 indicating chance-level agreement.
Implementation
def get_correlation(
    self, gold_labels: t.List[str], predictions: t.List[str]
) -> float:
    try:
        from sklearn.metrics import cohen_kappa_score
    except ImportError:
        raise ImportError(
            "scikit-learn is required for correlation calculation. "
            "Please install it with `pip install scikit-learn`."
        )
    return cohen_kappa_score(gold_labels, predictions)
Uses scikit-learn's cohen_kappa_score, which:
- Computes the observed agreement and expected chance agreement.
- Returns the chance-corrected agreement coefficient.
- Handles multi-class categorical labels (not just binary).
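The chance correction can be made concrete with a small pure-Python sketch of what cohen_kappa_score computes (illustrative only; the actual method delegates to scikit-learn):

```python
from collections import Counter

def cohen_kappa(gold, preds):
    # Observed agreement: fraction of exact label matches.
    n = len(gold)
    p_o = sum(g == p for g, p in zip(gold, preds)) / n
    # Expected chance agreement: product of the two raters'
    # marginal label frequencies, summed over all labels.
    gold_freq = Counter(gold)
    pred_freq = Counter(preds)
    p_e = sum(gold_freq[label] * pred_freq[label] for label in gold_freq) / (n * n)
    # Chance-corrected agreement coefficient.
    return (p_o - p_e) / (1 - p_e)

gold = ["pass", "fail", "pass", "pass", "fail"]
preds = ["pass", "fail", "fail", "pass", "fail"]
print(round(cohen_kappa(gold, preds), 3))  # 0.615
```

Here observed agreement is 4/5 = 0.8, chance agreement is 0.48, giving kappa = (0.8 - 0.48) / (1 - 0.48) ≈ 0.615, i.e. agreement well above chance.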
External Dependency
Requires scikit-learn. If not installed, raises ImportError with installation instructions.
Usage Example
from ragas.metrics import DiscreteMetric

metric = DiscreteMetric(
    name="quality_check",
    prompt="Check quality: {response}",
    allowed_values=["pass", "fail"],
)

gold = ["pass", "fail", "pass", "pass", "fail"]
preds = ["pass", "fail", "fail", "pass", "fail"]

kappa = metric.get_correlation(gold_labels=gold, predictions=preds)
print(f"Cohen's Kappa: {kappa:.3f}")
# Positive kappa indicates above-chance agreement
NumericMetric.get_correlation
Signature (Lines 73-93)
def get_correlation(
    self,
    gold_labels: List[str],
    predictions: List[str],
) -> float
Parameters
| Parameter | Type | Description |
|---|---|---|
| gold_labels | List[str] | Human-annotated ground-truth scores (as strings, converted to floats internally). |
| predictions | List[str] | Metric-predicted scores (as strings, converted to floats internally). |
Return Value
Returns a float representing the Pearson correlation coefficient, ranging from -1 (perfect negative correlation) to 1 (perfect positive correlation), with 0 indicating no linear correlation.
Implementation
def get_correlation(
    self, gold_labels: t.List[str], predictions: t.List[str]
) -> float:
    try:
        from scipy.stats import pearsonr
    except ImportError:
        raise ImportError(
            "scipy is required for correlation calculation. "
            "Please install it with `pip install scipy`."
        )
    # Convert strings to floats for correlation calculation
    gold_floats = [float(x) for x in gold_labels]
    pred_floats = [float(x) for x in predictions]
    result = pearsonr(gold_floats, pred_floats)
    # pearsonr returns (correlation, p-value) tuple
    correlation = t.cast(float, result[0])
    return correlation
Key details:
- Converts string inputs to floats before computing correlation.
- Uses scipy.stats.pearsonr, which returns a tuple of (correlation, p-value); only the correlation coefficient is returned.
- The p-value (indicating statistical significance) is discarded.
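As a sanity check, the string-to-float conversion and the coefficient itself can be reproduced in pure Python (a sketch of the Pearson formula, not the scipy call):

```python
import math

def pearson_r(gold_labels, pred_labels):
    # Mirror the method's behaviour: convert strings to floats first.
    xs = [float(x) for x in gold_labels]
    ys = [float(y) for y in pred_labels]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Covariance of deviations, normalized by both standard deviations.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

gold = ["0.9", "0.3", "0.7", "0.5", "0.8"]
preds = ["0.85", "0.4", "0.65", "0.55", "0.75"]
print(round(pearson_r(gold, preds), 3))  # roughly 0.99: strong linear agreement
```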
External Dependency
Requires scipy. If not installed, raises ImportError with installation instructions.
Usage Example
from ragas.metrics.numeric import NumericMetric

metric = NumericMetric(
    name="relevance_score",
    prompt="Rate relevance of {response} to {user_input}",
    allowed_values=(0.0, 1.0),
)

gold = ["0.9", "0.3", "0.7", "0.5", "0.8"]
preds = ["0.85", "0.4", "0.65", "0.55", "0.75"]

r = metric.get_correlation(gold_labels=gold, predictions=preds)
print(f"Pearson r: {r:.3f}")
# High positive r indicates strong linear agreement
Comparison of Methods
| Aspect | DiscreteMetric | NumericMetric |
|---|---|---|
| Statistical measure | Cohen's Kappa | Pearson correlation |
| Input type | Categorical strings | Numeric strings (converted to float) |
| Range | [-1, 1] | [-1, 1] |
| Chance correction | Yes | N/A (measures linear relationship) |
| External dependency | scikit-learn | scipy |
| Best for | Pass/fail, multi-class labels | Continuous scores (e.g., 0.0-1.0) |
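The "Input type" row suggests a simple heuristic for choosing between the two measures. A hypothetical dispatcher (not part of Ragas) could pick one by attempting a numeric conversion:

```python
def pick_measure(gold, preds):
    # Hypothetical helper mirroring the table above: numeric strings
    # suggest Pearson r, anything else Cohen's Kappa.
    try:
        [float(x) for x in gold + preds]
        return "pearson"
    except ValueError:
        return "cohen_kappa"

print(pick_measure(["0.9", "0.3"], ["0.8", "0.4"]))      # pearson
print(pick_measure(["pass", "fail"], ["pass", "pass"]))  # cohen_kappa
```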
Class Context
Both DiscreteMetric and NumericMetric inherit from SimpleLLMMetric:
SimpleLLMMetric
├── DiscreteMetric (+ DiscreteValidator)
│ └── get_correlation() -> Cohen's Kappa
└── NumericMetric (+ NumericValidator)
└── get_correlation() -> Pearson r
The get_correlation method is not abstract in the parent class; it is defined independently on each subclass with the appropriate statistical measure.
Implements
- Metric Baseline Correlation -- The principle both get_correlation implementations realize (see the introduction above).
See Also
- Loss Classes -- Complementary measure used during optimization itself.
- MetricAnnotation Class -- Provides the human labels for correlation measurement.
- GeneticOptimizer Class -- Optimizer whose output quality is measured by correlation.
- Environment:Explodinggradients_Ragas_Optional_Metrics_Environment