Metric Get Correlation

Metric Get Correlation implements the Metric Baseline Correlation principle in the Ragas evaluation toolkit. The get_correlation() method is defined on both DiscreteMetric and NumericMetric, each using a statistical measure appropriate to its output type.

Source Locations

  • DiscreteMetric.get_correlation: src/ragas/metrics/discrete.py, Lines 75-89
  • NumericMetric.get_correlation: src/ragas/metrics/numeric.py, Lines 73-93

Import

from ragas.metrics import DiscreteMetric
from ragas.metrics.numeric import NumericMetric

DiscreteMetric.get_correlation

Signature (Lines 75-89)

def get_correlation(
    self,
    gold_labels: List[str],
    predictions: List[str],
) -> float

Parameters

Parameter     Type       Description
------------  ---------  ----------------------------------------------------------
gold_labels   List[str]  Human-annotated ground-truth labels (categorical strings).
predictions   List[str]  Metric-predicted labels (categorical strings).

Return Value

Returns a float representing Cohen's Kappa score, ranging from -1 (complete disagreement) to 1 (perfect agreement), with 0 indicating chance-level agreement.
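
For reference, the underlying statistic is the standard chance-corrected agreement coefficient (general background, not quoted from the Ragas source):

    \kappa = \frac{p_o - p_e}{1 - p_e}

where p_o is the observed agreement rate between the two label lists and p_e is the agreement expected by chance from the marginal label frequencies.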

Implementation

def get_correlation(
    self, gold_labels: t.List[str], predictions: t.List[str]
) -> float:
    try:
        from sklearn.metrics import cohen_kappa_score
    except ImportError:
        raise ImportError(
            "scikit-learn is required for correlation calculation. "
            "Please install it with `pip install scikit-learn`."
        )
    return cohen_kappa_score(gold_labels, predictions)

Uses scikit-learn's cohen_kappa_score, which:

  • Computes the observed agreement and expected chance agreement.
  • Returns the chance-corrected agreement coefficient.
  • Handles multi-class categorical labels (not just binary).
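
To make the chance correction concrete, here is an illustrative sketch (not part of Ragas) that reproduces the statistic by hand and checks it against scikit-learn:

from collections import Counter

from sklearn.metrics import cohen_kappa_score

gold = ["pass", "fail", "pass", "pass", "fail"]
preds = ["pass", "fail", "fail", "pass", "fail"]

n = len(gold)
# Observed agreement: fraction of positions where both lists match.
p_o = sum(g == p for g, p in zip(gold, preds)) / n
# Chance agreement: summed products of each label's marginal frequencies.
gold_freq, pred_freq = Counter(gold), Counter(preds)
p_e = sum((gold_freq[lbl] / n) * (pred_freq[lbl] / n) for lbl in set(gold) | set(preds))

kappa = (p_o - p_e) / (1 - p_e)
print(round(kappa, 3))                            # 0.615
print(round(cohen_kappa_score(gold, preds), 3))   # 0.615 (matches)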

External Dependency

Requires scikit-learn. If scikit-learn is not installed, the method raises ImportError with installation instructions.

Usage Example

from ragas.metrics import DiscreteMetric

metric = DiscreteMetric(
    name="quality_check",
    prompt="Check quality: {response}",
    allowed_values=["pass", "fail"],
)

gold = ["pass", "fail", "pass", "pass", "fail"]
preds = ["pass", "fail", "fail", "pass", "fail"]

kappa = metric.get_correlation(gold_labels=gold, predictions=preds)
print(f"Cohen's Kappa: {kappa:.3f}")
# Prints "Cohen's Kappa: 0.615"; positive kappa indicates above-chance agreement

NumericMetric.get_correlation

Signature (Lines 73-93)

def get_correlation(
    self,
    gold_labels: List[str],
    predictions: List[str],
) -> float

Parameters

Parameter     Type       Description
------------  ---------  ----------------------------------------------------------------------------
gold_labels   List[str]  Human-annotated ground-truth scores (as strings, converted to floats internally).
predictions   List[str]  Metric-predicted scores (as strings, converted to floats internally).

Return Value

Returns a float representing the Pearson correlation coefficient, ranging from -1 (perfect negative correlation) to 1 (perfect positive correlation), with 0 indicating no linear correlation.
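
For reference, the coefficient for paired scores x_i (gold) and y_i (predicted) follows the standard definition (general background, not quoted from the Ragas source):

    r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}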

Implementation

def get_correlation(
    self, gold_labels: t.List[str], predictions: t.List[str]
) -> float:
    try:
        from scipy.stats import pearsonr
    except ImportError:
        raise ImportError(
            "scipy is required for correlation calculation. "
            "Please install it with `pip install scipy`."
        )
    # Convert strings to floats for correlation calculation
    gold_floats = [float(x) for x in gold_labels]
    pred_floats = [float(x) for x in predictions]
    result = pearsonr(gold_floats, pred_floats)
    # pearsonr returns (correlation, p-value) tuple
    correlation = t.cast(float, result[0])
    return correlation

Key details:

  • Converts string inputs to floats before computing correlation.
  • Uses scipy.stats.pearsonr, which returns a tuple of (correlation, p-value); only the correlation coefficient is returned.
  • The p-value (indicating statistical significance) is discarded.
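
If the discarded p-value matters for your analysis, you can call scipy.stats.pearsonr on the converted scores yourself; a minimal sketch (illustrative data, not from the Ragas source):

from scipy.stats import pearsonr

gold_floats = [0.9, 0.3, 0.7, 0.5, 0.8]
pred_floats = [0.85, 0.4, 0.65, 0.55, 0.75]

# pearsonr's result unpacks as a (correlation, p-value) pair.
r, p = pearsonr(gold_floats, pred_floats)
print(f"r = {r:.3f}, p = {p:.4f}")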

External Dependency

Requires scipy. If scipy is not installed, the method raises ImportError with installation instructions.

Usage Example

from ragas.metrics.numeric import NumericMetric

metric = NumericMetric(
    name="relevance_score",
    prompt="Rate relevance of {response} to {user_input}",
    allowed_values=(0.0, 1.0),
)

gold = ["0.9", "0.3", "0.7", "0.5", "0.8"]
preds = ["0.85", "0.4", "0.65", "0.55", "0.75"]

r = metric.get_correlation(gold_labels=gold, predictions=preds)
print(f"Pearson r: {r:.3f}")
# Prints "Pearson r: 0.993"; high positive r indicates strong linear agreement

Comparison of Methods

Aspect               DiscreteMetric                  NumericMetric
-------------------  ------------------------------  -------------------------------------
Statistical measure  Cohen's Kappa                   Pearson correlation
Input type           Categorical strings             Numeric strings (converted to float)
Range                [-1, 1]                         [-1, 1]
Chance correction    Yes                             N/A (measures linear relationship)
External dependency  scikit-learn                    scipy
Best for             Pass/fail, multi-class labels   Continuous scores (e.g., 0.0-1.0)

Class Context

Both DiscreteMetric and NumericMetric inherit from SimpleLLMMetric:

SimpleLLMMetric
  ├── DiscreteMetric (+ DiscreteValidator)
  │     └── get_correlation() -> Cohen's Kappa
  └── NumericMetric (+ NumericValidator)
        └── get_correlation() -> Pearson r

The get_correlation method is not abstract in the parent class; it is defined independently on each subclass with the appropriate statistical measure.
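
A quick sanity check of this layout (a sketch, assuming the source locations listed above are current):

from ragas.metrics import DiscreteMetric
from ragas.metrics.numeric import NumericMetric

# Each class defines get_correlation in its own body, so the two
# attributes are distinct function objects rather than one shared method.
assert "get_correlation" in DiscreteMetric.__dict__
assert "get_correlation" in NumericMetric.__dict__
assert DiscreteMetric.get_correlation is not NumericMetric.get_correlation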
