Principle:Explodinggradients Ragas Optimization Loss Functions

Optimization Loss Functions

Optimization Loss Functions is a principle in the Ragas evaluation toolkit that defines how the alignment between metric predictions and human judgments is quantified during prompt optimization.

Motivation

When optimizing evaluation metric prompts (via Genetic Prompt Optimization or DSPy Prompt Optimization), the optimizer needs a numerical signal to distinguish better candidates from worse ones. Loss functions serve as this signal by comparing the metric's output for each sample against the human-annotated ground-truth label. A lower loss (or equivalently, higher fitness) indicates that the metric's prompt is producing outputs that more closely match human judgment.

Theoretical Foundation

Role of Loss in Optimization

In the context of prompt optimization, a loss function $L(y_{pred}, y_{true})$ takes two vectors:

$y_{pred}$ -- The metric's predicted labels for each sample in the annotated dataset.
$y_{true}$ -- The human-annotated ground-truth labels.

The loss function returns a scalar value that quantifies the disagreement between these vectors. The optimizer's goal is to find the prompt instructions that minimize this loss (or, in the case of accuracy-based losses, maximize the return value).
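As a rough illustration of how a scalar score separates candidates, the sketch below ranks two hypothetical prompts by how often their outputs match the annotations. The names `exact_match_rate`, `pick_best`, `prompt_a`, and `prompt_b` are invented for this example and are not part of Ragas; the `higher_is_better` flag is an assumption used to make the minimize-versus-maximize direction explicit.

```python
from typing import Callable, Dict, Sequence


def exact_match_rate(y_pred: Sequence[str], y_true: Sequence[str]) -> float:
    """Fraction of predictions equal to the annotated label (higher is better)."""
    if len(y_pred) != len(y_true) or not y_true:
        raise ValueError("y_pred and y_true must be non-empty and the same length")
    return sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)


def pick_best(
    candidate_outputs: Dict[str, Sequence[str]],
    y_true: Sequence[str],
    score_fn: Callable[[Sequence[str], Sequence[str]], float],
    higher_is_better: bool,
) -> str:
    """Return the name of the candidate whose outputs score best against y_true."""
    scores = {name: score_fn(preds, y_true) for name, preds in candidate_outputs.items()}
    chooser = max if higher_is_better else min
    return chooser(scores, key=scores.get)


# Hypothetical annotations and the outputs two candidate prompts produced for them.
y_true = ["pass", "fail", "pass", "pass"]
candidate_outputs = {
    "prompt_a": ["pass", "pass", "pass", "fail"],  # 2 of 4 match
    "prompt_b": ["pass", "fail", "pass", "fail"],  # 3 of 4 match
}

best = pick_best(candidate_outputs, y_true, exact_match_rate, higher_is_better=True)
print(best)
```

For an error-style score such as MSE, the same helper would be called with `higher_is_better=False`.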

Discrete vs. Continuous Metrics

Different types of metrics require different loss functions:

Discrete/binary metrics produce categorical outputs such as "pass" or "fail". For these, classification-oriented loss functions like accuracy or F1-score are appropriate. Accuracy measures the proportion of correct categorical predictions, while F1-score combines precision and recall.
Numeric/continuous metrics produce real-valued scores within a range. For these, regression-oriented loss functions like Mean Squared Error (MSE) measure how far the predicted values deviate from the ground truth.

BinaryMetricLoss

BinaryMetricLoss supports two reduction modes:

Accuracy -- The fraction of predictions that exactly match the ground-truth labels: $\text{accuracy} = \frac{1}{n} \sum_{i = 1}^{n} \mathbf{1}(y_{pred, i} = y_{true, i})$. Higher values indicate better alignment.
F1-score -- The harmonic mean of precision and recall, which balances false positives and false negatives: $F_{1} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$. This is particularly useful when the class distribution in the annotations is imbalanced.
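The two formulas can be worked through on a small imbalanced example. This is a minimal sketch written from the definitions above, not the Ragas source; it assumes labels are encoded as 1 (positive) and 0 (negative), and the function names `accuracy` and `f1_score` are chosen for illustration only.

```python
from typing import Sequence


def accuracy(y_pred: Sequence[int], y_true: Sequence[int]) -> float:
    """Fraction of positions where the prediction equals the label."""
    if len(y_pred) != len(y_true) or not y_true:
        raise ValueError("y_pred and y_true must be non-empty and the same length")
    return sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)


def f1_score(y_pred: Sequence[int], y_true: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the given positive class."""
    if len(y_pred) != len(y_true):
        raise ValueError("y_pred and y_true must be the same length")
    pairs = list(zip(y_pred, y_true))
    tp = sum(p == positive and t == positive for p, t in pairs)
    fp = sum(p == positive and t != positive for p, t in pairs)
    fn = sum(p != positive and t == positive for p, t in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0  # no true positives, so F1 is defined as 0 here
    return 2 * precision * recall / (precision + recall)


# Imbalanced annotations: only 2 of 8 samples are positive.
y_true = [1, 0, 0, 0, 0, 0, 0, 1]
always_negative = [0] * 8
one_hit = [0, 0, 0, 0, 0, 0, 0, 1]

print(accuracy(always_negative, y_true), f1_score(always_negative, y_true))
print(accuracy(one_hit, y_true), f1_score(one_hit, y_true))
```

A predictor that always answers negative still reaches 0.75 accuracy here while its F1 is 0, which is why F1 is the safer signal when one class dominates the annotations.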

MSELoss

Mean Squared Error computes the average squared difference between predicted and actual values: $\text{MSE} = \frac{1}{n} \sum_{i = 1}^{n} (y_{pred, i} - y_{true, i})^{2}$. MSELoss also supports a "sum" reduction that returns the total squared error without averaging. Lower MSE values indicate predictions closer to the ground truth.
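The difference between the two reductions can be seen in a short sketch. It is written from the formula above rather than taken from the Ragas implementation, and the function name `squared_error` is an assumption made for this example.

```python
from typing import Sequence


def squared_error(
    y_pred: Sequence[float], y_true: Sequence[float], reduction: str = "mean"
) -> float:
    """Squared error between two equal-length vectors, averaged or summed."""
    if reduction not in ("mean", "sum"):
        raise ValueError("reduction must be 'mean' or 'sum'")
    if len(y_pred) != len(y_true) or not y_true:
        raise ValueError("y_pred and y_true must be non-empty and the same length")
    total = sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    return total / len(y_true) if reduction == "mean" else total


# Hypothetical metric scores and human-annotated scores for four samples.
y_pred = [0.5, 1.0, 0.0, 0.75]
y_true = [1.0, 1.0, 0.5, 0.75]

print(squared_error(y_pred, y_true, reduction="mean"))
print(squared_error(y_pred, y_true, reduction="sum"))
```

The "sum" value grows with the number of annotated samples, so "mean" is the easier one to compare across datasets of different sizes.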

Abstract Interface

All loss functions in Ragas inherit from an abstract base class Loss that enforces a callable interface. This allows the optimizer to use any loss function interchangeably. The optimizer calls the loss as a function: loss_fn(y_pred, y_true).
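The callable-interface idea can be sketched with Python's `abc` module. The classes below (`LossSketch`, `ExactMatchSketch`, `MeanSquaredSketch`) and the helper `score` are illustrative stand-ins, not the actual Ragas classes; the only detail taken from the description above is that a loss is invoked as `loss_fn(y_pred, y_true)`.

```python
from abc import ABC, abstractmethod
from typing import Sequence


class LossSketch(ABC):
    """Illustrative base class: subclasses must be callable on two vectors."""

    @abstractmethod
    def __call__(self, y_pred: Sequence, y_true: Sequence) -> float:
        ...


class ExactMatchSketch(LossSketch):
    """Fraction of exact matches (higher is better)."""

    def __call__(self, y_pred: Sequence, y_true: Sequence) -> float:
        return sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)


class MeanSquaredSketch(LossSketch):
    """Average squared difference (lower is better)."""

    def __call__(self, y_pred: Sequence, y_true: Sequence) -> float:
        return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)


def score(loss_fn: LossSketch, y_pred: Sequence, y_true: Sequence) -> float:
    """Caller code only relies on the shared call signature."""
    return loss_fn(y_pred, y_true)


print(score(ExactMatchSketch(), ["pass", "fail"], ["pass", "pass"]))
print(score(MeanSquaredSketch(), [0.0, 1.0], [1.0, 1.0]))
```

Because `score` depends only on the call signature, swapping one loss for another requires no change to the calling code.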

Choosing a Loss Function

The choice of loss function should match the metric's output type:

Metric Output Type	Recommended Loss	Rationale
Binary / Discrete	`BinaryMetricLoss(metric="accuracy")`	Straightforward for balanced datasets.
Binary / Discrete (imbalanced)	`BinaryMetricLoss(metric="f1_score")`	Accounts for class imbalance.
Numeric / Continuous	`MSELoss(reduction="mean")`	Standard regression loss.

Relationship to Optimizers

Both the GeneticOptimizer and DSPyOptimizer accept a Loss instance as a required parameter. The genetic optimizer calls the loss function in its evaluate_fitness stage to score candidates. The DSPy optimizer wraps the loss function into a DSPy-compatible metric for MIPROv2.
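One way to picture the wrapping step is shown below. This is a sketch under stated assumptions and does not reproduce the Ragas adapter: DSPy metrics are conventionally callables of the form `metric(example, pred, trace=None)` where a higher return value is better, while the helper `as_example_metric`, the `field` name, and the sign flip for error-style losses are all hypothetical choices made for this example. `SimpleNamespace` objects stand in for DSPy examples and predictions so the snippet runs without DSPy installed.

```python
from types import SimpleNamespace
from typing import Callable, Sequence


def squared_error_mean(y_pred: Sequence[float], y_true: Sequence[float]) -> float:
    """Vector-style loss: average squared difference (lower is better)."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)


def as_example_metric(
    loss_fn: Callable[[Sequence, Sequence], float],
    higher_is_better: bool,
    field: str,
) -> Callable:
    """Adapt a vector loss into a per-example metric where higher means better."""

    def metric(example, pred, trace=None) -> float:
        # Score a single sample by passing one-element vectors to the loss.
        value = loss_fn([getattr(pred, field)], [getattr(example, field)])
        # Negate error-style losses so that a larger return value is better.
        return value if higher_is_better else -value

    return metric


# Stand-ins for one annotated example and one prediction (hypothetical field name).
example = SimpleNamespace(score=1.0)
pred = SimpleNamespace(score=0.5)

metric = as_example_metric(squared_error_mean, higher_is_better=False, field="score")
value = metric(example, pred)
print(value)
```

An accuracy-style loss would be wrapped with `higher_is_better=True`, leaving its value unchanged.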

Implemented By

Implementation: Loss Classes
