Implementation:Online ml River Metrics MacroAverage
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Evaluation_Metrics, Multi_Output_Learning |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
A macro-average wrapper that computes the unweighted arithmetic mean of per-output metric values.
Description
MacroAverage wraps any single-output metric (classification or regression) to work with multi-output problems. It maintains an independent copy of the metric for each output label/target, updates each copy separately, and returns the arithmetic mean of all metric values. This gives equal weight to each output regardless of frequency or support, making it suitable when all outputs are equally important.
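The mechanism described above can be sketched in plain Python without River. This is an illustrative stand-in, not the library's source: `RunningMAE` and `MacroAverageSketch` are hypothetical names, and the per-output copies are assumed to be created lazily the first time an output key is seen.

```python
from collections import defaultdict
from copy import deepcopy
from functools import partial
from statistics import mean


class RunningMAE:
    """Hypothetical single-output metric: weighted running mean absolute error."""

    def __init__(self):
        self.total = 0.0
        self.weight = 0.0

    def update(self, y_true, y_pred, w=1.0):
        self.total += w * abs(y_true - y_pred)
        self.weight += w

    def get(self):
        return self.total / self.weight if self.weight else 0.0


class MacroAverageSketch:
    """Illustrative wrapper (assumption: not River's actual implementation)."""

    def __init__(self, metric):
        # Each new output key gets its own deep copy of the template metric.
        self.metrics = defaultdict(partial(deepcopy, metric))

    def update(self, y_true, y_pred, w=1.0):
        # Every output updates only its own copy.
        for output_id in y_pred:
            self.metrics[output_id].update(y_true[output_id], y_pred[output_id], w)

    def get(self):
        # Unweighted arithmetic mean over the per-output values.
        return mean(m.get() for m in self.metrics.values())


macro = MacroAverageSketch(RunningMAE())
macro.update({"a": 1.0, "b": 10.0}, {"a": 2.0, "b": 10.0})
macro.update({"a": 3.0, "b": 20.0}, {"a": 3.0, "b": 24.0})
print(macro.get())  # 1.25: output "a" has MAE 0.5, output "b" has MAE 2.0
```

Because each output owns a separate copy, a large error on one output never leaks into another output's running value; the outputs are combined only in `get`.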
Usage
Use MacroAverage when every output should contribute equally to the overall score, however often it occurs, so that frequent outputs do not dominate the metric. For example, in multi-label classification with imbalanced label frequencies, MacroAverage ensures that rare labels influence the final score as much as common ones.
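To see why frequency matters, the following sketch compares a macro-averaged F1 with a pooled (micro-averaged) F1 on made-up confusion counts. The counts, the label names, and the `f1` helper are all hypothetical, and the micro average is included only as a contrast; it is not something MacroAverage computes.

```python
def f1(tp, fp, fn):
    """F1 score from raw counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical per-label counts: (true positives, false positives, false negatives)
counts = {
    "common": (90, 5, 5),  # frequent label, predicted well
    "rare": (1, 2, 3),     # infrequent label, predicted poorly
}

# Macro: score each label on its own, then take the plain mean.
per_label = {label: f1(*c) for label, c in counts.items()}
macro_f1 = sum(per_label.values()) / len(per_label)

# Micro (for contrast): pool the counts first, then score once.
tp, fp, fn = (sum(col) for col in zip(*counts.values()))
micro_f1 = f1(tp, fp, fn)

print(f"macro F1: {macro_f1:.3f}")  # macro F1: 0.617
print(f"micro F1: {micro_f1:.3f}")  # micro F1: 0.924
```

The pooled score stays high because the common label supplies almost all of the counts, while the macro score drops noticeably because the rare label carries half of the weight.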
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/metrics/multioutput/macro.py
Signature
class MacroAverage(MultiOutputMetric, metrics.base.WrapperMetric):
    def __init__(self, metric):
        # metric: Any classification or regression metric
        pass
Import
from river import metrics
I/O Contract
| Method | Parameters | Returns | Description |
|---|---|---|---|
| update | y_true (dict), y_pred (dict), w (optional sample weight) | None | Updates the per-output metric copies, one per output key |
| get | - | float | Returns arithmetic mean of all metric values |
Usage Examples
from river import metrics
# Wrap F1 score for multi-output use
macro_f1 = metrics.multioutput.MacroAverage(metrics.F1())
y_true = [
{0: False, 1: True, 2: True},
{0: True, 1: True, 2: False},
{0: True, 1: False, 2: True},
]
y_pred = [
{0: False, 1: True, 2: True}, # All correct
{0: True, 1: False, 2: False}, # Label 1 wrong
{0: False, 1: False, 2: True}, # Label 0 wrong
]
for yt, yp in zip(y_true, y_pred):
    macro_f1.update(yt, yp)
print(macro_f1)
# Prints the mean F1 across all three labels.
# Per-label F1 is 2/3, 2/3 and 1, so the mean is 7/9 (about 0.778).
# Access individual output metrics
print(f"Number of outputs tracked: {len(macro_f1.metrics)}")
for output_id, metric in macro_f1.metrics.items():
    print(f"Output {output_id}: {metric}")
# Works with any metric - example with Precision
macro_precision = metrics.multioutput.MacroAverage(metrics.Precision())
for yt, yp in zip(y_true, y_pred):
    macro_precision.update(yt, yp)
print(macro_precision)
# Prints the mean precision across all outputs (1.0 here, since no label has a false positive)
# Regression example
macro_mae = metrics.multioutput.MacroAverage(metrics.MAE())
y_true_reg = [
{0: 1.0, 1: 2.0, 2: 3.0},
{0: 2.0, 1: 3.0, 2: 4.0},
]
y_pred_reg = [
{0: 1.1, 1: 2.2, 2: 2.9},
{0: 2.1, 1: 2.8, 2: 4.2},
]
for yt, yp in zip(y_true_reg, y_pred_reg):
    macro_mae.update(yt, yp)
print(f"Macro MAE: {macro_mae.get():.3f}")
# Average MAE across all three outputs
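As a sanity check, the macro MAE for the regression data above can be recomputed by hand with the standard library alone. This sketch does not call River; `per_output_mae` and `macro_mae_by_hand` are illustrative names, and the data is copied from the example.

```python
from statistics import mean

y_true_reg = [
    {0: 1.0, 1: 2.0, 2: 3.0},
    {0: 2.0, 1: 3.0, 2: 4.0},
]
y_pred_reg = [
    {0: 1.1, 1: 2.2, 2: 2.9},
    {0: 2.1, 1: 2.8, 2: 4.2},
]

# Mean absolute error of each output, computed independently.
per_output_mae = {
    key: mean(abs(yt[key] - yp[key]) for yt, yp in zip(y_true_reg, y_pred_reg))
    for key in y_true_reg[0]
}

# Macro average: plain mean of the per-output values (about 0.10, 0.20 and 0.15).
macro_mae_by_hand = mean(per_output_mae.values())
print(f"Macro MAE by hand: {macro_mae_by_hand:.3f}")  # Macro MAE by hand: 0.150
```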