Implementation:Online ml River Metrics MutualInfo
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Evaluation_Metrics, Clustering |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Mutual information based metrics for measuring agreement between two clusterings or label assignments.
Description
This module provides three entropy-based metrics: MutualInfo (raw mutual information between two clusterings), NormalizedMutualInfo (MI divided by a generalized mean of the two label entropies, giving a score in the [0, 1] range), and AdjustedMutualInfo (MI adjusted for chance agreement). All three are symmetric in their two arguments and invariant to relabeling (permuting) the cluster ids, which makes them useful for comparing clustering results or measuring label assignment agreement when ground truth may be unknown.
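For orientation, the three quantities can be written out under the standard textbook definitions. This is a sketch: the symbols are assumptions introduced here and do not appear in the module. $n_{ij}$ is the number of samples with true label $i$ and predicted label $j$, $a_i$ and $b_j$ are the row and column totals, $n$ is the sample count, and $\operatorname{mean}$ is whichever generalized mean `average_method` selects. This page does not confirm that River matches these forms term by term.

```latex
\begin{aligned}
H(U) &= -\sum_{i} \frac{a_i}{n} \log \frac{a_i}{n}, \qquad
H(V) = -\sum_{j} \frac{b_j}{n} \log \frac{b_j}{n} \\
\mathrm{MI}(U, V) &= \sum_{i} \sum_{j} \frac{n_{ij}}{n} \log \frac{n \, n_{ij}}{a_i \, b_j} \\
\mathrm{NMI}(U, V) &= \frac{\mathrm{MI}(U, V)}{\operatorname{mean}\bigl(H(U), H(V)\bigr)} \\
\mathrm{AMI}(U, V) &= \frac{\mathrm{MI}(U, V) - \mathbb{E}[\mathrm{MI}(U, V)]}{\operatorname{mean}\bigl(H(U), H(V)\bigr) - \mathbb{E}[\mathrm{MI}(U, V)]}
\end{aligned}
```

With natural logarithms, the six-sample data used in the usage examples gives $\mathrm{MI} = \tfrac{2}{3}\ln 2 \approx 0.462098$, which agrees with the final value printed there.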
Usage
Use the MutualInfo metrics to evaluate clustering algorithms or compare label assignments. MutualInfo reports the unnormalized score; its upper bound depends on the entropies of the two labelings, so raw values are hard to compare across datasets. NormalizedMutualInfo scales results between 0 (no mutual information) and 1 (perfect agreement). AdjustedMutualInfo corrects for chance agreement and is preferred when the clusterings could agree by chance alone, for example when comparing results with different numbers of clusters. Because all three are symmetric, none requires knowing which clustering is "true."
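The symmetry and relabeling invariance mentioned above can be checked with a small standalone sketch. The `mutual_info` helper below is hypothetical, written only for illustration. It is a batch computation using natural logarithms and is not River's implementation.

```python
import math
from collections import Counter


def mutual_info(labels_a, labels_b):
    """Mutual information (natural log) between two equal-length label sequences.

    Hypothetical helper for illustration only; not part of River.
    """
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    mi = 0.0
    for (a, b), n_ab in joint.items():
        # p(a, b) * log(p(a, b) / (p(a) * p(b))), expressed with raw counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 1, 2, 2, 2]

base = mutual_info(y_true, y_pred)
# Swapping the two arguments leaves the score unchanged (symmetry).
swapped = mutual_info(y_pred, y_true)
# Renaming the predicted cluster ids leaves the score unchanged as well.
renamed = mutual_info(y_true, ["b" if p == 1 else "a" for p in y_pred])
```

Here `base`, `swapped`, and `renamed` agree up to floating-point rounding, which is why neither labeling has to be designated as the reference.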
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/metrics/mutual_info.py
Signature
class MutualInfo(metrics.base.MultiClassMetric):
    def __init__(self, cm=None):
        pass
class NormalizedMutualInfo(metrics.base.MultiClassMetric):
    def __init__(self, cm=None, average_method="arithmetic"):
        # average_method: 'min', 'max', 'arithmetic', 'geometric'
        pass
class AdjustedMutualInfo(metrics.base.MultiClassMetric):
    def __init__(self, cm=None, average_method="arithmetic"):
        # average_method: 'min', 'max', 'arithmetic', 'geometric'
        pass
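To make the four `average_method` options concrete, here is a minimal sketch of how a generalized mean of two entropies could be selected. The function name `generalized_average` and its error handling are assumptions made for this example; River's internal helper may be named and structured differently.

```python
import math


def generalized_average(h_true, h_pred, average_method="arithmetic"):
    """Combine two entropies into one normalizer (hypothetical helper)."""
    if average_method == "min":
        return min(h_true, h_pred)
    if average_method == "max":
        return max(h_true, h_pred)
    if average_method == "arithmetic":
        return (h_true + h_pred) / 2
    if average_method == "geometric":
        return math.sqrt(h_true * h_pred)
    raise ValueError(f"Unknown average_method: {average_method!r}")


# Entropies (natural log) of three equally sized clusters and of two equally
# sized clusters, matching the label lists used in the usage examples.
h_true = math.log(3)
h_pred = math.log(2)

normalizers = {
    method: generalized_average(h_true, h_pred, method)
    for method in ("min", "max", "arithmetic", "geometric")
}
```

For non-negative inputs the four normalizers are ordered min ≤ geometric ≤ arithmetic ≤ max, so under this sketch the same mutual information yields the largest normalized score with `'min'` and the smallest with `'max'`.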
Import
from river import metrics
I/O Contract
| Method | Parameters | Returns | Description |
|---|---|---|---|
| update | y_true, y_pred | None | Updates metric with true and predicted cluster labels |
| get | - | float | Returns the current score (raw, normalized, or adjusted mutual information, depending on the class) |
Note: These metrics do not support sample weights (works_with_weights returns False).
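The `update`/`get` contract can be illustrated with a standalone sketch that keeps running counts and recomputes the score on demand. The class `StreamingMutualInfo` is hypothetical and is not River's implementation. It assumes natural logarithms and counts every observation once, which mirrors the lack of sample-weight support.

```python
import math
from collections import Counter


class StreamingMutualInfo:
    """Hypothetical incremental mutual information tracker (illustration only)."""

    def __init__(self):
        self.joint = Counter()        # counts of (y_true, y_pred) pairs
        self.true_counts = Counter()  # counts per true label
        self.pred_counts = Counter()  # counts per predicted label
        self.n = 0

    def update(self, y_true, y_pred):
        # Each observation contributes exactly one count (no sample weights).
        self.joint[(y_true, y_pred)] += 1
        self.true_counts[y_true] += 1
        self.pred_counts[y_pred] += 1
        self.n += 1

    def get(self):
        if self.n == 0:
            return 0.0
        mi = 0.0
        for (t, p), n_tp in self.joint.items():
            ratio = n_tp * self.n / (self.true_counts[t] * self.pred_counts[p])
            mi += (n_tp / self.n) * math.log(ratio)
        return mi


metric = StreamingMutualInfo()
history = []
for yt, yp in zip([1, 1, 2, 2, 3, 3], [1, 1, 1, 2, 2, 2]):
    metric.update(yt, yp)
    history.append(metric.get())
```

Under these assumptions, the values collected in `history` agree with the raw MutualInfo values listed in the usage examples to about five decimal places.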
Usage Examples
from river import metrics
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 1, 2, 2, 2]
# Raw Mutual Information
metric = metrics.MutualInfo()
for yt, yp in zip(y_true, y_pred):
    metric.update(yt, yp)
    print(metric.get())
# 0.0
# 0.0
# 0.0
# 0.215761
# 0.395752
# 0.462098
# Normalized Mutual Information (scales to 0-1)
metric_norm = metrics.NormalizedMutualInfo()
for yt, yp in zip(y_true, y_pred):
    metric_norm.update(yt, yp)
    print(metric_norm.get())
# 1.0
# 1.0
# 0.0
# 0.343711
# 0.458065
# 0.515803
# Adjusted Mutual Information (corrects for chance)
metric_adj = metrics.AdjustedMutualInfo()
for yt, yp in zip(y_true, y_pred):
    metric_adj.update(yt, yp)
    print(metric_adj.get())
# 1.0
# 1.0
# 0.0
# 0.0
# 0.105891
# 0.298792
# Using different normalization methods
metric_geom = metrics.NormalizedMutualInfo(average_method='geometric')
for yt, yp in zip(y_true, y_pred):
    metric_geom.update(yt, yp)
print(metric_geom)
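# Prints the metric name followed by its current value.
# With geometric averaging the normalizer is sqrt(H(y_true) * H(y_pred)), so the
# value should differ from the arithmetic result above (about 0.5295 for this
# data by hand calculation, not a captured run).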