Implementation:Online ml River Metrics MutualInfo

From Leeroopedia


Knowledge Sources
Domains: Online_Learning, Evaluation_Metrics, Clustering
Last Updated: 2026-02-08 16:00 GMT

Overview

Mutual information based metrics for measuring agreement between two clusterings or label assignments.

Description

This module provides three entropy-based metrics: MutualInfo (the raw mutual information between two clusterings), NormalizedMutualInfo (MI normalized to the [0, 1] range by a generalized mean of the two labelings' entropies, selected with average_method), and AdjustedMutualInfo (MI adjusted for chance agreement). All three are symmetric in their two arguments and invariant to permutations of the label values, so they can compare clustering results or measure agreement between label assignments even when neither assignment is a known ground truth.
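The raw score described above can be illustrated with a small batch computation. This is a minimal sketch using the standard contingency-table definition of mutual information with natural logarithms; the function name `mutual_info` is hypothetical and the code is not River's implementation. It also checks the symmetry and permutation-invariance properties on the label lists used later on this page.

```python
import math
from collections import Counter


def mutual_info(labels_a, labels_b):
    """Batch mutual information (in nats) between two label sequences.

    Hypothetical helper for illustration only, not part of River.
    """
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))  # contingency counts n_ij
    row = Counter(labels_a)                   # marginal counts of labels_a
    col = Counter(labels_b)                   # marginal counts of labels_b
    return sum(
        (n_ij / n) * math.log(n * n_ij / (row[a] * col[b]))
        for (a, b), n_ij in joint.items()
    )


y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 1, 2, 2, 2]

mi = mutual_info(y_true, y_pred)

# Symmetry: swapping the two labelings leaves the score unchanged.
mi_swapped = mutual_info(y_pred, y_true)

# Permutation invariance: renaming the labels leaves the score unchanged.
relabel = {1: "c", 2: "a", 3: "b"}
mi_renamed = mutual_info([relabel[y] for y in y_true], y_pred)

print(mi, mi_swapped, mi_renamed)
```

For these lists the score works out to (2/3) ln 2, about 0.4621, which agrees with the final MutualInfo value in the usage examples.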

Usage

Use these metrics to evaluate clustering algorithms or to compare two label assignments. MutualInfo returns the raw score, which is bounded only by the entropies of the two labelings and is therefore hard to compare across datasets. NormalizedMutualInfo rescales it to lie between 0 (no mutual information) and 1 (perfect agreement). AdjustedMutualInfo additionally corrects for the agreement expected by chance, which makes it the better choice when random agreement is a concern, for example with many clusters or few samples. All three are symmetric, so neither labeling has to be designated as the "true" one.
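To make the normalization step concrete, the sketch below divides a fixed MI value by each of the four generalized means named by average_method. The helper `generalized_mean` is hypothetical, and the formulas are the standard min, max, arithmetic, and geometric means; treating them as the library's exact behavior is an assumption. The input values are the analytically derived MI and entropies for the label lists used in the usage examples on this page.

```python
import math

# Analytic values for y_true = [1, 1, 2, 2, 3, 3], y_pred = [1, 1, 1, 2, 2, 2]
mi = 2 * math.log(2) / 3  # mutual information, in nats
h_true = math.log(3)      # entropy of three equally sized true classes
h_pred = math.log(2)      # entropy of two equally sized predicted clusters


def generalized_mean(u, v, average_method="arithmetic"):
    """Hypothetical helper: combine two entropies into one normalizer."""
    if average_method == "min":
        return min(u, v)
    if average_method == "max":
        return max(u, v)
    if average_method == "arithmetic":
        return (u + v) / 2
    if average_method == "geometric":
        return math.sqrt(u * v)
    raise ValueError(f"unknown average_method: {average_method!r}")


# Normalized score under each averaging method.
nmi = {
    method: mi / generalized_mean(h_true, h_pred, method)
    for method in ("min", "geometric", "arithmetic", "max")
}
for method, value in nmi.items():
    print(f"{method:>10}: {value:.4f}")
```

Because min <= geometric <= arithmetic <= max for any two non-negative entropies, the normalized scores come out in the opposite order: 'min' gives the most generous score and 'max' the most conservative one.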

Code Reference

Source Location

Signature

class MutualInfo(metrics.base.MultiClassMetric):
    def __init__(self, cm=None):
        pass

class NormalizedMutualInfo(metrics.base.MultiClassMetric):
    def __init__(self, cm=None, average_method="arithmetic"):
        # average_method: 'min', 'max', 'arithmetic', 'geometric'
        pass

class AdjustedMutualInfo(metrics.base.MultiClassMetric):
    def __init__(self, cm=None, average_method="arithmetic"):
        # average_method: 'min', 'max', 'arithmetic', 'geometric'
        pass

Import

from river import metrics

I/O Contract

Method | Parameters     | Returns | Description
update | y_true, y_pred | None    | Updates metric with true and predicted cluster labels
get    | -              | float   | Returns mutual information score

Note: These metrics do not support sample weights (works_with_weights returns False).
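The update/get contract can be pictured with a toy incremental metric that keeps running contingency counts. This is an illustrative stand-in under stated assumptions, not River's implementation: the class name `ToyMutualInfo` and its attributes are hypothetical, and it ignores sample weights, in line with the note above.

```python
import math
from collections import Counter


class ToyMutualInfo:
    """Illustrative stand-in for the update/get contract (not River's code)."""

    def __init__(self):
        self.joint = Counter()  # counts of (y_true, y_pred) pairs
        self.rows = Counter()   # counts per true label
        self.cols = Counter()   # counts per predicted label
        self.n = 0              # number of observations seen so far

    def update(self, y_true, y_pred):
        # Fold one observation into the running counts; returns None.
        self.joint[(y_true, y_pred)] += 1
        self.rows[y_true] += 1
        self.cols[y_pred] += 1
        self.n += 1

    def get(self):
        # Recompute the score (in nats) from the current counts.
        if self.n == 0:
            return 0.0
        return sum(
            (n_ij / self.n)
            * math.log(self.n * n_ij / (self.rows[a] * self.cols[b]))
            for (a, b), n_ij in self.joint.items()
        )


metric = ToyMutualInfo()
scores = []
for yt, yp in zip([1, 1, 2, 2, 3, 3], [1, 1, 1, 2, 2, 2]):
    metric.update(yt, yp)
    scores.append(metric.get())
print(scores)
```

The score stays at zero while only one predicted cluster has been seen and rises once a second cluster appears, following the same sequence as the MutualInfo output in the usage examples up to the number of decimals shown.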

Usage Examples

from river import metrics

y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 1, 2, 2, 2]

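# Raw Mutual Information, in nats.
# Outputs shown in comments throughout are truncated to six decimals;
# print() itself shows full float precision.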
metric = metrics.MutualInfo()
for yt, yp in zip(y_true, y_pred):
    metric.update(yt, yp)
    print(metric.get())
# 0.0
# 0.0
# 0.0
# 0.215761
# 0.395752
# 0.462098

# Normalized Mutual Information (scales to 0-1)
metric_norm = metrics.NormalizedMutualInfo()
for yt, yp in zip(y_true, y_pred):
    metric_norm.update(yt, yp)
    print(metric_norm.get())
# 1.0
# 1.0
# 0.0
# 0.343711
# 0.458065
# 0.515803

# Adjusted Mutual Information (corrects for chance)
metric_adj = metrics.AdjustedMutualInfo()
for yt, yp in zip(y_true, y_pred):
    metric_adj.update(yt, yp)
    print(metric_adj.get())
# 1.0
# 1.0
# 0.0
# 0.0
# 0.105891
# 0.298792

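# Using a different normalization method (geometric mean of the two entropies)
metric_geom = metrics.NormalizedMutualInfo(average_method='geometric')
for yt, yp in zip(y_true, y_pred):
    metric_geom.update(yt, yp)

print(metric_geom)
# Prints "NormalizedMutualInfo: <score>". The score should differ from the
# arithmetic default (0.515804): a hand calculation of
# MI / sqrt(H(true) * H(pred)) gives roughly 0.5295.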
