Implementation:Online ml River Metrics RandIndex
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Evaluation_Metrics, Clustering |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Rand Index and Adjusted Rand Index metrics for measuring clustering similarity.
Description
This module provides two related metrics. Rand computes the Rand Index, which measures the similarity between two clusterings as (a + b) / C(n, 2), where a is the number of sample pairs placed in the same cluster by both clusterings, b is the number of pairs placed in different clusters by both, and C(n, 2) = n(n - 1) / 2 is the total number of pairs. AdjustedRand corrects for chance agreement using the permutation model: it equals 1 for perfect agreement, is 0 in expectation for random labelings, and becomes negative when agreement is worse than chance. Both metrics are symmetric in the two clusterings and invariant to relabeling of the clusters.
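The pair-counting definition can be checked by hand with a brute-force sketch. This is an illustration of the formula only, not River's implementation; the function name `rand_index` is a hypothetical helper introduced here.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Brute-force Rand Index: fraction of sample pairs on which two labelings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    same_both = 0  # 'a': pair shares a cluster in both labelings
    diff_both = 0  # 'b': pair is split in both labelings
    total = 0      # C(n, 2): every unordered pair of samples
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        same_both += same_a and same_b
        diff_both += (not same_a) and (not same_b)
        total += 1
    if total == 0:
        raise ValueError("need at least two samples")
    return (same_both + diff_both) / total


y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 2, 2]
# Here a = 2, b = 8 and C(6, 2) = 15, so the result is 10 / 15.
print(rand_index(y_true, y_pred))
```

The loop is quadratic in the number of samples, so it is only suitable for verifying small examples.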
Usage
Use Rand Index to measure raw agreement between two clusterings, useful when comparing clustering algorithms or validating results against ground truth. Use Adjusted Rand Index when you want to correct for chance agreement, making it more suitable for comparing clustering quality across different numbers of clusters or dataset sizes. AdjustedRand is preferred in most scenarios as it accounts for random agreement.
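To make the chance correction concrete, the following sketch computes the Adjusted Rand Index from a contingency table using the standard permutation-model formula, (index - expected) / (max - expected). It is a standalone illustration rather than River's code: `adjusted_rand_index` is a hypothetical name, and returning 1.0 for degenerate inputs is a convention assumed here.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index from contingency counts (permutation model)."""
    cells = Counter(zip(labels_a, labels_b))  # contingency table entries
    rows = Counter(labels_a)                  # cluster sizes in labeling A
    cols = Counter(labels_b)                  # cluster sizes in labeling B

    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # assumed convention: fewer than two samples

    index = sum(comb(c, 2) for c in cells.values())     # pairs together in both
    sum_rows = sum(comb(c, 2) for c in rows.values())   # pairs together in A
    sum_cols = sum(comb(c, 2) for c in cols.values())   # pairs together in B

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # assumed convention: both labelings are trivial
    return (index - expected) / (max_index - expected)


y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 2, 2]
# index = 2, expected = 1.2, max = 4.5, so ARI = 0.8 / 3.3 (about 0.2424).
print(adjusted_rand_index(y_true, y_pred))
```

For these labels the raw Rand Index is about 0.67 while the adjusted value is about 0.24, which shows how much of the raw agreement is attributable to chance.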
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/metrics/rand.py
Signature
class Rand(metrics.base.MultiClassMetric):
    def __init__(self, cm=None):
        pass

class AdjustedRand(metrics.base.MultiClassMetric):
    def __init__(self, cm=None):
        pass
Import
from river import metrics
I/O Contract
| Method | Parameters | Returns | Description |
|---|---|---|---|
| update | y_true, y_pred | None | Updates metric with true and predicted cluster labels |
| get | - | float | Returns Rand or Adjusted Rand Index |
Note: These metrics do not support sample weights (works_with_weights returns False).
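The update/get contract suits streaming because pair counts can be maintained incrementally: each new sample forms a pair with every earlier sample, so only a few counters need to change. The class below is a hypothetical stand-in that mimics the contract for the Rand Index. It is not River's implementation, and the value returned before two samples have been seen is a convention assumed for this sketch.

```python
from collections import Counter


class StreamingRandSketch:
    """Hypothetical illustration of an incremental Rand Index (not River's code)."""

    def __init__(self):
        self.n = 0
        self.cells = Counter()  # counts per (true label, predicted label)
        self.rows = Counter()   # counts per true label
        self.cols = Counter()   # counts per predicted label
        self.same_both = 0      # pairs together in both labelings
        self.same_true = 0      # pairs together in the true labeling
        self.same_pred = 0      # pairs together in the predicted labeling

    def update(self, y_true, y_pred):
        # The new sample pairs with each earlier sample; count agreements
        # before incrementing the tallies.
        self.same_both += self.cells[(y_true, y_pred)]
        self.same_true += self.rows[y_true]
        self.same_pred += self.cols[y_pred]
        self.cells[(y_true, y_pred)] += 1
        self.rows[y_true] += 1
        self.cols[y_pred] += 1
        self.n += 1

    def get(self):
        total = self.n * (self.n - 1) // 2
        if total == 0:
            return 1.0  # assumed convention: no pairs observed yet
        # Inclusion-exclusion gives the pairs split in both labelings.
        diff_both = total - self.same_true - self.same_pred + self.same_both
        return (self.same_both + diff_both) / total


sketch = StreamingRandSketch()
for yt, yp in zip([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 2, 2]):
    sketch.update(yt, yp)
print(sketch.get())  # (2 + 8) / 15
```

Each update costs constant time regardless of how many samples have been seen, which is the property an online metric needs.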
Usage Examples
from river import metrics
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 2, 2]
# Rand Index (raw agreement)
metric_rand = metrics.Rand()
for yt, yp in zip(y_true, y_pred):
    metric_rand.update(yt, yp)
print(metric_rand)
# Rand: 0.666667
# Interpretation: 66.67% of pairs are classified consistently
# (both in same cluster or both in different clusters)
# Adjusted Rand Index (corrected for chance)
metric_adj = metrics.AdjustedRand()
for yt, yp in zip(y_true, y_pred):
    metric_adj.update(yt, yp)
    # Print the running value after each observation
    print(metric_adj.get())
# 1.0
# 1.0
# 0.0
# 0.0
# 0.09090909090909091
# 0.24242424242424243
print(metric_adj)
# AdjustedRand: 0.242424
# Interpretation of Adjusted Rand:
# 1.0: Perfect agreement
# 0.0: Agreement by chance alone
# Negative: Worse than random
# 0.24: Some agreement beyond chance, but not strong
# Compare perfect clustering
y_pred_perfect = [0, 0, 0, 1, 1, 1]
metric_perfect = metrics.AdjustedRand()
for yt, yp in zip(y_true, y_pred_perfect):
    metric_perfect.update(yt, yp)
print(f"Perfect clustering Adjusted Rand: {metric_perfect.get()}")
# Perfect clustering Adjusted Rand: 1.0