Implementation:Online ml River Metrics RandIndex

Knowledge Sources	Online_ml_River
Domains	Online_Learning, Evaluation_Metrics, Clustering
Last Updated	2026-02-08 16:00 GMT

Overview

Rand Index and Adjusted Rand Index metrics for measuring clustering similarity.

Description

This module provides two related metrics. Rand computes the Rand Index, which measures the similarity between two clusterings as (a + b) / C(n, 2), where a is the number of sample pairs placed in the same cluster by both clusterings, b is the number of pairs placed in different clusters by both, and C(n, 2) = n(n - 1) / 2 is the total number of pairs; it ranges from 0 to 1. AdjustedRand corrects the Rand Index for chance agreement using the permutation model: it is 1 for perfect agreement, close to 0 for random labelings, and negative when agreement is worse than expected by chance. Both metrics are symmetric in their two arguments and invariant to renaming of the cluster labels.
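The pair-counting definition can be checked by hand on a small input. The following is a minimal sketch in plain Python, not River's implementation: rand_index is a hypothetical helper that enumerates all pairs in a batch (quadratic time), whereas the River metric is updated one observation at a time. It uses the same labels as the usage examples on this page.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Rand Index by explicit pair enumeration: (a + b) / C(n, 2).

    Hypothetical helper for illustration only; not part of River.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same samples")
    a = b = total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            a += 1  # pair is together in both clusterings
        elif not same_a and not same_b:
            b += 1  # pair is apart in both clusterings
        total += 1
    if total == 0:
        raise ValueError("need at least two samples")
    return (a + b) / total


y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 2, 2]

score = rand_index(y_true, y_pred)
print(round(score, 6))  # 0.666667 (a = 2, b = 8, C(6, 2) = 15)

# Symmetry: swapping the two labelings gives the same value.
swapped = rand_index(y_pred, y_true)
# Label invariance: renaming the predicted clusters gives the same value.
relabeled = rand_index(y_true, ["c", "c", "a", "a", "b", "b"])
print(score == swapped == relabeled)  # True
```

Only the pairwise "same cluster or not" relation enters the computation, which is why neither the argument order nor the label names matter.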

Usage

Use Rand to measure raw agreement between two clusterings, for example when comparing clustering algorithms or validating results against ground truth. Use AdjustedRand when chance agreement should be factored out, which makes scores comparable across different numbers of clusters or dataset sizes. For that reason AdjustedRand is usually the better default.
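To see why the chance correction matters, consider a prediction that never co-clusters any pair the ground truth co-clusters. The sketch below is an illustration in plain Python under stated assumptions, not River's code: rand_and_adjusted_rand is a hypothetical helper that derives both indices from pair counts, and returning 1.0 in the degenerate case where the denominator vanishes is an assumption of this sketch.

```python
from collections import Counter
from math import comb


def rand_and_adjusted_rand(y_true, y_pred):
    """Return (Rand, Adjusted Rand) computed from pair counts.

    Hypothetical helper for illustration only; not part of River.
    """
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two labelings of the same length, n >= 2")
    total = comb(n, 2)
    # Pairs grouped together in both, in y_true only, and in y_pred only.
    same_both = sum(comb(c, 2) for c in Counter(zip(y_true, y_pred)).values())
    same_true = sum(comb(c, 2) for c in Counter(y_true).values())
    same_pred = sum(comb(c, 2) for c in Counter(y_pred).values())
    # Pairs separated in both clusterings (inclusion-exclusion).
    diff_both = total - same_true - same_pred + same_both
    rand = (same_both + diff_both) / total

    expected = same_true * same_pred / total  # chance level, permutation model
    max_index = (same_true + same_pred) / 2
    if max_index == expected:
        return rand, 1.0  # degenerate case; value chosen by assumption
    return rand, (same_both - expected) / (max_index - expected)


truth = [0, 0, 1, 1, 2, 2, 3, 3]
unrelated = [0, 1, 2, 3, 0, 1, 2, 3]  # shares no co-clustered pair with truth

rand, ari = rand_and_adjusted_rand(truth, unrelated)
print(round(rand, 3), round(ari, 3))  # 0.714 -0.167
```

Rand still reports about 0.71 because most pairs are trivially "apart in both" when there are many small clusters, while the adjusted value drops below zero.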

Code Reference

Source Location

Repository: Online_ml_River
File: river/metrics/rand.py

Signature

class Rand(metrics.base.MultiClassMetric):
    def __init__(self, cm=None):
        pass

class AdjustedRand(metrics.base.MultiClassMetric):
    def __init__(self, cm=None):
        pass

Import

from river import metrics

I/O Contract

Method	Parameters	Returns	Description
update	y_true, y_pred	None	Updates metric with true and predicted cluster labels
get	-	float	Returns Rand or Adjusted Rand Index

Note: These metrics do not support sample weights (works_with_weights returns False).

Usage Examples

from river import metrics

y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 2, 2]

# Rand Index (raw agreement)
metric_rand = metrics.Rand()

for yt, yp in zip(y_true, y_pred):
    metric_rand.update(yt, yp)

print(metric_rand)
# Rand: 0.666667

# Interpretation: 66.67% of pairs are classified consistently
# (both in same cluster or both in different clusters)

# Adjusted Rand Index (corrected for chance)
metric_adj = metrics.AdjustedRand()

for yt, yp in zip(y_true, y_pred):
    metric_adj.update(yt, yp)
    print(metric_adj.get())
# 1.0
# 1.0
# 0.0
# 0.0
# 0.09090909090909091
# 0.24242424242424243

print(metric_adj)
# AdjustedRand: 0.242424

# Interpretation of Adjusted Rand:
# 1.0: Perfect agreement
# 0.0: Agreement by chance alone
# Negative: Worse than random
# 0.24: Some agreement beyond chance, but not strong

# Compare perfect clustering
y_pred_perfect = [0, 0, 0, 1, 1, 1]
metric_perfect = metrics.AdjustedRand()

for yt, yp in zip(y_true, y_pred_perfect):
    metric_perfect.update(yt, yp)

print(f"Perfect clustering Adjusted Rand: {metric_perfect.get()}")
# Perfect clustering Adjusted Rand: 1.0
