Implementation:Online ml River Metrics AdjustedRand

Knowledge Sources	Domains	Last Updated
River; River Docs; Objective criteria for the evaluation of clustering methods (Rand, 1971)	Cluster Evaluation, Streaming Metrics	2026-02-08 16:00 GMT

Overview

Concrete tool for incrementally computing the Adjusted Rand Index to compare predicted cluster assignments against ground truth labels in a streaming fashion using an updatable contingency table.

Description

The metrics.AdjustedRand class computes the Adjusted Rand Index (ARI) incrementally. It inherits from metrics.base.MultiClassMetric and uses River's confusion matrix infrastructure to maintain a contingency table that is updated with each (y_true, y_pred) pair. When get() is called, the pair confusion matrix is derived from the contingency table, and the ARI is computed from the true positives, true negatives, false positives, and false negatives at the pair level.
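The pair-level computation described above can be sketched with the standard library alone. This is an illustrative stand-in, not River's source: the class name `StreamingARI` and its attributes are hypothetical, and it counts unordered pairs, whereas a library implementation may count ordered pairs (the ratio is the same either way).

```python
from collections import Counter
from math import comb


class StreamingARI:
    """Hypothetical stand-in for illustration; not River's implementation."""

    def __init__(self):
        self.cells = Counter()  # (y_true, y_pred) -> count, the contingency table
        self.rows = Counter()   # y_true -> count (row totals)
        self.cols = Counter()   # y_pred -> count (column totals)
        self.n = 0              # number of observations seen so far

    def update(self, y_true, y_pred):
        # One observation touches one cell, one row total, and one column total.
        self.cells[(y_true, y_pred)] += 1
        self.rows[y_true] += 1
        self.cols[y_pred] += 1
        self.n += 1

    def get(self):
        # Pair-level counts over unordered pairs of observations.
        tp = sum(comb(c, 2) for c in self.cells.values())  # together in both
        fn = sum(comb(c, 2) for c in self.rows.values()) - tp  # together in truth only
        fp = sum(comb(c, 2) for c in self.cols.values()) - tp  # together in prediction only
        tn = comb(self.n, 2) - tp - fn - fp  # apart in both
        denom = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
        if denom == 0:
            return 1.0  # mirrors the documented zero-division behaviour
        return 2.0 * (tp * tn - fn * fp) / denom


ari = StreamingARI()
scores = []
for yt, yp in zip([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 2, 2]):
    ari.update(yt, yp)
    scores.append(ari.get())

print(round(scores[-1], 6))  # 0.242424
```

Because only counts are stored, memory grows with the number of distinct (true label, predicted label) combinations rather than with the length of the stream.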

The metric returns 1.0 for perfect agreement, values near 0.0 for chance-level agreement, and negative values for agreement worse than chance. When the denominator is zero, for example when there are no pairs to compare, get() returns 1.0.
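To make these value ranges concrete, here is a minimal batch sketch using the textbook chance-corrected form (index minus expected index, divided by maximum index minus expected index). The function `adjusted_rand` is a hypothetical helper written for this page, not part of River, and the fallback to 1.0 on a zero denominator is an assumption chosen to match the behaviour described above.

```python
from collections import Counter
from math import comb


def adjusted_rand(y_true, y_pred):
    """Hypothetical batch helper for illustration; not River's API."""
    total = comb(len(y_true), 2)  # number of unordered pairs
    if total == 0:
        return 1.0  # no pairs to compare
    index = sum(comb(c, 2) for c in Counter(zip(y_true, y_pred)).values())
    a = sum(comb(c, 2) for c in Counter(y_true).values())  # pairs together in truth
    b = sum(comb(c, 2) for c in Counter(y_pred).values())  # pairs together in prediction
    expected = a * b / total      # index expected under random labelling
    max_index = (a + b) / 2
    if max_index == expected:
        return 1.0  # zero denominator, e.g. every point in a single cluster
    return (index - expected) / (max_index - expected)


# Same partition with the label names swapped: perfect agreement.
perfect = adjusted_rand([0, 0, 1, 1], [1, 1, 0, 0])
# Agreement exactly at the level expected by chance.
chance = adjusted_rand([0, 0, 0, 1], [0, 0, 1, 1])
# Every predicted cluster mixes both true clusters: worse than chance.
worse = adjusted_rand([0, 0, 1, 1], [0, 1, 0, 1])
# A single observation: no pairs, so the fallback value is returned.
single = adjusted_rand([0], [0])

print(perfect, chance, round(worse, 6), single)
```

The first case also shows that the score depends only on how observations are grouped, not on the label values themselves.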

Usage

Use metrics.AdjustedRand when ground truth labels are available and you want to evaluate online clustering quality with a chance-corrected metric. Call update(y_true, y_pred) after each prediction, and call get() whenever you need the current score.

Code Reference

Source Location

river/metrics/rand.py:L117-L195

Signature

class AdjustedRand(metrics.base.MultiClassMetric):
    def __init__(self, cm=None)

Import

from river import metrics

Key Parameters

Parameter	Default	Description
cm	None	Optional shared confusion matrix. If provided, allows sharing the same confusion matrix between multiple metrics to reduce storage and computation time.

Methods

Method	Signature	Description
update	`update(y_true, y_pred, w=1.0) -> None`	Updates the internal contingency table with one new observation's true and predicted labels.
get	`get() -> float`	Computes and returns the current Adjusted Rand Index from the contingency table. Returns 1.0 on zero-division.

I/O Contract

Inputs

Parameter	Type	Description
y_true	any hashable	The ground truth cluster label for the observation.
y_pred	any hashable	The predicted cluster label for the observation.
w	`float`	Optional sample weight (default 1.0). Note: this metric sets `works_with_weights = False`, meaning it is not designed to account for sample weights.

Outputs

Output	Type	Description
get() return	`float`	The Adjusted Rand Index. 1.0 = perfect agreement; 0.0 = chance agreement; negative = worse than chance.

Usage Examples

from river import metrics

y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 2, 2]

metric = metrics.AdjustedRand()

for yt, yp in zip(y_true, y_pred):
    metric.update(yt, yp)
    print(metric.get())
# 1.0
# 1.0
# 0.0
# 0.0
# 0.09090909090909091
# 0.24242424242424243

print(metric)
# AdjustedRand: 0.242424

Using with an online clustering model:

from river import cluster, metrics

model = cluster.KMeans(n_clusters=2, halflife=0.5, seed=42)
metric = metrics.AdjustedRand()

# Labeled data for evaluation
data = [
    ({'x': 1, 'y': 2}, 0),
    ({'x': 1.5, 'y': 1.8}, 0),
    ({'x': 5, 'y': 8}, 1),
    ({'x': 8, 'y': 8}, 1),
]

for x, y_true in data:
    model.learn_one(x)
    y_pred = model.predict_one(x)
    metric.update(y_true, y_pred)

print(f'ARI: {metric.get():.4f}')

Related Pages

Principle:Online_ml_River_Streaming_Adjusted_Rand
