Implementation:Online ml River Metrics Silhouette

From Leeroopedia


Knowledge Sources: River; River Docs; Silhouettes: a graphical aid to the interpretation and validation of cluster analysis (Rousseeuw, 1987); Machine Learning for Data Streams (Bifet et al., 2018)
Domains: Cluster Evaluation, Streaming Metrics
Last Updated: 2026-02-08 16:00 GMT

Overview

Concrete tool for incrementally computing the Silhouette coefficient to evaluate online clustering quality using distances to cluster centroids rather than pairwise point distances.

Description

The metrics.Silhouette class provides an incremental Silhouette coefficient for evaluating clustering results in a streaming context. It maintains two running sums: the cumulative distance from each point to its assigned cluster center, and the cumulative distance from each point to its second-closest cluster center. The get() method returns the first sum divided by the second.
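The ratio described above can be written compactly. The notation here is an assumption for illustration: d is taken to be a distance between a point and a centroid (the page does not name it), a(i) is the cluster assigned to point x_i, s(i) is its second-closest cluster, and every sample has unit weight.

```latex
S_n = \frac{\sum_{i=1}^{n} d\left(x_i, c_{a(i)}\right)}{\sum_{i=1}^{n} d\left(x_i, c_{s(i)}\right)}
```

Because both sums only grow by one term per observation, the value can be refreshed in constant memory without revisiting earlier points.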

Unlike the classical batch Silhouette, this implementation uses centroid-based distances and has a different interpretation: lower values indicate better clustering (bigger_is_better = False). A value close to 0 means excellent cohesion relative to separation, while values approaching 1 or higher indicate poor clustering.
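To make this interpretation concrete, the following sketch computes the same kind of ratio for two hand-made point sets. The helper centroid_ratio, the tuple-based points, and the use of Euclidean distance are assumptions for illustration; this is not River's code or API.

```python
import math


def centroid_ratio(points, labels, centers):
    """Sum of distances to assigned centers divided by the sum of
    distances to second-closest centers (hypothetical helper)."""
    assigned = second = 0.0
    for point, label in zip(points, labels):
        ordered = sorted(math.dist(point, c) for c in centers.values())
        assigned += math.dist(point, centers[label])
        second += ordered[1]  # second-smallest distance; needs >= 2 centers
    return assigned / second if second else math.inf


centers = {0: (0.0, 0.0), 1: (10.0, 0.0)}
labels = [0, 1]

tight = [(0.5, 0.0), (9.5, 0.0)]  # points sit next to their own center
loose = [(4.0, 0.0), (6.0, 0.0)]  # points drift toward the other center

tight_score = centroid_ratio(tight, labels, centers)
loose_score = centroid_ratio(loose, labels, centers)

print(round(tight_score, 3))  # 0.053
print(round(loose_score, 3))  # 0.667
```

The tight set scores near 0 because each point is far closer to its own center than to the next one, while the loose set moves toward 1 as the two distances become comparable.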

The metric requires the caller to pass the current cluster centers on each update, making it suitable for use with algorithms that expose a centers attribute (such as cluster.KMeans).

Usage

Import metrics.Silhouette when you need an unsupervised streaming evaluation metric for clustering. Call update after each learn/predict cycle with the current cluster centers.

Code Reference

Source Location

river/metrics/silhouette.py:L8-L93

Signature

class Silhouette(metrics.base.ClusteringMetric):
    def __init__(self)

Import

from river import metrics

Methods

Method | Signature | Description
update | update(x: dict, y_pred: int, centers: dict, w=1.0) -> None | Updates the running sums with the distances from x to its assigned center and to the second-closest center.
revert | revert(x: dict, y_pred: int, centers: dict, w=1.0) -> None | Reverts a previous update by subtracting the corresponding distances.
get | get() -> float | Returns the current Silhouette coefficient (ratio of closest to second-closest center distances). Returns math.inf on zero-division.
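The three methods can be sketched in plain Python as follows. This is a minimal stand-in written from the descriptions on this page, not River's source: the class name CentroidSilhouetteSketch, the Euclidean distance, and the way the weight w scales each term are all assumptions.

```python
import math


class CentroidSilhouetteSketch:
    """Hypothetical stand-in for the metric described above."""

    def __init__(self):
        self.sum_assigned = 0.0  # distances to the assigned centers
        self.sum_second = 0.0    # distances to the second-closest centers

    @staticmethod
    def _distance(a, b):
        # Euclidean distance between two feature dicts (assumed metric).
        keys = set(a) | set(b)
        return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

    def _terms(self, x, y_pred, centers):
        ordered = sorted(self._distance(x, c) for c in centers.values())
        return self._distance(x, centers[y_pred]), ordered[1]

    def update(self, x, y_pred, centers, w=1.0):
        assigned, second = self._terms(x, y_pred, centers)
        self.sum_assigned += w * assigned
        self.sum_second += w * second

    def revert(self, x, y_pred, centers, w=1.0):
        assigned, second = self._terms(x, y_pred, centers)
        self.sum_assigned -= w * assigned
        self.sum_second -= w * second

    def get(self):
        try:
            return self.sum_assigned / self.sum_second
        except ZeroDivisionError:
            return math.inf


centers = {0: {"a": 0.0, "b": 0.0}, 1: {"a": 10.0, "b": 0.0}}
metric = CentroidSilhouetteSketch()
empty_value = metric.get()  # no observations yet, so the division fails

metric.update({"a": 1.0, "b": 0.0}, 0, centers)
after_one = metric.get()

metric.update({"a": 8.0, "b": 0.0}, 1, centers)
metric.revert({"a": 8.0, "b": 0.0}, 1, centers)
after_revert = metric.get()  # same value as after_one
```

In this sketch a revert only cancels an update exactly when it receives the same centers that the update saw. With a model whose centers move over time, the caller would presumably need to keep those centers around.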

I/O Contract

Inputs

Parameter | Type | Description
x | dict | Feature dictionary for the current observation.
y_pred | int | The predicted cluster index for x.
centers | dict | A dictionary mapping cluster IDs to centroid positions (e.g., model.centers).
w | float | Optional sample weight (default 1.0).

Outputs

Output | Type | Description
get() return | float | The streaming Silhouette coefficient. Lower values indicate better clustering (bigger_is_better = False).

Usage Examples

from river import cluster
from river import stream
from river import metrics

X = [
    [1, 2],
    [1, 4],
    [1, 0],
    [4, 2],
    [4, 4],
    [4, 0],
    [-2, 2],
    [-2, 4],
    [-2, 0]
]

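# Online k-means with three clusters; the metric starts with empty running sums.
k_means = cluster.KMeans(n_clusters=3, halflife=0.4, sigma=3, seed=0)
metric = metrics.Silhouette()

# iter_array yields (features, target) pairs; the target is unused for clustering.
for x, _ in stream.iter_array(X):
    k_means.learn_one(x)
    y_pred = k_means.predict_one(x)
    # Score the assignment against the centers as they stand after this step.
    metric.update(x, y_pred, k_means.centers)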

print(metric)
# Silhouette: 0.32145
