Implementation:Online ml River Metrics VBeta

From Leeroopedia


Knowledge Sources
Domains Online_Learning, Evaluation_Metrics, Clustering
Last Updated 2026-02-08 16:00 GMT

Overview

V-Measure family of entropy-based cluster evaluation metrics including Homogeneity, Completeness, and VBeta.

Description

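This module provides three related metrics. Homogeneity measures whether each cluster contains only members of a single class (cluster purity). Completeness measures whether all members of a class are assigned to the same cluster (class coverage). VBeta (V-Measure) combines the two as a weighted harmonic mean: V_β = ((1+β)×h×c)/(β×h+c), where h is homogeneity, c is completeness, and β weights completeness relative to homogeneity (β > 1 favors completeness, β < 1 favors homogeneity). All three metrics are entropy-based and invariant to permutations of the class or cluster label values. Homogeneity and completeness are not symmetric individually: swapping y_true and y_pred turns one into the other. VBeta is symmetric in general only when β = 1.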
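To make the entropy definitions concrete, here is a minimal batch sketch in plain Python. It is not River's incremental implementation: the helper names (`entropy`, `conditional_entropy`, `v_measure`) are hypothetical, and returning 1.0 when the denominator entropy is zero is an assumed convention, chosen because it matches the running values shown in the usage examples on this page.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (natural log) of a distribution given as raw counts."""
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def conditional_entropy(pairs, given_index):
    """H(X | Y) over (class, cluster) pairs, where Y is the element at given_index."""
    total = len(pairs)
    joint = Counter(pairs)
    marginal = Counter(p[given_index] for p in pairs)
    return -sum(
        (n / total) * math.log(n / marginal[pair[given_index]])
        for pair, n in joint.items()
    )


def v_measure(y_true, y_pred, beta=1.0):
    """Return (homogeneity, completeness, V_beta) for two label sequences."""
    pairs = list(zip(y_true, y_pred))
    total = len(pairs)
    h_class = entropy(Counter(y_true).values(), total)
    h_cluster = entropy(Counter(y_pred).values(), total)
    # Assumed convention: a zero-entropy denominator yields a score of 1.0.
    h = 1.0 if h_class == 0 else 1.0 - conditional_entropy(pairs, 1) / h_class
    c = 1.0 if h_cluster == 0 else 1.0 - conditional_entropy(pairs, 0) / h_cluster
    denom = beta * h + c
    v = 0.0 if denom == 0 else (1 + beta) * h * c / denom
    return h, c, v


h, c, v = v_measure([1, 1, 2, 2, 3, 3], [1, 1, 1, 2, 2, 2])
print(f"h={h:.4f} c={c:.4f} v={v:.4f}")
```

Because this recomputes everything from the full label lists, it is only suitable as a cross-check on small data; the streaming metrics exist precisely to avoid that recomputation.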

Usage

Use Homogeneity when cluster purity is most important (each cluster should be homogeneous). Use Completeness when class coverage matters most (all members of a class should cluster together). Use VBeta to balance both aspects, with beta=1 giving equal weight. These metrics are valuable for evaluating clustering when you have ground truth labels but don't know the cluster-to-class mapping, as they're insensitive to label permutations.
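The insensitivity to label permutations can be checked directly. The sketch below uses a hypothetical pure-Python `homogeneity` helper (an assumption for illustration, not River's class) to show that renaming cluster ids leaves the score unchanged, and that completeness is the same quantity with the two label sequences swapped.

```python
import math
from collections import Counter


def homogeneity(y_true, y_pred):
    """1 - H(class | cluster) / H(class); assumed to be 1.0 when H(class) is zero."""
    n = len(y_true)
    joint = Counter(zip(y_true, y_pred))
    classes = Counter(y_true)
    clusters = Counter(y_pred)
    h_class = -sum(m / n * math.log(m / n) for m in classes.values())
    if h_class == 0:
        return 1.0
    h_cond = -sum(m / n * math.log(m / clusters[k]) for (_, k), m in joint.items())
    return 1.0 - h_cond / h_class


y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 1, 2, 2, 2]

# Rename cluster ids: the grouping is identical, only the names change.
renamed = [{1: "b", 2: "a"}[k] for k in y_pred]
same_score = math.isclose(homogeneity(y_true, y_pred), homogeneity(y_true, renamed))

# Completeness is homogeneity with the roles of the two sequences swapped.
completeness = homogeneity(y_pred, y_true)

print(same_score, round(completeness, 4))
```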

Code Reference

Source Location

Signature

class Homogeneity(metrics.base.MultiClassMetric):
    def __init__(self, cm=None):
        pass

class Completeness(metrics.base.MultiClassMetric):
    def __init__(self, cm=None):
        pass

class VBeta(metrics.base.MultiClassMetric):
    def __init__(self, beta: float = 1.0, cm=None):
        pass

Import

from river import metrics

I/O Contract

Method | Parameters | Returns | Description
update | y_true, y_pred | None | Updates metric with true and predicted cluster labels
get | - | float | Returns metric score (0.0 to 1.0)

Note: These metrics do not support sample weights (works_with_weights returns False).
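The update/get contract above is all a caller needs to track a running score. The following sketch assumes only that contract: `SupportsUpdateGet`, `running_scores`, and `ExactMatchRate` are hypothetical names introduced here, and `ExactMatchRate` is a stand-in so the example runs without River installed (it is not a meaningful clustering metric).

```python
from typing import Hashable, Iterable, List, Protocol, Tuple


class SupportsUpdateGet(Protocol):
    """Structural type for anything exposing the update/get pair described above."""

    def update(self, y_true: Hashable, y_pred: Hashable) -> None: ...

    def get(self) -> float: ...


def running_scores(
    metric: SupportsUpdateGet, pairs: Iterable[Tuple[Hashable, Hashable]]
) -> List[float]:
    """Feed (y_true, y_pred) pairs one at a time and record the score after each."""
    scores = []
    for y_true, y_pred in pairs:
        metric.update(y_true, y_pred)
        scores.append(metric.get())
    return scores


class ExactMatchRate:
    """Hypothetical stand-in metric, used only so this sketch is self-contained."""

    def __init__(self) -> None:
        self.hits = 0
        self.seen = 0

    def update(self, y_true: Hashable, y_pred: Hashable) -> None:
        self.hits += int(y_true == y_pred)
        self.seen += 1

    def get(self) -> float:
        return self.hits / self.seen if self.seen else 0.0


print(running_scores(ExactMatchRate(), [(1, 1), (1, 2)]))
```

Under the contract in the table, an instance of Homogeneity, Completeness, or VBeta should be usable in place of the stand-in.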

Usage Examples

from river import metrics

y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 1, 2, 2, 2]

# Homogeneity (cluster purity)
metric_h = metrics.Homogeneity()
for yt, yp in zip(y_true, y_pred):
    metric_h.update(yt, yp)
    print(metric_h.get())
# 1.0
# 1.0
# 0.0
# 0.311278...
# 0.37515...
# 0.42062...
# (last three values truncated for display)

print(metric_h)
# Homogeneity: 42.06%
# Moderate homogeneity: clusters have mixed classes

# Completeness (class coverage)
metric_c = metrics.Completeness()
for yt, yp in zip(y_true, y_pred):
    metric_c.update(yt, yp)
    print(metric_c.get())
# 1.0
# 1.0
# 1.0
# 0.3836885465963443
# 0.5880325916843805
# 0.6666666666666667

print(metric_c)
# Completeness: 66.67%
# Better completeness: classes are more consolidated

# V-Measure (balanced combination)
metric_v = metrics.VBeta(beta=1.0)
for yt, yp in zip(y_true, y_pred):
    metric_v.update(yt, yp)
    print(metric_v.get())
# 1.0
# 1.0
# 0.0
# 0.3437110184854507
# 0.4580652856440158
# 0.5158037429793888

print(metric_v)
# VBeta: 51.58%
# V-Measure balances homogeneity and completeness

# Adjust beta to weight homogeneity or completeness
metric_v2 = metrics.VBeta(beta=2.0)  # Weight completeness more
for yt, yp in zip(y_true, y_pred):
    metric_v2.update(yt, yp)

print(f"VBeta (beta=2): {metric_v2.get():.2%}")
# Higher beta emphasizes completeness over homogeneity
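The effect of beta can also be worked through by hand from the formula in the Description. This sketch plugs the final homogeneity (about 0.42062) and completeness (2/3) from the example above into V_β; the `v_beta` helper is a hypothetical name, and it only evaluates the formula rather than calling River.

```python
def v_beta(h, c, beta=1.0):
    """Weighted harmonic mean of homogeneity h and completeness c."""
    denom = beta * h + c
    return 0.0 if denom == 0 else (1 + beta) * h * c / denom


# Final values from the six-sample example (homogeneity rounded to 5 digits).
h, c = 0.42062, 2 / 3

scores = {beta: v_beta(h, c, beta) for beta in (0.5, 1.0, 2.0)}
for beta, score in scores.items():
    print(f"beta={beta}: {score:.4f}")
```

Since completeness exceeds homogeneity here, increasing beta moves the score upward toward completeness, while decreasing it moves the score downward toward homogeneity.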
