Implementation:Online ml River Cluster DBSTREAM

Knowledge Sources	Domains	Last Updated
River; River Docs; Clustering Data Streams Based on Shared Density between Micro-Clusters (Hahsler and Bolanos, 2016)	Online Clustering, Density-Based Clustering	2026-02-08 16:00 GMT

Overview

Concrete tool for performing DBSTREAM density-based clustering on evolving data streams, maintaining micro-clusters with a shared density graph and producing macro-clusters via connected components.

Description

The cluster.DBSTREAM class implements the DBSTREAM algorithm for streaming density-based clustering. It maintains a set of micro-clusters, each defined by a center position, a weight that fades over time, and a last-update timestamp. A shared density graph records how often incoming points fall within the radius of two micro-clusters at once. On prediction, the algorithm reclusters: it applies a DBSCAN variant to the shared density graph, and the resulting connected components become the macro-clusters.
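To make the fading weight concrete, here is a minimal sketch that assumes the base-2 exponential decay described by Hahsler and Bolanos (2016). The function name `faded_weight` and its arguments are illustrative and not part of River's API; River's internal bookkeeping may differ in detail.

```python
def faded_weight(weight: float, last_update: int, now: int, fading_factor: float) -> float:
    """Decay `weight` from `last_update` to `now` with base-2 exponential fading."""
    elapsed = now - last_update
    return weight * 2 ** (-fading_factor * elapsed)


# With fading_factor=0.01, a micro-cluster that receives no points
# loses half of its weight after 100 time steps.
half_life_weight = faded_weight(1.0, last_update=0, now=100, fading_factor=0.01)
print(half_life_weight)  # 0.5
```

A larger fading_factor shortens this half-life, so older observations stop influencing the clustering sooner.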

Key internal state includes micro_clusters (the set of active micro-clusters), a shared density matrix s, and timestamp tracking for both micro-clusters and shared densities. The cleanup process periodically removes weak micro-clusters and weak shared density entries.
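The cleanup rule can be sketched as follows, assuming the weak-weight threshold from the original paper, where a micro-cluster is dropped once its faded weight falls below 2 ** (-fading_factor * cleanup_interval). The helper names and the plain-dict state are hypothetical simplifications, not River's internal data structures.

```python
def weak_weight_threshold(fading_factor: float, cleanup_interval: float) -> float:
    """Weight below which a micro-cluster counts as weak (paper's rule, assumed here)."""
    return 2 ** (-fading_factor * cleanup_interval)


def cleanup(weights, last_updates, now, fading_factor, cleanup_interval):
    """Return only the micro-clusters whose faded weight is still above the threshold."""
    threshold = weak_weight_threshold(fading_factor, cleanup_interval)
    survivors = {}
    for mc_id, weight in weights.items():
        faded = weight * 2 ** (-fading_factor * (now - last_updates[mc_id]))
        if faded >= threshold:
            survivors[mc_id] = faded
    return survivors


# Micro-cluster 0 was updated recently; micro-cluster 1 has been idle since t=0.
weights = {0: 3.0, 1: 1.0}
last_updates = {0: 10, 1: 0}
survivors = cleanup(weights, last_updates, now=12, fading_factor=0.05, cleanup_interval=4)
print(sorted(survivors))  # [0]
```

In the paper, weak shared density entries are pruned by the same kind of comparison, with the threshold scaled by the intersection factor.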

Usage

Import cluster.DBSTREAM when you need online density-based clustering that discovers clusters of arbitrary shape and automatically determines the number of clusters. It is suitable for evolving data streams where clusters may appear, disappear, or change shape over time.

Code Reference

Source Location

river/cluster/dbstream.py:L11-L443

Signature

class DBSTREAM(base.Clusterer):
    def __init__(
        self,
        clustering_threshold: float = 1.0,
        fading_factor: float = 0.01,
        cleanup_interval: float = 2,
        intersection_factor: float = 0.3,
        minimum_weight: float = 1.0
    )

Import

from river import cluster

Key Parameters

Parameter	Default	Description
clustering_threshold	1.0	Radius around each micro-cluster center; a point within this distance joins the micro-cluster.
fading_factor	0.01	Controls the exponential weight decay rate. Must be nonzero.
cleanup_interval	2	Time steps between consecutive cleanup passes that remove weak micro-clusters.
intersection_factor	0.3	Threshold for shared density; determines whether micro-clusters are connected in the density graph.
minimum_weight	1.0	Minimum weight for a micro-cluster to be considered "strong" during reclustering.
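As an illustration of how intersection_factor and minimum_weight interact during reclustering, the sketch below builds a connectivity graph over strong micro-clusters and labels its connected components. It assumes the connectivity measure from the paper (shared density divided by the mean weight of the two micro-clusters); the function `macro_clusters` and its dict-based inputs are hypothetical, and River's own implementation may differ.

```python
def macro_clusters(weights, shared_density, intersection_factor, minimum_weight):
    """Label strong micro-clusters by connected component of the connectivity graph."""
    # Only micro-clusters with enough weight take part in reclustering.
    strong = {i for i, w in weights.items() if w >= minimum_weight}
    adjacency = {i: set() for i in strong}
    for (i, j), s_ij in shared_density.items():
        if i in strong and j in strong:
            connectivity = s_ij / ((weights[i] + weights[j]) / 2)
            if connectivity > intersection_factor:
                adjacency[i].add(j)
                adjacency[j].add(i)

    # Depth-first search assigns one label per connected component.
    labels = {}
    next_label = 0
    for start in sorted(strong):
        if start in labels:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor not in labels:
                    labels[neighbor] = next_label
                    stack.append(neighbor)
        next_label += 1
    return labels


weights = {0: 3.0, 1: 3.0, 2: 2.0, 3: 0.5}
shared_density = {(0, 1): 1.5, (1, 2): 0.2, (2, 3): 1.0}
labels = macro_clusters(weights, shared_density, intersection_factor=0.3, minimum_weight=1.0)
print(labels)  # {0: 0, 1: 0, 2: 1}
```

Micro-cluster 3 is excluded because its weight is below minimum_weight, and micro-clusters 1 and 2 stay separate because their connectivity (0.08) does not exceed intersection_factor.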

Methods

Method	Signature	Description
learn_one	`learn_one(x: dict, w=None) -> None`	Updates the micro-clusters with observation x and runs a cleanup pass when the cleanup interval has elapsed.
predict_one	`predict_one(x: dict, w=None) -> int`	Reclusters if the macro-clusters are out of date, then returns the macro-cluster assignment for x.
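The online update performed on each new observation can be pictured with the simplified sketch below. It is an assumption-laden illustration rather than River's code: it uses tuples instead of feature dicts, and it omits weight fading, center movement, and timestamps. The name `learn_step` is hypothetical.

```python
import math
from itertools import combinations


def learn_step(centers, weights, shared, x, radius):
    """Apply one simplified online update in place and return the affected ids."""
    # Micro-clusters whose center lies within `radius` of the new point.
    hits = [i for i, center in centers.items() if math.dist(center, x) < radius]
    if not hits:
        # No micro-cluster is close enough, so the point seeds a new one.
        new_id = max(centers, default=-1) + 1
        centers[new_id] = tuple(x)
        weights[new_id] = 1.0
        return [new_id]
    for i in hits:
        weights[i] += 1.0
    # A point inside several micro-clusters raises their pairwise shared density.
    for i, j in combinations(sorted(hits), 2):
        shared[(i, j)] = shared.get((i, j), 0.0) + 1.0
    return hits


centers, weights, shared = {}, {}, {}
learn_step(centers, weights, shared, (0.0, 0.0), radius=1.5)  # creates micro-cluster 0
learn_step(centers, weights, shared, (2.0, 0.0), radius=1.5)  # creates micro-cluster 1
touched = learn_step(centers, weights, shared, (1.0, 0.0), radius=1.5)
print(touched, weights, shared)
# [0, 1] {0: 2.0, 1: 2.0} {(0, 1): 1.0}
```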

Key Attributes

Attribute	Type	Description
n_clusters	`int`	Number of macro-clusters generated after reclustering.
clusters	`dict[int, DBSTREAMMicroCluster]`	Final macro-clusters (micro-clusters merged under the same label).
centers	`dict`	Centers of the final macro-clusters.
micro_clusters	`dict[int, DBSTREAMMicroCluster]`	Current set of micro-clusters maintained by the online phase.

I/O Contract

Inputs

Parameter	Type	Description
x	`dict`	A dictionary mapping feature names to numeric values representing one observation.
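When observations arrive as plain sequences rather than dictionaries, they need to be converted first. The helper below is a small sketch of one way to do that; `row_to_features` is a hypothetical name, not a River function. With no feature names it keys values by column index, which matches the `{0: ..., 1: ...}` dictionaries used elsewhere on this page.

```python
def row_to_features(row, feature_names=None):
    """Turn a sequence of numbers into a feature dict suitable for learn_one."""
    if feature_names is None:
        # Key each value by its column index.
        return dict(enumerate(row))
    if len(feature_names) != len(row):
        raise ValueError("feature_names and row must have the same length")
    return dict(zip(feature_names, row))


by_index = row_to_features([1, 0.5])
by_name = row_to_features([1, 0.5], feature_names=["width", "height"])
print(by_index)  # {0: 1, 1: 0.5}
print(by_name)   # {'width': 1, 'height': 0.5}
```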

Outputs

Output	Type	Description
predict_one return	`int`	The macro-cluster index assigned to the observation.

Usage Examples

from river import cluster
from river import stream

X = [
    [1, 0.5], [1, 0.625], [1, 0.75], [1, 1.125], [1, 1.5], [1, 1.75],
    [4, 1.5], [4, 2.25], [4, 2.5], [4, 3], [4, 3.25], [4, 3.5]
]

dbstream = cluster.DBSTREAM(
    clustering_threshold=1.5,
    fading_factor=0.05,
    cleanup_interval=4,
    intersection_factor=0.5,
    minimum_weight=1
)

for x, _ in stream.iter_array(X):
    dbstream.learn_one(x)

dbstream.predict_one({0: 1, 1: 2})
# 0

dbstream.predict_one({0: 5, 1: 2})
# 1

dbstream.n_clusters
# 2

Related Pages

Principle:Online_ml_River_DBSTREAM_Clustering
