Implementation:Online ml River Cluster ODAC

Knowledge Sources	Online_ml_River
Domains	Online_Learning, Clustering, Hierarchical_Clustering, Time_Series
Last Updated	2026-02-08 16:00 GMT

Overview

Online Divisive-Agglomerative Clustering (ODAC) continuously maintains a hierarchical cluster structure from evolving time series data streams.

Description

ODAC is a hierarchical clustering algorithm for streaming time series data. It measures the dissimilarity between two series a and b with the Rooted Normalized One-Minus-Correlation (rnomc), a distance derived from Pearson correlation: rnomc(a, b) = sqrt((1 - corr(a, b)) / 2). A correlation of 1 maps to a distance of 0, and a correlation of -1 maps to a distance of 1. The algorithm monitors how cluster diameters evolve and splits or merges clusters based on statistical tests that use the Hoeffding bound.
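As a minimal illustration of the rnomc formula above, the following sketch computes the distance in batch form with plain Python. The helper names `pearson` and `rnomc` are hypothetical and are not part of River's API; River itself updates its statistics incrementally rather than over stored lists.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences.

    Illustrative helper only; raises ZeroDivisionError for a constant series.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rnomc(a, b):
    """rnomc(a, b) = sqrt((1 - corr(a, b)) / 2), a distance in [0, 1]."""
    corr = pearson(a, b)
    # Clamp to [-1, 1] so floating-point error cannot make the radicand negative.
    corr = max(-1.0, min(1.0, corr))
    return math.sqrt((1.0 - corr) / 2.0)


up = [1.0, 2.0, 3.0, 4.0]
same_trend = [2.0, 4.0, 6.0, 8.0]  # perfectly correlated with `up`
opposite = [4.0, 3.0, 2.0, 1.0]    # perfectly anti-correlated with `up`

d_same = rnomc(up, same_trend)
d_opposite = rnomc(up, opposite)
```

Perfectly correlated series end up at distance 0 and perfectly anti-correlated series at distance 1, so series that move together are the ones grouped into the same cluster.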

A cluster's diameter is the largest rnomc distance between any two of its time series. The split operator triggers when the difference between the diameter and the second-largest distance exceeds the Hoeffding bound computed at the configured confidence level. The merge operator checks whether a child cluster's diameter is larger than its parent's diameter, again using the Hoeffding bound to ensure the difference is statistically significant.
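The split and merge tests described above can be sketched as follows. This is a simplified illustration, not River's implementation. It assumes the textbook Hoeffding bound with a value range of 1 (the range of rnomc) and delta = 1 - confidence_level. It also assumes that tau forces a split once the bound drops below it, and that a single bound is used for the merge test. All function names are hypothetical.

```python
import math


def hoeffding_bound(confidence_level, n, value_range=1.0):
    """Textbook Hoeffding bound after n observations (assumed form)."""
    delta = 1.0 - confidence_level
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(d1, d2, confidence_level, n, tau):
    """d1 is the cluster diameter, d2 the second-largest pairwise distance.

    Split when the gap is significant, or when the bound is already
    smaller than tau (assumed tie-breaking rule).
    """
    eps = hoeffding_bound(confidence_level, n)
    return (d1 - d2) > eps or eps < tau


def should_merge(child_diameter, parent_diameter, confidence_level, n):
    """Merge when a child's diameter exceeds its parent's by more than the bound."""
    eps = hoeffding_bound(confidence_level, n)
    return (child_diameter - parent_diameter) > eps


# With the default settings, the bound after 100 observations is roughly 0.107.
eps_100 = hoeffding_bound(confidence_level=0.9, n=100)

clear_gap = should_split(d1=0.9, d2=0.5, confidence_level=0.9, n=100, tau=0.1)
small_gap = should_split(d1=0.55, d2=0.5, confidence_level=0.9, n=100, tau=0.1)
# After 1000 observations the bound falls below tau, which forces the split.
forced = should_split(d1=0.55, d2=0.5, confidence_level=0.9, n=1000, tau=0.1)
```

Under these assumptions, a higher confidence_level widens the bound and makes both operators more conservative, while more observations shrink it.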

ODAC tests only leaf clusters for splitting and merging, which keeps the per-observation cost low enough for real-time processing. Whenever a split or merge changes the hierarchy, the structure_changed flag is set to True, allowing users to track structural evolution.

Usage

Use ODAC when you need to discover and maintain hierarchical cluster structures in streaming time series data, especially when the number of clusters is unknown and may change over time due to concept drift. It's particularly useful for monitoring systems where the relationships between time series evolve dynamically.

Code Reference

Source Location

Repository: Online_ml_River
File: river/cluster/odac.py

Signature

class ODAC(base.Clusterer):
    def __init__(self, confidence_level: float = 0.9, n_min: int = 100, tau: float = 0.1):
        ...

Import

from river import cluster
model = cluster.ODAC()

I/O Contract

Input
Parameter	Type	Description
x	dict	One observation per time series at the current time step, keyed by time series (feature) name

Output
Method	Return Type	Description
learn_one(x)	None	Updates the hierarchical cluster structure
render_ascii(n_decimal_places)	str	Returns ASCII representation of tree structure
draw(max_depth, show_clusters_info, n_decimal_places)	graphviz.Digraph	Returns Graphviz visualization

Parameters
Name	Type	Default	Description
confidence_level	float	0.9	Confidence level for Hoeffding bound (between 0 and 1)
n_min	int	100	Minimum observations before checking for splits/merges
tau	float	0.1	Threshold to force splits and break ties (must be > 0)

Properties
Property	Type	Description
structure_changed	bool	True when structure changed via split or merge
n_clusters	int	Total number of clusters in the hierarchy
n_active_clusters	int	Number of active (leaf) clusters
height	int	Height of the hierarchical tree
summary	dict	Dictionary with n_clusters, n_active_clusters, and height

Usage Examples

from river import cluster
from river.datasets import synth

model = cluster.ODAC(confidence_level=0.9, n_min=100, tau=0.1)

dataset = synth.FriedmanDrift(drift_type='gra', position=(150, 200), seed=42)

for i, (x, _) in enumerate(dataset.take(500)):
    model.learn_one(x)
    if model.structure_changed:
        print(f"Structure changed at observation {i + 1}")

# Display the hierarchical structure
print(model.render_ascii())

# Access properties
print(f"Number of clusters: {model.n_clusters}")
print(f"Number of active clusters: {model.n_active_clusters}")
print(f"Tree height: {model.height}")
print(model.summary)

# Visualize with Graphviz (if installed)
# graph = model.draw(max_depth=3, show_clusters_info=['timeseries_names', 'd1', 'd2'])
# graph.render('odac_tree', format='png')
