Implementation:Online ml River Drift KSWIN

From Leeroopedia


Knowledge Sources: River, River Docs, KSWIN - Kolmogorov-Smirnov Windowing Method for Drift Detection
Domains: Online Machine Learning, Concept Drift Detection, Non-Parametric Testing
Last Updated: 2026-02-08 16:00 GMT

Overview

Concrete tool for detecting concept drift by applying the two-sample Kolmogorov-Smirnov (KS) test to a fixed-size sliding window. Because the test is non-parametric, it can respond to changes in the location, scale, or shape of a univariate data stream without assuming a particular distribution.

Description

The drift.KSWIN class implements the Kolmogorov-Smirnov Windowing (KSWIN) method for concept drift detection. It maintains a fixed-size sliding window of the most recent window_size values, implemented as a collections.deque. The last stat_size elements form the recent sub-window, and a random sample of stat_size elements drawn from the remaining, older elements forms the reference sub-window. The two-sample KS test (scipy.stats.ks_2samp) is applied to these two sub-windows. Drift is flagged when the p-value falls below alpha and the KS statistic exceeds 0.1. When drift is detected, the window is reset to contain only the recent sub-window.
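
The window handling described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not River's source: split_window and flags_drift are hypothetical helper names, and the KS statistic and p-value are assumed to come from a two-sample test such as scipy.stats.ks_2samp, which is left out here so the sketch has no external dependencies.

```python
import collections
import itertools
import random


def split_window(window, stat_size, rng):
    """Split a full sliding window into (reference, recent) sub-windows.

    Hypothetical helper for illustration; not part of River's API.
    """
    n = len(window)
    # The last stat_size elements form the recent sub-window.
    recent = list(itertools.islice(window, n - stat_size, n))
    # A random sample of stat_size older elements forms the reference sub-window.
    older_positions = rng.sample(range(n - stat_size), stat_size)
    reference = [window[i] for i in older_positions]
    return reference, recent


def flags_drift(statistic, p_value, alpha):
    """Decision rule from the description: p-value below alpha and KS statistic above 0.1."""
    return p_value < alpha and statistic > 0.1


# Stand-in for a full window of window_size=100 values.
window = collections.deque(range(100), maxlen=100)
reference, recent = split_window(window, stat_size=30, rng=random.Random(42))

# After a detected drift, only the recent sub-window is kept.
window_after_drift = collections.deque(recent, maxlen=100)
```

Both conditions in flags_drift must hold, so a small p-value alone is not enough to flag drift when the KS statistic stays at or below 0.1.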

Usage

Import drift.KSWIN when you need a non-parametric drift detector that can identify any type of distributional change (not just mean shifts). It requires scipy as an external dependency.

Code Reference

Source Location

river/drift/kswin.py:L14-L162

Signature

class KSWIN(DriftDetector):
    def __init__(
        self,
        alpha: float = 0.005,
        window_size: int = 100,
        stat_size: int = 30,
        seed: int | None = None,
        window: typing.Iterable | None = None,
    )

Import

from river import drift

Key Parameters

Parameter    Type              Default  Description
alpha        float             0.005    Significance level for the KS test. Must be between 0 and 1. Should be set below 0.01 for best results.
window_size  int               100      Total size of the sliding window.
stat_size    int               30       Size of the recent sub-window used for the KS test. Must be smaller than window_size.
seed         int or None       None     Random seed that makes the sampling of the reference sub-window reproducible.
window       Iterable or None  None     Pre-collected data to initialize the window (avoids cold start).
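
The constraints listed for these parameters can be written as simple checks. The helper below is a hypothetical sketch (check_kswin_params is not a River function) that mirrors only what the parameter descriptions state, plus the assumption that sizes must be positive; River's own validation may differ in detail.

```python
def check_kswin_params(alpha=0.005, window_size=100, stat_size=30):
    """Raise ValueError if the arguments violate the documented constraints.

    Hypothetical helper; it mirrors the parameter descriptions, not River's source.
    """
    # alpha is a significance level, so it must lie between 0 and 1.
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    # Assumption: a window needs at least one slot.
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    # The recent sub-window must be smaller than the whole window.
    if not 0 < stat_size < window_size:
        raise ValueError("stat_size must be positive and smaller than window_size")


check_kswin_params()  # the defaults satisfy every constraint
```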

External Dependencies

  • scipy.stats -- used for the ks_2samp two-sample Kolmogorov-Smirnov test

I/O Contract

Inputs

Method  Parameter  Type   Description
update  x          float  A single numeric value to add to the sliding window

Outputs

Property/Method  Return Type  Description
drift_detected   bool         True if drift was detected on the most recent update call
p_value          float        The p-value from the most recent KS test (0 if insufficient data)
n                int          Total number of samples processed

Usage Examples

Basic Drift Detection

import random
from river import drift

rng = random.Random(12345)
kswin = drift.KSWIN(alpha=0.0001, seed=42)

# Simulate a data stream with a distribution change at index 1000
data_stream = rng.choices([0, 1], k=1000) + rng.choices(range(4, 8), k=1000)

for i, val in enumerate(data_stream):
    kswin.update(val)
    if kswin.drift_detected:
        print(f"Change detected at index {i}, input value: {val}")
# Change detected at index 1016, input value: 6

Detecting Variance Changes

import random
from river import drift

rng = random.Random(42)
kswin = drift.KSWIN(alpha=0.001, window_size=200, stat_size=50, seed=42)

# Same mean but different variance
data_stream = [rng.gauss(5, 1) for _ in range(1000)] + [rng.gauss(5, 5) for _ in range(1000)]

for i, val in enumerate(data_stream):
    kswin.update(val)
    if kswin.drift_detected:
        print(f"Variance change detected at index {i}")
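
To see why a KS-based detector can react to a change that leaves the mean untouched, the sketch below compares the two halves of a similar stream directly. It uses only the standard library: ks_statistic is a hypothetical helper that computes the two-sample KS statistic as the largest gap between the two empirical CDFs, standing in for the statistic that scipy.stats.ks_2samp would return.

```python
import bisect
import random
import statistics


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of a and b (hypothetical helper)."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    # The gap can only change at observed values, so checking those is enough.
    return max(
        abs(
            bisect.bisect_right(a_sorted, v) / len(a_sorted)
            - bisect.bisect_right(b_sorted, v) / len(b_sorted)
        )
        for v in a_sorted + b_sorted
    )


rng = random.Random(42)
before = [rng.gauss(5, 1) for _ in range(1000)]  # mean 5, standard deviation 1
after = [rng.gauss(5, 5) for _ in range(1000)]   # mean 5, standard deviation 5

mean_gap = abs(statistics.fmean(before) - statistics.fmean(after))
ks_gap = ks_statistic(before, after)

print(f"difference of means: {mean_gap:.3f}")
print(f"KS statistic:        {ks_gap:.3f}")
```

With samples this large, the two means should stay close together while the KS statistic comes out well above the 0.1 threshold, because the wider distribution puts more of its mass in the tails.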

Pre-Initialized Window

from river import drift

# Start with pre-collected data to avoid cold start
initial_data = [0.5, 0.3, 0.7, 0.4, 0.6] * 20
kswin = drift.KSWIN(alpha=0.005, window_size=100, stat_size=30, window=initial_data)
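
Since the window is described as a fixed-size collections.deque, the effect of pre-collected data can be illustrated with a bounded deque on its own. This is a sketch of standard deque behavior; that the detector stores the window argument in exactly this way is an assumption.

```python
import collections

window_size = 100
initial_data = [0.5, 0.3, 0.7, 0.4, 0.6] * 20  # exactly window_size values

# Assumption: the detector keeps `window` in a deque bounded by window_size.
window = collections.deque(initial_data, maxlen=window_size)
window_is_full = len(window) == window_size  # already full, so no warm-up is needed

# A bounded deque drops the oldest items when it is given too many.
oversized = collections.deque(range(150), maxlen=window_size)
oldest_kept = oversized[0]  # 50: values 0 through 49 were discarded
```

Supplying fewer than window_size values would leave the deque partially filled, so some warm-up would still be needed before the window is full.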
