Implementation:Online ml River FeatureSelection VarianceThreshold

From Leeroopedia
Revision as of 16:08, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Online_ml_River_FeatureSelection_VarianceThreshold.md)


Knowledge Sources
Domains: Online_Learning, Feature_Selection, Unsupervised_Learning
Last Updated: 2026-02-08 16:00 GMT

Overview

Removes low-variance features based on incrementally computed running variance statistics.

Description

VarianceThreshold performs unsupervised feature selection by dropping features whose variance does not exceed a specified threshold. With the default threshold of 0, only features that have been exactly constant so far are dropped. It maintains a running variance for each feature using the stats.Var class, so no past samples need to be stored. The min_samples parameter prevents premature filtering: a feature is always kept until at least that many values have been observed for it. Each feature is evaluated independently, and the target variable is never consulted.
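The mechanism described above can be sketched in plain Python. This is a minimal illustration, not River's source code: the class names RunningVar and SketchVarianceThreshold are hypothetical, and the use of Welford's algorithm with a sample variance (divisor n - 1) is an assumption about how a running variance such as stats.Var might behave.

```python
from collections import defaultdict


class RunningVar:
    """Hypothetical stand-in for a running variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0        # number of values seen
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def get(self):
        # Sample variance (assumed divisor n - 1); 0 until two values are seen.
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


class SketchVarianceThreshold:
    """Illustrative re-implementation of the filtering rule described above."""

    def __init__(self, threshold=0, min_samples=2):
        self.threshold = threshold
        self.min_samples = min_samples
        self.variances = defaultdict(RunningVar)

    def learn_one(self, x):
        # Update one running variance per feature.
        for name, value in x.items():
            self.variances[name].update(value)

    def keep(self, name):
        var = self.variances.get(name)
        # Unseen features, or features with too few samples, are always kept.
        if var is None or var.n < self.min_samples:
            return True
        return var.get() > self.threshold

    def transform_one(self, x):
        return {name: value for name, value in x.items() if self.keep(name)}


rows = [
    {0: 0, 1: 2, 2: 0, 3: 3},
    {0: 0, 1: 1, 2: 4, 3: 3},
    {0: 0, 1: 1, 2: 1, 3: 3},
]
sketch = SketchVarianceThreshold()
outputs = []
for row in rows:
    sketch.learn_one(row)
    outputs.append(sketch.transform_one(row))
```

Because only a count, a mean, and a sum of squared deviations are stored per feature, memory use is constant in the number of samples, which is what makes the approach suitable for streams.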

Usage

Use this as a simple first-pass filter to remove constant or near-constant features, which carry little information. It is particularly useful as a preprocessing step that reduces dimensionality before supervised selection methods are applied. The threshold parameter can be set from domain knowledge or by experimentation. In streaming data, the filter is effective at removing features from faulty measurements that are stuck at a constant value, as well as features whose variation is negligible.
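One way to pick a threshold by experimentation is to inspect per-feature variances on a small warm-up sample before configuring the selector. The sketch below uses only the standard library; the feature names, the values, and the candidate threshold of 0.01 are made up for illustration and are not taken from River.

```python
import statistics

# Hypothetical warm-up sample from a stream; "stuck_sensor" never changes.
warmup = [
    {"temp": 20.1, "stuck_sensor": 5.0, "pressure": 101.2},
    {"temp": 22.4, "stuck_sensor": 5.0, "pressure": 101.3},
    {"temp": 19.8, "stuck_sensor": 5.0, "pressure": 101.2},
    {"temp": 21.7, "stuck_sensor": 5.0, "pressure": 101.3},
]


def warmup_variances(rows):
    """Sample variance of each feature over the warm-up rows."""
    names = rows[0].keys()
    return {name: statistics.variance([row[name] for row in rows]) for name in names}


variances = warmup_variances(warmup)

# List features from least to most variable to see where a cut-off could go.
for name, var in sorted(variances.items(), key=lambda item: item[1]):
    print(f"{name}: {var:.4f}")

# Features that a selector with this (illustrative) threshold would keep.
candidate_threshold = 0.01
kept = [name for name, var in variances.items() if var > candidate_threshold]
```

With a threshold of 0 only the stuck sensor would be dropped; raising it to 0.01 in this made-up sample also drops the nearly constant pressure reading, which shows how the threshold trades off strictness against the risk of discarding weak but real signals.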

Code Reference

Source Location

Signature

class VarianceThreshold(base.Transformer):
    def __init__(self, threshold=0, min_samples=2)

Import

from river import feature_selection

I/O Contract

Input: Dict[str, float] - All features
Output: Dict[str, float] - Features above variance threshold

Usage Examples

from river import feature_selection
from river import stream

X = [
    [0, 2, 0, 3],
    [0, 1, 4, 3],
    [0, 1, 1, 3]
]

# Defaults: threshold=0, min_samples=2
selector = feature_selection.VarianceThreshold()

for x, _ in stream.iter_array(X):
    selector.learn_one(x)                # update the running variances
    print(selector.transform_one(x))     # drop features that fail the threshold
# {0: 0, 1: 2, 2: 0, 3: 3}  # Fewer than min_samples seen: all features kept
# {1: 1, 2: 4}               # Features 0 and 3 removed (variance is 0 so far)
# {1: 1, 2: 1}               # Same features kept

Related Pages
