Principle: Online ML with River's Half-Space Trees for Anomaly Detection
| Knowledge Sources | River Docs; Tan et al., "Fast Anomaly Detection for Streaming Data" |
|---|---|
| Domains | Online Machine Learning, Anomaly Detection, Ensemble Methods |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Ensemble-based anomaly detection algorithm that uses randomly partitioned half-space trees to score observations based on mass estimation, designed specifically for streaming data.
Description
Half-Space Trees (HST) are an online variant of isolation forests, designed for fast anomaly detection on streaming data. The core idea is that anomalous points tend to be isolated in low-density regions of the feature space, and this isolation can be detected by measuring the "mass" (number of observations) that falls into randomly defined partitions.
The algorithm builds an ensemble of random binary trees. Each tree partitions the feature space using random axis-aligned splits. For each tree, every internal node defines a random split on a randomly chosen feature at a randomly chosen value within that feature's range. Each leaf and internal node maintains a mass counter that tracks how many observations have passed through it.
The algorithm uses a dual-window mechanism:
- Working window (`l_mass`): counts the observations seen in the current window.
- Reference window (`r_mass`): stores the counts accumulated over the previous window.
After window_size observations, the working window is "pivoted" to become the reference window, and the working window counters are reset to zero. Anomaly scoring uses the reference window masses.
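The pivot logic can be illustrated with a single pair of counters (a toy sketch, not River's implementation; `window_size=3` and the stream length are arbitrary choices):

```python
# Minimal sketch of the dual-window mechanism for one mass counter.
window_size = 3
l_mass = 0   # working window: counts observations in the current window
r_mass = 0   # reference window: counts from the previous, completed window
counter = 0

for _ in range(7):  # stream of 7 observations
    l_mass += 1
    counter += 1
    if counter == window_size:
        # pivot: working counts become the reference, working resets to zero
        r_mass, l_mass = l_mass, 0
        counter = 0

print(r_mass, l_mass)  # prints: 3 1  (two pivots done, one observation into the third window)
```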
The anomaly score for an observation is computed by traversing each tree from root to the node where the reference mass falls below a threshold. Observations that land in regions with low reference mass (shallow traversal) receive high anomaly scores, indicating they are in sparse regions. The final score is normalized to [0, 1] where high scores indicate anomalies.
Half-Space Trees work well when anomalies are spread out in the feature space. They do not perform well when anomalies are packed together in clusters within a single window.
Usage
Use Half-Space Trees when:
- You need online, one-pass anomaly detection on a data stream
- Anomalies are expected to be isolated points in sparse regions
- Features are bounded (or can be normalized to [0, 1] via MinMaxScaler)
- You need a lightweight ensemble method with tunable complexity (number of trees, tree height)
- You want anomaly scores normalized to [0, 1]
Theoretical Basis
Reference: Tan, S.C., Ting, K.M. and Liu, T.F., 2011. "Fast Anomaly Detection for Streaming Data." Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI).
Tree Construction:
Each tree is a complete binary tree of a specified height h. Each internal node stores:
- A randomly chosen feature dimension
- A randomly chosen split value within the feature's range
- Two mass counters: `r_mass` (reference) and `l_mass` (working)
```
BUILD_TREE(limits, height, rng):
    if height == 0:
        return Leaf(r_mass=0, l_mass=0)
    feature = rng.choice(features)
    split = rng.uniform(limits[feature].min, limits[feature].max)
    # each child covers one half-space of the chosen feature
    left  = BUILD_TREE(limits with limits[feature].max = split, height-1, rng)
    right = BUILD_TREE(limits with limits[feature].min = split, height-1, rng)
    return Node(feature, split, left, right, r_mass=0, l_mass=0)
```
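The recursive construction translates directly to Python. Below is a self-contained sketch with hypothetical `Node`/`Leaf` classes (River's internals differ; limits are expressed as `(min, max)` tuples for simplicity):

```python
import random
from dataclasses import dataclass

@dataclass
class Leaf:  # hypothetical node type for illustration
    r_mass: int = 0
    l_mass: int = 0

@dataclass
class Node:  # hypothetical node type for illustration
    feature: str
    split: float
    left: object
    right: object
    r_mass: int = 0
    l_mass: int = 0

def build_tree(limits, height, rng):
    """Build a complete binary half-space tree of the given height."""
    if height == 0:
        return Leaf()
    feature = rng.choice(sorted(limits))
    lo, hi = limits[feature]
    split = rng.uniform(lo, hi)
    # Each child works on one half-space of the chosen feature.
    left = build_tree({**limits, feature: (lo, split)}, height - 1, rng)
    right = build_tree({**limits, feature: (split, hi)}, height - 1, rng)
    return Node(feature, split, left, right)

rng = random.Random(42)
tree = build_tree({"x": (0.0, 1.0), "y": (0.0, 1.0)}, height=3, rng=rng)
```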
Learning (updating mass counters):
```
LEARN_ONE(x):
    for each tree t in ensemble:
        for each node on the path from root to leaf for x:
            node.l_mass += 1
    counter += 1
    if counter == window_size:
        PIVOT_WINDOWS()
        counter = 0

PIVOT_WINDOWS():
    for each tree t:
        for each node in t:
            node.r_mass = node.l_mass
            node.l_mass = 0
```
Scoring:
```
SCORE_ONE(x):
    if in first window:
        return 0  # no reference data yet
    score = 0
    for each tree t:
        for depth, node in walk(t, x):
            score += node.r_mass * 2^depth
            if node.r_mass < 0.1 * window_size:
                break  # stop at sparse region
    score = score / max_possible_score
    return 1 - score  # invert: high = anomalous
```
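Putting the pieces together, the build/learn/pivot/score cycle can be sketched as one small class. This is an illustrative toy under the assumptions above (dict-based nodes, `(min, max)` limit tuples), not River's `HalfSpaceTrees`:

```python
import random

class ToyHST:
    """Toy half-space-trees detector (illustration only, not River's code)."""

    def __init__(self, limits, n_trees=5, height=4, window_size=50, seed=0):
        rng = random.Random(seed)
        self.window_size = window_size
        self.counter = 0
        self.first_window = True  # no reference masses until the first pivot
        self.trees = [self._build(limits, height, rng) for _ in range(n_trees)]
        # Normalization constant: n_trees * window_size * (2^(h+1) - 1).
        self.max_score = n_trees * window_size * (2 ** (height + 1) - 1)

    def _build(self, limits, height, rng):
        node = {"r": 0, "l": 0}  # reference and working mass counters
        if height > 0:
            f = rng.choice(sorted(limits))
            lo, hi = limits[f]
            node["feature"], node["split"] = f, rng.uniform(lo, hi)
            node["left"] = self._build({**limits, f: (lo, node["split"])}, height - 1, rng)
            node["right"] = self._build({**limits, f: (node["split"], hi)}, height - 1, rng)
        return node

    def _walk(self, node, x):
        """Yield the nodes on x's root-to-leaf path."""
        while True:
            yield node
            if "feature" not in node:  # reached a leaf
                return
            node = node["left"] if x[node["feature"]] < node["split"] else node["right"]

    def learn_one(self, x):
        for tree in self.trees:
            for node in self._walk(tree, x):
                node["l"] += 1  # update working masses along the path
        self.counter += 1
        if self.counter == self.window_size:
            for tree in self.trees:
                self._pivot(tree)  # working window becomes the reference
            self.counter = 0
            self.first_window = False

    def _pivot(self, node):
        node["r"], node["l"] = node["l"], 0
        if "feature" in node:
            self._pivot(node["left"])
            self._pivot(node["right"])

    def score_one(self, x):
        if self.first_window:
            return 0.0  # no reference data yet
        score = 0
        for tree in self.trees:
            for depth, node in enumerate(self._walk(tree, x)):
                score += node["r"] * 2 ** depth
                if node["r"] < 0.1 * self.window_size:
                    break  # sparse region: stop descending
        return 1.0 - score / self.max_score  # high = anomalous

# Feed a dense stream around x = 0.5, then score a typical vs. an extreme point.
hst = ToyHST({"x": (0.0, 1.0)}, n_trees=10, height=4, window_size=50, seed=1)
data = random.Random(7)
for _ in range(100):
    hst.learn_one({"x": 0.4 + 0.2 * data.random()})
normal = hst.score_one({"x": 0.5})
outlier = hst.score_one({"x": 0.99})
```

The extreme point lands in regions whose reference masses fall below the size-limit threshold almost immediately, so its traversal accumulates little mass and it scores higher than the typical point.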
Key properties:
- Score range: [0, 1], where 1 = highly anomalous, 0 = normal
- Size limit threshold: `0.1 * window_size` (magic constant from the original paper)
- Maximum possible score: `n_trees * window_size * (2^(h+1) - 1)`
- Time complexity: O(n_trees * height) per learn/score call (one root-to-leaf path per tree); the pivot at each window boundary touches every node, costing O(n_trees * 2^height)
- Space complexity: O(n_trees * 2^height) nodes total
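As a quick sanity check on the space bound (example hyperparameter values only; a complete binary tree of height h has 2^(h+1) - 1 nodes):

```python
n_trees, height = 10, 8  # example hyperparameters

nodes_per_tree = 2 ** (height + 1) - 1  # complete binary tree of height 8
total_nodes = n_trees * nodes_per_tree  # O(n_trees * 2^height) space, allocated up front

print(nodes_per_tree, total_nodes)  # prints: 511 5110
```

The footprint is fixed at construction time and independent of stream length, which is what makes the method suitable for unbounded streams.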