Implementation:Online ml River Tree iSOUPTreeRegressor
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Decision_Trees, Multi_Target_Regression |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Incremental Structured Output Prediction Tree (iSOUP-Tree) is a multi-target regression tree that predicts multiple continuous outputs simultaneously. It extends the Hoeffding Tree Regressor to handle multiple correlated target variables by using intra-cluster variance reduction as its split criterion.
Description
iSOUP-Tree addresses multi-target regression by treating the target space as a cluster and minimizing the intra-cluster variance when selecting splits. Instead of treating each target independently, it considers the joint variance across all targets, allowing the tree to capture correlations between outputs.
Key features:
- Simultaneous prediction of multiple continuous targets
- Intra-cluster variance reduction split criterion
- Three leaf prediction strategies adapted for multi-target scenarios
- Can use different regression models for each target
- Efficient incremental learning for structured outputs
The split criterion minimizes the sum of variances across all targets after splitting, weighted by the number of samples in each branch. This encourages splits that create homogeneous regions in the multi-dimensional target space.
Usage
import numbers
from river import compose
from river import datasets
from river import evaluate
from river import linear_model
from river import metrics
from river import preprocessing
from river import tree
dataset = datasets.SolarFlare()
num = compose.SelectType(numbers.Number) | preprocessing.MinMaxScaler()
cat = compose.SelectType(str) | preprocessing.OneHotEncoder()
model = tree.iSOUPTreeRegressor(
    grace_period=100,
    leaf_prediction='model',
    leaf_model={
        'c-class-flares': linear_model.LinearRegression(l2=0.02),
        'm-class-flares': linear_model.PARegressor(),
        'x-class-flares': linear_model.LinearRegression(l2=0.1)
    }
)
pipeline = (num + cat) | model
metric = metrics.multioutput.MicroAverage(metrics.MAE())
evaluate.progressive_val_score(dataset, pipeline, metric)
# MicroAverage(MAE): 0.426177
Code Reference
Source Location:
river/tree/isoup_tree_regressor.py
Signature:
class iSOUPTreeRegressor(tree.HoeffdingTreeRegressor, base.MultiTargetRegressor):
    def __init__(
        self,
        grace_period: int = 200,
        max_depth: int | None = None,
        delta: float = 1e-7,
        tau: float = 0.05,
        leaf_prediction: str = "adaptive",
        leaf_model: base.Regressor | dict | None = None,
        model_selector_decay: float = 0.95,
        nominal_attributes: list | None = None,
        splitter: Splitter | None = None,
        min_samples_split: int = 5,
        binary_split: bool = False,
        max_size: float = 500.0,
        memory_estimate_period: int = 1000000,
        stop_mem_management: bool = False,
        remove_poor_attrs: bool = False,
        merit_preprune: bool = True,
    )
Import:
from river.tree import iSOUPTreeRegressor
I/O Contract
Input:
- x (dict): Feature dictionary with attribute names as keys
- y (dict): Dictionary mapping target names to values (e.g., {'target1': 2.5, 'target2': 1.3})
- w (float, optional): Sample weight (default: 1.0)
Output:
- predict_one(x): Dictionary mapping target names to predicted values
Key Parameters
- grace_period (int): Number of instances between split attempts
- leaf_prediction (str): Prediction strategy ('mean', 'model', 'adaptive')
- leaf_model (Regressor | dict): Models for targets. Can be:
  * Single regressor (replicated to all targets)
  * Dictionary mapping target names to regressors
  * None (uses LinearRegression for all targets)
- model_selector_decay (float): Exponential decay for adaptive strategy
- delta (float): Significance level for Hoeffding bound
- tau (float): Tie-breaking threshold
- splitter (Splitter): Attribute observer (default: TEBSTSplitter)
- min_samples_split (int): Minimum samples per branch
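To illustrate how delta and tau interact, the following is a minimal sketch of a Hoeffding-bound split test. The bound formula is the standard one; the function names and the ratio-based decision rule are assumptions modeled on Hoeffding regression trees in general, not a copy of the library's code.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: float) -> float:
    """Standard Hoeffding bound for n observations of a variable
    whose values span `value_range`, at confidence 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_merit, second_best_merit, n, delta=1e-7, tau=0.05):
    """Hypothetical split test (illustrative, not the library's method).

    Splits when the best candidate is clearly ahead of the runner-up,
    or when the bound has shrunk below tau (tie-breaking).
    """
    if best_merit <= 0.0:
        return False
    eps = hoeffding_bound(1.0, delta, n)  # assumed: merit ratio lies in [0, 1]
    ratio = second_best_merit / best_merit
    return ratio < 1.0 - eps or eps < tau
```

A smaller delta widens the bound, so the leaf must observe more instances before a split is accepted. A larger tau forces a decision earlier when two candidates stay close.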
Implementation Details
Key Methods:
- learn_one(x, y, w=1.0): Train on one multi-target instance
- predict_one(x): Predict all targets
- _new_leaf(initial_stats, parent): Create multi-target leaf
- _new_split_criterion(): Create IntraClusterVarianceReductionSplitCriterion
Node Types:
- LeafMeanMultiTarget: Predicts mean for each target
- LeafModelMultiTarget: Uses separate model for each target
- LeafAdaptiveMultiTarget: Adaptively chooses between mean and model per target
Split Criterion:
IntraClusterVarianceReductionSplitCriterion computes:
- Pre-split: Sum of variances across all targets
- Post-split: Weighted sum of variances in each branch
- Merit: (pre-split variance) - (post-split variance)
For k targets: merit = Σᵢ₌₁ᵏ var(yᵢ) - Σⱼ [nⱼ/n * Σᵢ₌₁ᵏ var(yᵢ|branch j)]
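The formula above can be checked with a small worked example. This is a sketch in plain Python, assuming population variance and equal sample weights; the helper names are hypothetical and the library's own criterion may differ in normalization and variance estimator.

```python
from statistics import pvariance


def summed_variance(rows, targets):
    """Sum of per-target population variances over a list of target dicts."""
    if not rows:
        return 0.0
    return sum(pvariance([row[t] for row in rows]) for t in targets)


def icvr_merit(parent, branches, targets):
    """Pre-split summed variance minus the weighted post-split summed variance."""
    n = len(parent)
    post = sum(len(b) / n * summed_variance(b, targets) for b in branches)
    return summed_variance(parent, targets) - post


rows = [
    {"a": 1.0, "b": 10.0},
    {"a": 1.0, "b": 10.0},
    {"a": 5.0, "b": 20.0},
    {"a": 5.0, "b": 20.0},
]
targets = ["a", "b"]

# A split that separates the two groups removes all variance.
print(icvr_merit(rows, [rows[:2], rows[2:]], targets))  # 29.0
# A split that mixes the groups removes none.
print(icvr_merit(rows, [[rows[0], rows[2]], [rows[1], rows[3]]], targets))  # 0.0
```

Both targets move together in this data, so one split improves both at once, which is the behavior the joint criterion rewards.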
Multi-Target Leaf Models
Dictionary Specification:
When leaf_model is a dictionary:
- Keys: target variable names
- Values: Regressor instances for each target
- If a target has no entry in the dictionary, a copy of the first model is used for it
Example:
leaf_model = {
'temperature': linear_model.LinearRegression(l2=0.1),
'humidity': linear_model.PARegressor(),
'pressure': linear_model.LinearRegression(l2=0.05)
}
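The fallback rule for missing targets can be sketched as follows. StubRegressor and resolve_leaf_models are hypothetical names used only for illustration, and "first model" is assumed here to mean the first entry in the dictionary's insertion order.

```python
from copy import deepcopy


class StubRegressor:
    """Hypothetical stand-in for a per-target regressor."""

    def __init__(self, name):
        self.name = name


def resolve_leaf_models(leaf_model, targets):
    """Return one independent model per target.

    Targets without an entry receive a copy of the first model
    in the dictionary (assumption: insertion order).
    """
    first = next(iter(leaf_model.values()))
    return {t: deepcopy(leaf_model.get(t, first)) for t in targets}


leaf_model = {
    "temperature": StubRegressor("linear"),
    "humidity": StubRegressor("passive-aggressive"),
}
models = resolve_leaf_models(leaf_model, ["temperature", "humidity", "pressure"])
print(models["pressure"].name)  # linear
```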
Model Inheritance:
When a node splits, child leaves inherit the following, which keeps model adaptation continuous across the split:
- Deep copies of parent's models
- Adaptive leaf statistics (fmse_mean, fmse_model)
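As a rough illustration of the inheritance step, the sketch below copies a parent leaf's state into a child. The Leaf dataclass and spawn_child function are hypothetical simplifications, with plain lists of weights standing in for real regressors.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Leaf:
    """Hypothetical, simplified leaf state."""

    models: dict = field(default_factory=dict)      # target -> model
    fmse_mean: dict = field(default_factory=dict)   # target -> smoothed error of the mean
    fmse_model: dict = field(default_factory=dict)  # target -> smoothed error of the model


def spawn_child(parent: Leaf) -> Leaf:
    """Create a child that starts from the parent's state without sharing it."""
    return Leaf(
        models=deepcopy(parent.models),
        fmse_mean=dict(parent.fmse_mean),
        fmse_model=dict(parent.fmse_model),
    )


parent = Leaf(models={"a": [0.5, 1.0]}, fmse_mean={"a": 2.0}, fmse_model={"a": 1.5})
child = spawn_child(parent)
child.models["a"][0] = 9.9  # later updates to the child leave the parent untouched
```

Copying rather than sharing matters because each child continues training on its own region of the input space.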
Adaptive Strategy
For each target independently:
1. Maintain exponentially smoothed squared errors for mean and model predictions
2. Before prediction, compare smoothed errors
3. Use the predictor with lower error for that target
4. Different targets may use different strategies at the same leaf
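The per-target selection can be sketched in plain Python. AdaptiveSelector is a hypothetical class written for illustration. It assumes the smoothed error is updated as decay * previous + squared error, and that ties go to the model.

```python
class AdaptiveSelector:
    """Hypothetical per-target chooser between mean and model predictions."""

    def __init__(self, decay=0.95):
        self.decay = decay
        self.fmse_mean = {}   # target -> faded squared error of the mean
        self.fmse_model = {}  # target -> faded squared error of the model

    def update(self, y, pred_mean, pred_model):
        """Fade the old errors and add the newest squared error per target."""
        for t, truth in y.items():
            self.fmse_mean[t] = (
                self.decay * self.fmse_mean.get(t, 0.0) + (truth - pred_mean[t]) ** 2
            )
            self.fmse_model[t] = (
                self.decay * self.fmse_model.get(t, 0.0) + (truth - pred_model[t]) ** 2
            )

    def predict(self, pred_mean, pred_model):
        """Pick, for each target, the predictor with the lower faded error."""
        out = {}
        for t in pred_mean:
            use_mean = self.fmse_mean.get(t, 0.0) < self.fmse_model.get(t, 0.0)
            out[t] = pred_mean[t] if use_mean else pred_model[t]  # ties: model
        return out


selector = AdaptiveSelector(decay=0.95)
selector.update(
    y={"a": 1.0, "b": 1.0},
    pred_mean={"a": 1.0, "b": 3.0},
    pred_model={"a": 2.0, "b": 1.0},
)
print(selector.predict({"a": 1.0, "b": 3.0}, {"a": 2.0, "b": 1.0}))
# {'a': 1.0, 'b': 1.0}: the mean wins for "a", the model wins for "b"
```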
Target Discovery
The tree dynamically discovers targets:
- Maintains a set of observed target names
- Updates the set each time a new target appears
- Handles scenarios where:
  * Not all samples contain all targets
  * New targets emerge over time
  * Target names are strings or other hashable types
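A minimal sketch of this bookkeeping, assuming the observed targets are kept in a Python set that grows as samples arrive. TargetTracker is a hypothetical name, not part of the library.

```python
class TargetTracker:
    """Hypothetical tracker for target names seen so far."""

    def __init__(self):
        self.targets = set()

    def observe(self, y):
        """Record the targets in one sample and return the newly seen ones."""
        new = set(y) - self.targets
        self.targets |= new
        return new


tracker = TargetTracker()
tracker.observe({"temperature": 21.5})                    # only one target present
tracker.observe({"temperature": 22.0, "humidity": 0.4})   # a new target appears
tracker.observe({("sensor", 3): 1.2})                     # any hashable key works
print(len(tracker.targets))  # 3
```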
Comparison with Standard HTR
| Feature | iSOUP-Tree | Hoeffding Tree Regressor |
|---|---|---|
| Output type | Multiple targets | Single target |
| Split criterion | Intra-cluster variance reduction | Variance reduction |
| Leaf models | One per target | Single model |
| Target correlation | Captured | Ignored |
| Prediction | Dict of values | Single value |
Related Pages
- Online_ml_River_Tree_HoeffdingTreeRegressor
- Online_ml_River_Tree_Base_Nodes
- Online_ml_River_Metrics_Multioutput
- Online_ml_River_Tree_Splitter
References
Aljaž Osojnik, Panče Panov, and Sašo Džeroski. "Tree-based methods for online multi-target regression." Journal of Intelligent Information Systems 50.2 (2018): 315-339.