Implementation:Online ml River Multioutput Chain
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Multi_Label_Classification, Multi_Target_Regression |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Classifier and Regressor Chains arrange models in a sequence where predictions from earlier outputs become features for later outputs, capturing dependencies between targets.
Description
Chain models create one instance of the base model per output (label or target) and arrange these instances in a fixed order. Both training and prediction walk the chain sequentially: the first model predicts its output from the original features, that prediction is added to the feature dictionary, the second model predicts from the augmented features, and so on. Later models can therefore condition on earlier predictions, which lets the chain capture dependencies between outputs. For multi-label classification, the predicted probabilities for each class are added as features. The order can be given explicitly through the order parameter or, if omitted, inferred from the key order of the target dictionary. The implementation handles missing outputs gracefully during training.
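The sequential mechanism described above can be sketched in plain Python. This is a minimal illustration, not River's implementation: `LastValueRegressor` and `ToyRegressorChain` are hypothetical names invented for this example, the toy model simply repeats the last target it saw, and the sketch assumes every target is present in `y` (it does not handle missing outputs).

```python
import copy


class LastValueRegressor:
    """Hypothetical toy online model: predicts the last target it was trained on."""

    def __init__(self):
        self.last = 0.0
        self.seen_features = []  # feature names from the most recent predict call

    def predict_one(self, x):
        self.seen_features = sorted(x)
        return self.last

    def learn_one(self, x, y):
        self.last = y


class ToyRegressorChain:
    """Sketch of a regressor chain (illustrative only, not River's code)."""

    def __init__(self, model, order=None):
        self.model = model
        self.order = list(order) if order else []
        self.models = {}

    def _get(self, target):
        # Lazily create one independent copy of the base model per target.
        if target not in self.models:
            self.models[target] = copy.deepcopy(self.model)
        return self.models[target]

    def learn_one(self, x, y):
        if not self.order:
            self.order = list(y)  # infer the order from the target dict's keys
        x = dict(x)  # copy so the caller's feature dict is not mutated
        for target in self.order:
            model = self._get(target)
            y_pred = model.predict_one(x)  # predict before updating the model
            model.learn_one(x, y[target])
            x[target] = y_pred  # the prediction becomes a feature downstream

    def predict_one(self, x):
        x = dict(x)
        y_pred = {}
        for target in self.order:
            y_pred[target] = self._get(target).predict_one(x)
            x[target] = y_pred[target]
        return y_pred


chain = ToyRegressorChain(LastValueRegressor())
chain.learn_one({"a": 1.0}, {"t1": 10.0, "t2": 20.0})
prediction = chain.predict_one({"a": 2.0})
print(prediction)
print(chain.models["t2"].seen_features)
```

In this sketch the model for `t2` receives both the original feature `a` and the chain's prediction for `t1`, while the model for `t1` only sees `a`.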
Usage
Use Classifier/Regressor Chains when outputs depend on one another in ways that can inform predictions, for example multi-label classification with correlated labels or multi-target regression with related targets. Set the order parameter to control the prediction sequence; placing the more important or easier-to-predict outputs first often works well. For multi-label problems, chains typically outperform independent binary classifiers when label correlations are strong. The overhead relative to independent models is small: the chain trains the same number of models, and each one only sees a few extra features derived from the outputs that precede it.
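To make the effect of the order concrete, the following sketch lists which inputs each position in a chain would see. The helper `chain_inputs` and the target names are hypothetical, and the sketch assumes each earlier output contributes exactly one extra feature (the binary or regression case).

```python
def chain_inputs(feature_names, order):
    """Return, for each target in `order`, the inputs its model would see."""
    inputs = {}
    seen = list(feature_names)
    for target in order:
        inputs[target] = list(seen)
        seen.append(target)  # this target's prediction is visible to later models
    return inputs


features = ["x1", "x2"]
forward = chain_inputs(features, ["easy", "medium", "hard"])
reverse = chain_inputs(features, ["hard", "medium", "easy"])

print(forward["hard"])
print(reverse["hard"])
```

With the forward order, the model for `hard` can use the predictions for `easy` and `medium`; with the reverse order it sees only the raw features, which is why easier outputs are often placed first.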
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/multioutput/chain.py
Signature
class ClassifierChain(
    model: base.Classifier,
    order: list | None = None
)
class RegressorChain(
    model: base.Regressor,
    order: list | None = None
)
class ProbabilisticClassifierChain(
    model: base.Classifier
)
class MonteCarloClassifierChain(
    model: base.Classifier,
    m: int = 10,
    seed: int | None = None
)
Import
from river import multioutput
I/O Contract
| Parameter | Type | Description |
|---|---|---|
| x | dict | Feature dictionary |
| y | dict | Dictionary mapping output names to target values |

| Method | Return Type | Description |
|---|---|---|
| predict_one(x) | dict | Predictions for all outputs |
| predict_proba_one(x) | dict | Per-output class probabilities (classifiers only) |
| learn_one(x, y) | None | Updates each model in the chain in order, passing earlier predictions forward as features |
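The dictionary shapes in the contract can be illustrated without the library. The feature names, label names, and probability values below are made up for illustration, and the nested `{output: {class: probability}}` layout is an assumption about what a per-output probability result looks like; `to_labels` is a hypothetical helper, not part of River.

```python
# Hypothetical feature and target dictionaries for a two-label problem.
x = {"length": 3.2, "width": 1.1}
y = {"spam": True, "urgent": False}

# Assumed shape of a per-output probability result.
y_proba = {
    "spam": {False: 0.3, True: 0.7},
    "urgent": {False: 0.9, True: 0.1},
}


def to_labels(y_proba):
    """Pick the most probable class for each output."""
    return {output: max(dist, key=dist.get) for output, dist in y_proba.items()}


y_pred = to_labels(y_proba)
print(y_pred)
```

The resulting prediction dictionary has the same keys as the target dictionary, which is what lets a multi-output metric compare the two output by output.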
Usage Examples
from river import feature_selection
from river import linear_model
from river import metrics
from river import multioutput
from river import preprocessing
from river import stream
from sklearn import datasets

# Multi-label classification on the OpenML "yeast" dataset
dataset = stream.iter_sklearn_dataset(
    dataset=datasets.fetch_openml('yeast', version=4, parser='auto', as_frame=False),
    shuffle=True,
    seed=42
)

# Pipeline: drop low-variance features, scale, then chain one classifier per label
model = feature_selection.VarianceThreshold(threshold=0.01)
model |= preprocessing.StandardScaler()
model |= multioutput.ClassifierChain(
    model=linear_model.LogisticRegression(),
    order=list(range(14))
)

metric = metrics.multioutput.MicroAverage(metrics.Jaccard())

# Predict on each sample before learning from it
for x, y in dataset:
    y = {i: yi == 'TRUE' for i, yi in y.items()}  # convert label strings to booleans
    y_pred = model.predict_one(x)
    metric.update(y, y_pred)
    model.learn_one(x, y)

print(metric)  # MicroAverage(Jaccard): 41.81%
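The example reports a micro-averaged Jaccard score. As a rough sketch of what that number measures, the code below pools true positives, false positives, and false negatives across all labels and samples and computes TP / (TP + FP + FN). The function `micro_jaccard` is a hypothetical stand-in written for this illustration, not the library's code, and returning 0.0 for an empty denominator is a convention chosen here.

```python
def micro_jaccard(pairs):
    """Pooled Jaccard over (y_true, y_pred) dict pairs with boolean labels."""
    tp = fp = fn = 0
    for y_true, y_pred in pairs:
        for label, truth in y_true.items():
            guess = y_pred.get(label, False)
            if truth and guess:
                tp += 1
            elif guess and not truth:
                fp += 1
            elif truth and not guess:
                fn += 1
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


pairs = [
    ({0: True, 1: False}, {0: True, 1: True}),   # one TP, one FP
    ({0: True, 1: True}, {0: False, 1: True}),   # one FN, one TP
]
score = micro_jaccard(pairs)
print(score)
```

True negatives do not enter the score, so a model is not rewarded for correctly leaving rare labels unset.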