Implementation:Online ml River Forest AMFClassifier
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Random_Forests, Ensemble_Methods |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Aggregated Mondrian Forest (AMF) is a truly online ensemble classifier that builds random decision trees using Mondrian processes for adaptive learning from data streams.
Description
AMF builds an ensemble of Mondrian trees. Each node in a tree predicts with the class distribution of the samples it contains, regularized by a Jeffreys prior whose strength is set by the dirichlet parameter. When use_aggregation is True, the prediction for a sample is not just the leaf prediction: it aggregates the predictions of all subtrees along the path from the root to the leaf containing the sample. The aggregation weights are exponential weights with learning rate step and log-loss, and a context tree weighting algorithm computes this aggregation exactly. The final prediction averages the class probabilities of all trees in the forest. The algorithm is truly online: it makes a single pass through the data and can produce predictions at any time.
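The node-level regularization and the forest-level averaging described above can be sketched in plain Python. This is a minimal illustration, not River's internal code: the helper names `node_proba` and `forest_proba` are hypothetical, and the node formula `(count + dirichlet) / (n_samples + dirichlet * n_classes)` follows the one given in River's documentation for this classifier. The path aggregation with context tree weighting is deliberately left out.

```python
from collections import Counter


def node_proba(counts, classes, dirichlet=0.5):
    """Regularized class distribution of a single node (hypothetical helper)."""
    n_samples = sum(counts.values())
    n_classes = len(classes)
    denom = n_samples + dirichlet * n_classes
    # Every class gets `dirichlet` pseudo-counts, so unseen classes keep non-zero mass.
    return {c: (counts.get(c, 0) + dirichlet) / denom for c in classes}


def forest_proba(per_tree_probas):
    """Average the class probabilities predicted by each tree (hypothetical helper)."""
    n_trees = len(per_tree_probas)
    classes = set().union(*per_tree_probas)
    return {c: sum(p.get(c, 0.0) for p in per_tree_probas) / n_trees for c in classes}


# A node that has seen 3 samples of class "a" and 1 of class "b".
counts = Counter({"a": 3, "b": 1})
p = node_proba(counts, classes=["a", "b"], dirichlet=0.5)
# a: (3 + 0.5) / (4 + 0.5 * 2) = 0.7, b: (1 + 0.5) / 5 = 0.3

# Two trees that disagree are averaged class by class.
avg = forest_proba([{"a": 0.7, "b": 0.3}, {"a": 0.5, "b": 0.5}])
```

With few samples in a node, the dirichlet term dominates and the prediction stays close to uniform; as counts grow, it approaches the empirical class frequencies.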
Usage
Use AMF for classification on data streams where concept drift may occur and you need truly incremental learning without batch processing. It is particularly effective for multi-class problems and handles both numerical and categorical features. As a rule of thumb, set dirichlet to about 1/n_classes, where n_classes is the number of classes expected in the stream; the default of 0.5 follows this rule for binary classification. Keep use_aggregation=True (the default) for best performance, since it enables the context tree weighting aggregation along each tree path.
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/forest/aggregated_mondrian_forest.py
Signature
class AMFClassifier(
    n_estimators: int = 10,
    step: float = 1.0,
    use_aggregation: bool = True,
    dirichlet: float = 0.5,
    split_pure: bool = False,
    seed: int | None = None,
)
Import
from river import forest
I/O Contract
| Parameter | Type | Description |
|---|---|---|
| x | dict | Feature dictionary with feature names as keys |
| y | Any | Class label (any hashable type) |

| Method | Return Type | Description |
|---|---|---|
| predict_one(x) | Any | Predicted class label |
| predict_proba_one(x) | dict | Probability distribution over all seen classes |
| learn_one(x, y) | None | Updates all trees in the forest |
Usage Examples
from river import datasets
from river import evaluate
from river import forest
from river import metrics

dataset = datasets.Bananas().take(500)

model = forest.AMFClassifier(
    n_estimators=10,
    use_aggregation=True,
    dirichlet=0.5,
    seed=1,
)

metric = metrics.Accuracy()

result = evaluate.progressive_val_score(dataset, model, metric)
print(result)  # Accuracy: 85.37%