Implementation: scikit-learn IsolationForest
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Anomaly Detection, Ensemble Methods |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete implementation of the Isolation Forest anomaly detection algorithm provided by scikit-learn.
Description
The IsolationForest class implements the Isolation Forest algorithm for anomaly detection. It isolates observations by randomly selecting features and split values, creating random trees. Anomalies have shorter average path lengths because they are easier to isolate. The algorithm builds on BaseBagging with ExtraTreeRegressor as the base estimator and supports parallel tree depth computation.
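The shorter-path-length intuition can be observed directly through the model's anomaly scores. A minimal sketch (the data here is illustrative): a point far from a tight cluster is isolated in fewer random splits, so `score_samples` assigns it a lower value.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
X_inliers = rng.randn(200, 2)           # tight Gaussian cluster
X_outlier = np.array([[8.0, 8.0]])      # far from the cluster

clf = IsolationForest(random_state=0).fit(np.vstack([X_inliers, X_outlier]))

# score_samples returns the negated anomaly score: lower = more anomalous.
inlier_scores = clf.score_samples(X_inliers)
outlier_score = clf.score_samples(X_outlier)
print(outlier_score[0] < inlier_scores.min())  # True
```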
Usage
Use Isolation Forest for unsupervised anomaly detection when you need to identify outliers in datasets. It works well with high-dimensional data and does not require labeled anomaly data for training.
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/ensemble/_iforest.py
Signature
class IsolationForest(OutlierMixin, BaseBagging):
def __init__(
self,
*,
n_estimators=100,
max_samples="auto",
contamination="auto",
max_features=1.0,
bootstrap=False,
n_jobs=None,
random_state=None,
verbose=0,
warm_start=False,
):
...
def fit(self, X, y=None, sample_weight=None):
...
def predict(self, X):
...
def decision_function(self, X):
...
def score_samples(self, X):
...
Import
from sklearn.ensemble import IsolationForest
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| X | array-like of shape (n_samples, n_features) | Yes | Training input samples |
| n_estimators | int | No | Number of isolation trees (default: 100) |
| max_samples | int, float, or "auto" | No | Number of samples drawn to train each tree (default: "auto", i.e. min(256, n_samples)) |
| contamination | float or "auto" | No | Expected proportion of anomalies; used to set the decision threshold (default: "auto") |
| sample_weight | array-like of shape (n_samples,) | No | Per-sample weights |
Outputs
| Name | Type | Description |
|---|---|---|
| predictions | ndarray of shape (n_samples,) | 1 for inliers, -1 for outliers |
| scores | ndarray of shape (n_samples,) | Anomaly scores (lower is more anomalous) |
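The contract above can be checked directly. A small sketch: `predict` returns labels drawn from {-1, 1}, and per the scikit-learn API, `decision_function` equals `score_samples` shifted by the fitted `offset_`, with negative values marking predicted outliers.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

X = np.random.RandomState(1).randn(50, 3)
clf = IsolationForest(random_state=1).fit(X)

pred = clf.predict(X)
print(np.unique(pred))          # labels are drawn from {-1, 1}

scores = clf.score_samples(X)
dec = clf.decision_function(X)
# decision_function is score_samples shifted by the fitted offset_;
# negative values mark predicted outliers.
print(np.allclose(dec, scores - clf.offset_))  # True
```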
Usage Examples
Basic Usage
import numpy as np
from sklearn.ensemble import IsolationForest
# Generate data with outliers
rng = np.random.RandomState(42)
X_normal = rng.randn(100, 2)
X_outliers = rng.uniform(low=-6, high=6, size=(10, 2))
X = np.vstack([X_normal, X_outliers])
clf = IsolationForest(random_state=42, contamination=0.1)
clf.fit(X)
predictions = clf.predict(X)
print(f"Detected outliers: {(predictions == -1).sum()}")
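Scoring unseen data
A fitted model can also score points it was not trained on; a short sketch (the query points are illustrative): a point near the training cluster is typically labeled 1, a distant point -1.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
X_train = rng.randn(200, 2)             # inlier-only training data
clf = IsolationForest(random_state=0).fit(X_train)

# Score previously unseen points: one near the training cluster,
# one far outside it.
X_new = np.array([[0.1, -0.2], [7.0, 7.0]])
print(clf.predict(X_new))               # typically [ 1 -1]
print(clf.decision_function(X_new))     # positive ~ inlier, negative ~ outlier
```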