Implementation:Scikit learn contrib Imbalanced learn BalancedRandomForestClassifier

Knowledge Sources	imbalanced-learn imbalanced-learn Docs
Domains	Machine_Learning, Ensemble_Learning, Imbalanced_Learning
Last Updated	2026-02-09 03:00 GMT

Overview

Concrete tool, provided by the imbalanced-learn library, for training balanced random forests on imbalanced data.

Description

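BalancedRandomForestClassifier extends sklearn.ensemble.RandomForestClassifier. Before each tree is grown, a RandomUnderSampler draws a class-balanced subset of the training data, so every tree is fit on its own balanced sample. The class accepts all standard random forest parameters plus two of its own: sampling_strategy, which controls which classes are under-sampled and to what size, and replacement, which controls whether the under-sampling draws with replacement.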
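The per-tree balancing idea can be illustrated with NumPy alone. The sketch below is not the library's implementation: balanced_sample_indices is a hypothetical helper that mimics what a random under-sampler with sampling_strategy="all" and replacement=True would hand to a single tree, assuming every class is resampled to the size of the smallest class.

```python
import numpy as np

def balanced_sample_indices(y, rng, replacement=True):
    """Draw the same number of row indices from every class (hypothetical helper)."""
    classes, counts = np.unique(y, return_counts=True)
    n_per_class = counts.min()  # size of the smallest class
    picked = [
        rng.choice(np.flatnonzero(y == c), size=n_per_class, replace=replacement)
        for c in classes
    ]
    return np.concatenate(picked)

rng = np.random.default_rng(0)
y = np.array([0] * 900 + [1] * 100)    # 9:1 class imbalance
idx = balanced_sample_indices(y, rng)  # rows one tree would be trained on
sample_counts = np.bincount(y[idx])    # class counts inside that subset
print(sample_counts)                   # [100 100]
```

In a forest, this draw would be repeated with a fresh random state for every tree, so different trees see different majority-class rows.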

Usage

Import this class as a drop-in replacement for RandomForestClassifier when dealing with imbalanced datasets.

Code Reference

Source Location

Repository: imbalanced-learn
File: imblearn/ensemble/_forest.py
Lines: L92-823

Signature

class BalancedRandomForestClassifier(RandomForestClassifier):
    def __init__(
        self,
        n_estimators=100,
        *,
        criterion="gini",
        max_depth=None,
        min_samples_split=2,
        min_samples_leaf=1,
        min_weight_fraction_leaf=0.0,
        max_features="sqrt",
        max_leaf_nodes=None,
        min_impurity_decrease=0.0,
        bootstrap=False,
        oob_score=False,
        sampling_strategy="all",
        replacement=True,
        n_jobs=None,
        random_state=None,
        verbose=0,
        warm_start=False,
        class_weight=None,
        ccp_alpha=0.0,
        max_samples=None,
        monotonic_cst=None,
    ):

Import

from imblearn.ensemble import BalancedRandomForestClassifier

I/O Contract

Inputs

Name	Type	Required	Description
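X	{array-like, sparse matrix} of shape (n_samples, n_features)	Yes	Training features, passed to fit()
y	array-like of shape (n_samples,)	Yes	Target labels, passed to fit()
n_estimators	int	No	Constructor parameter: number of trees in the forest (default: 100)
sampling_strategy	float, str, dict or callable	No	Constructor parameter: how classes are under-sampled for each tree (default: 'all')
sample_weight	array-like of shape (n_samples,)	No	Per-sample weights, passed to fit() (default: None)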

Outputs

Name	Type	Description
fit() returns	self	Fitted classifier with estimators_, samplers_, pipelines_ attributes
predict() returns	ndarray of shape (n_samples,)	Predicted class labels
predict_proba() returns	ndarray of shape (n_samples, n_classes)	Class probabilities

Usage Examples

from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from imblearn.ensemble import BalancedRandomForestClassifier

# Binary problem in which class 0 is the minority (about 10% of samples)
X, y = make_classification(n_classes=2, weights=[0.1, 0.9], n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

brf = BalancedRandomForestClassifier(n_estimators=100, random_state=0)
brf.fit(X_train, y_train)

# score() returns plain accuracy, so compute balanced accuracy explicitly
y_pred = brf.predict(X_test)
print(f"Balanced accuracy: {balanced_accuracy_score(y_test, y_pred):.3f}")
