
Implementation:Scikit learn contrib Imbalanced learn make imbalance

From Leeroopedia


Knowledge Sources
Domains Machine_Learning, Data_Preprocessing, Testing
Last Updated 2026-02-09 03:00 GMT

Overview

A concrete tool, provided by the imbalanced-learn library, for creating imbalanced datasets from existing (typically balanced) data.

Description

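The make_imbalance function takes an existing dataset and reduces per-class sample counts according to sampling_strategy. This is either a dict mapping each class label to the number of samples to keep, or a callable that receives y and returns such a dict. Internally it uses RandomUnderSampler, sampling without replacement, to perform the reduction, so a class can only be shrunk, never enlarged.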

Usage

Import this function to create controlled imbalanced datasets for testing, benchmarking, or demonstration purposes.

Code Reference

Source Location

  • Repository: imbalanced-learn
  • File: imblearn/datasets/_imbalance.py
  • Lines: L27-118

Signature

def make_imbalance(
    X, y, *, sampling_strategy=None, random_state=None, verbose=False, **kwargs
):
    """
    Args:
        X: {array-like, dataframe} of shape (n_samples, n_features) - Data.
        y: array-like of shape (n_samples,) - Labels.
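        sampling_strategy: dict or callable - Target number of samples per
            class: a dict mapping class label to count, or a callable that
            takes y and returns such a dict.
        random_state: int, RandomState, or None - Seed.
        verbose: bool - Print distribution info (default: False).
        **kwargs: Extra keyword arguments passed to sampling_strategy when
            it is callable.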
    Returns:
        X_resampled, y_resampled - Imbalanced dataset.
    """

Import

from imblearn.datasets import make_imbalance

I/O Contract

Inputs

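Name | Type | Required | Description
X | {array-like, dataframe} of shape (n_samples, n_features) | Yes | Data matrix
y | array-like of shape (n_samples,) | Yes | Target labels
sampling_strategy | dict or callable | Yes | Target counts per class (keyword-only)
random_state | int, RandomState, or None | No | Random seed
verbose | bool | No | Print distribution info (default: False)
**kwargs | keyword arguments | No | Passed to sampling_strategy when it is callable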

Outputs

Name | Type | Description
X_resampled | {ndarray, dataframe} of shape (n_samples_new, n_features) | Imbalanced data
y_resampled | ndarray of shape (n_samples_new,) | Imbalanced labels

Usage Examples

from collections import Counter
from sklearn.datasets import load_iris
from imblearn.datasets import make_imbalance

data = load_iris()
X, y = data.data, data.target
print(f"Before: {Counter(y)}")

X_res, y_res = make_imbalance(
    X, y,
    sampling_strategy={0: 10, 1: 20, 2: 30},
    random_state=42,
)
print(f"After: {Counter(y_res)}")

