Implementation:Scikit learn contrib Imbalanced learn ADASYN

From Leeroopedia


Knowledge Sources
Domains Machine_Learning, Data_Preprocessing, Imbalanced_Learning
Last Updated 2026-02-09 03:00 GMT

Overview

Concrete tool for adaptive synthetic minority oversampling provided by the imbalanced-learn library.

Description

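The ADASYN class implements the Adaptive Synthetic Sampling algorithm. It extends BaseOverSampler and generates a different number of synthetic samples for each minority instance, in proportion to the share of other-class samples among that instance's n_neighbors nearest neighbors. Minority instances surrounded by majority-class samples, which typically lie near the decision boundary, therefore receive more synthetic samples than those deep inside a minority cluster.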
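
The allocation idea described above can be illustrated with a small standard-library sketch. It is not the library's code: the function name adasyn_allocation, the brute-force neighbor search, and the toy 1-D data are assumptions made for illustration, and the interpolation step that actually creates the synthetic points is left out.

```python
import math


def adasyn_allocation(X, y, minority_label, n_neighbors, n_to_generate):
    """Hypothetical helper: synthetic-sample budget per minority point."""
    minority_idx = [i for i, label in enumerate(y) if label == minority_label]
    ratios = []
    for i in minority_idx:
        # Brute-force k nearest neighbors of point i, excluding the point itself.
        neighbors = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )[:n_neighbors]
        # Share of neighbors that belong to a different class.
        n_other = sum(1 for j in neighbors if y[j] != minority_label)
        ratios.append(n_other / n_neighbors)
    total = sum(ratios)
    if total == 0:
        raise ValueError("no minority point has an other-class neighbor")
    # Normalize the ratios to sum to 1, then scale to the total budget.
    return {
        i: round(r / total * n_to_generate)
        for i, r in zip(minority_idx, ratios)
    }


# Toy 1-D data: minority class 1 sits at x = 0, 1 and 4.
X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (6.0,), (7.0,)]
y = [1, 1, 0, 0, 1, 0, 0, 0]

counts = adasyn_allocation(X, y, minority_label=1, n_neighbors=2, n_to_generate=8)
print(counts)  # {0: 2, 1: 2, 4: 4}
```

The isolated minority point at x = 4 has only majority-class neighbors, so it receives half of the budget, while the two clustered points at x = 0 and x = 1 share the rest.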

Usage

Import this class when the minority class has regions of varying difficulty and you want the oversampler to focus on harder-to-classify areas rather than uniformly oversampling.

Code Reference

Source Location

  • Repository: imbalanced-learn
  • File: imblearn/over_sampling/_adasyn.py
  • Lines: L23-215

Signature

class ADASYN(BaseOverSampler):
    def __init__(
        self,
        *,
        sampling_strategy="auto",
        random_state=None,
        n_neighbors=5,
    ):
        """
        Args:
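            sampling_strategy: float, str, dict, or callable - How to resample.
                A float (binary problems only) is the target minority/majority
                ratio; a dict maps class labels to target sample counts; 'auto'
                resamples every class except the majority toward its size.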
            random_state: int, RandomState, or None - Seed for reproducibility.
            n_neighbors: int or NearestNeighbors - Number of nearest neighbors
                used to compute density ratio and generate samples (default: 5).
        """

Import

from imblearn.over_sampling import ADASYN

I/O Contract

Inputs

Name | Type | Required | Description
X | {array-like, sparse matrix, dataframe} of shape (n_samples, n_features) | Yes | Feature matrix of training data
y | array-like of shape (n_samples,) | Yes | Target labels
sampling_strategy | float, str, dict, or callable | No | Resampling strategy (default: 'auto')
n_neighbors | int or NearestNeighbors | No | Neighbors for density estimation (default: 5)
random_state | int, RandomState, or None | No | Random seed (default: None)

Outputs

Name | Type | Description
X_resampled | {ndarray, sparse matrix, dataframe} of shape (n_samples_new, n_features) | Feature matrix with adaptive synthetic samples added
y_resampled | ndarray of shape (n_samples_new,) | Target array, including the labels of the synthetic samples

Usage Examples

Basic Usage

from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import ADASYN

# Create imbalanced dataset
X, y = make_classification(
    n_classes=2, class_sep=2, weights=[0.1, 0.9],
    n_informative=3, n_redundant=1, flip_y=0,
    n_features=20, n_clusters_per_class=1,
    n_samples=1000, random_state=10,
)
print(f"Original: {Counter(y)}")

# Apply ADASYN
adasyn = ADASYN(random_state=42)
X_res, y_res = adasyn.fit_resample(X, y)
print(f"Resampled: {Counter(y_res)}")

In a Pipeline

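from imblearn.pipeline import make_pipeline
from imblearn.over_sampling import ADASYN
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# X, y: the imbalanced dataset created in the Basic Usage example
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42
)

# The sampler runs only during fit; predict() passes X_test through unchanged
pipeline = make_pipeline(ADASYN(random_state=42), DecisionTreeClassifier())
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)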

Related Pages

Implements Principle

Requires Environment

Uses Heuristic