Implementation:Scikit learn contrib Imbalanced learn ADASYN
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Data_Preprocessing, Imbalanced_Learning |
| Last Updated | 2026-02-09 03:00 GMT |
Overview
Concrete tool for adaptive synthetic minority oversampling provided by the imbalanced-learn library.
Description
The ADASYN class implements the Adaptive Synthetic Sampling algorithm and extends BaseOverSampler. Instead of creating the same number of synthetic samples for every minority instance, it allocates them in proportion to the share of majority-class points among each instance's n_neighbors nearest neighbors. Minority instances near the decision boundary therefore receive more synthetic samples than those deep inside a minority region.
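The allocation step described above can be sketched in plain NumPy. This is a minimal illustration of the idea, not the library's code: the function name `adasyn_allocation`, its parameters, and the toy data are all hypothetical, and the brute-force distance computation is only suitable for small arrays.

```python
import numpy as np


def adasyn_allocation(X, y, minority_label, n_to_generate, k=5):
    """Sketch: how many synthetic samples each minority point would receive."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_min = X[y == minority_label]

    # Brute-force Euclidean distances from every minority point to every point.
    dists = np.linalg.norm(X_min[:, None, :] - X[None, :, :], axis=2)

    # k nearest neighbours; column 0 is skipped because it is the point itself
    # (assumes no exact duplicates of minority points).
    nn_idx = np.argsort(dists, axis=1)[:, 1 : k + 1]

    # r_i: fraction of the k neighbours that do not belong to the minority class.
    ratio = (y[nn_idx] != minority_label).mean(axis=1)
    if ratio.sum() == 0:
        raise ValueError("no minority point has a majority-class neighbour")

    # Normalise r_i to sum to 1, then scale by the total number requested.
    weights = ratio / ratio.sum()
    return np.rint(weights * n_to_generate).astype(int)


# Toy data: four minority points in a tight cluster, one minority point
# surrounded by majority points.
X = [[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [5, 5],
     [5, 5.1], [5.1, 5], [4.9, 5], [5, 4.9], [5.1, 5.1]]
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

counts = adasyn_allocation(X, y, minority_label=1, n_to_generate=10, k=3)
print(counts)
```

In this toy case the isolated minority point at (5, 5) has only majority neighbours, so it is allotted the whole budget, while the four clustered points receive none.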
Usage
Import this class when the minority class has regions of varying difficulty and you want the oversampler to focus on harder-to-classify areas rather than uniformly oversampling.
Code Reference
Source Location
- Repository: imbalanced-learn
- File: imblearn/over_sampling/_adasyn.py
- Lines: L23-215
Signature
class ADASYN(BaseOverSampler):
    def __init__(
        self,
        *,
        sampling_strategy="auto",
        random_state=None,
        n_neighbors=5,
    ):
        """
        Args:
            sampling_strategy: float, str, dict, or callable - Resampling target.
                A float is the desired ratio of minority to majority samples
                (binary targets only). 'auto' resamples every class except
                the majority.
            random_state: int, RandomState, or None - Seed for reproducibility.
            n_neighbors: int or NearestNeighbors - Number of nearest neighbors
                used to compute density ratio and generate samples (default: 5).
        """
Import
from imblearn.over_sampling import ADASYN
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| X | {array-like, sparse matrix, dataframe} of shape (n_samples, n_features) | Yes | Feature matrix of training data |
| y | array-like of shape (n_samples,) | Yes | Target labels |
| sampling_strategy | float, str, dict, or callable | No | Resampling target: a ratio, per-class sample counts, or a named strategy (default: 'auto') |
| n_neighbors | int or NearestNeighbors | No | Neighbors for density estimation (default: 5) |
| random_state | int, RandomState, or None | No | Random seed |
Outputs
| Name | Type | Description |
|---|---|---|
| X_resampled | {ndarray, sparse matrix, dataframe} of shape (n_samples_new, n_features) | Feature matrix with adaptive synthetic samples added |
| y_resampled | ndarray of shape (n_samples_new,) | Original labels followed by the labels of the synthetic samples |
Usage Examples
Basic Usage
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import ADASYN

# Create imbalanced dataset
X, y = make_classification(
    n_classes=2, class_sep=2, weights=[0.1, 0.9],
    n_informative=3, n_redundant=1, flip_y=0,
    n_features=20, n_clusters_per_class=1,
    n_samples=1000, random_state=10,
)
print(f"Original: {Counter(y)}")

# Apply ADASYN
adasyn = ADASYN(random_state=42)
X_res, y_res = adasyn.fit_resample(X, y)
print(f"Resampled: {Counter(y_res)}")
In a Pipeline
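from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from imblearn.over_sampling import ADASYN
from imblearn.pipeline import make_pipeline

# Split first: the pipeline resamples only the data passed to fit(), never at predict time
X, y = make_classification(n_samples=1000, weights=[0.1, 0.9], flip_y=0, random_state=10)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

pipeline = make_pipeline(ADASYN(random_state=42), DecisionTreeClassifier(random_state=0))
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)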
Related Pages
Implements Principle
Requires Environment
Uses Heuristic