Principle:Scikit learn contrib Imbalanced learn Adaptive Synthetic Sampling

Knowledge Sources	ADASYN: Adaptive Synthetic Sampling Approach for Imbalanced Learning
Domains	Machine_Learning, Data_Preprocessing, Imbalanced_Learning
Last Updated	2026-02-09 03:00 GMT

Overview

An adaptive oversampling technique that generates more synthetic samples for minority instances that are harder to learn, where difficulty is measured by the proportion of majority-class samples among each instance's nearest neighbors.

Description

Adaptive Synthetic Sampling (ADASYN) extends the SMOTE approach by adaptively adjusting the number of synthetic samples generated for each minority instance based on its local difficulty. Minority samples surrounded by more majority class neighbors (i.e., harder to learn) receive more synthetic samples, while those in safer regions receive fewer.

This adaptive weighting shifts the classification boundary toward the difficult examples, focusing the learning effort where it matters most. ADASYN was proposed by He et al. (2008) and addresses a key limitation of standard SMOTE, which treats all minority samples equally regardless of their learning difficulty.

Usage

Use this principle when:

Standard SMOTE produces too many synthetic samples in easy-to-classify regions
The focus should be on minority samples near the decision boundary
The dataset has varying difficulty across minority class regions
A density-adaptive approach is preferred over uniform oversampling

Theoretical Basis

ADASYN computes a density ratio for each minority sample:

For each minority sample $x_{i}$, compute $r_{i} = \frac{\Delta_{i}}{k}$, where $\Delta_{i}$ is the number of majority-class samples among its $k$ nearest neighbors
Normalize: $\hat{r}_{i} = r_{i} / \sum_{j} r_{j}$
Generate $g_{i} = \hat{r}_{i} \times G$ synthetic samples for $x_{i}$, where $G$ is the total number of synthetic samples needed
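
As a quick arithmetic check of the three steps above, the following sketch allocates synthetic samples for four minority points. The values of k, G, and the neighbor counts in `delta` are made-up toy numbers chosen for illustration; they do not come from any dataset or from the library.

```python
# Toy inputs (assumed for illustration only).
k = 5                  # number of nearest neighbors inspected per minority sample
G = 10                 # total number of synthetic samples to generate
delta = [4, 1, 0, 5]   # majority-class neighbors found for each minority sample

# Step 1: difficulty ratio r_i = Delta_i / k
r = [d / k for d in delta]

# Step 2: normalize so the ratios sum to 1
total = sum(r)
r_hat = [r_i / total for r_i in r]

# Step 3: per-sample allocation g_i = r_hat_i * G, rounded to an integer
g = [round(w * G) for w in r_hat]

print(g)  # [4, 1, 0, 5]
```

The third point has no majority neighbors and therefore receives no synthetic samples, while the fourth, surrounded entirely by majority samples, receives half of the budget.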

Pseudo-code:

# Abstract ADASYN algorithm (NOT real implementation)
G = total_synthetic_samples_needed
r_values = []
for each minority_sample x_i:
    majority_neighbors = count_majority_in_k_neighbors(x_i, k)
    r_values.append(majority_neighbors / k)
r_normalized = normalize(r_values)  # divide each r_i by sum(r_values)
for each minority_sample x_i:
    g_i = round(r_normalized[i] * G)
    generate g_i synthetic samples near x_i using SMOTE interpolation
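
The pseudo-code can be turned into a small runnable sketch using only the Python standard library. This is an illustration of the principle, not the imbalanced-learn implementation: the names `k_nearest` and `adasyn_sketch` are hypothetical, the neighbor search is brute force, the default G assumes a two-class problem balanced to equal counts, and at least two minority samples are assumed.

```python
import math
import random


def k_nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def adasyn_sketch(X, y, minority_label, k=5, G=None, seed=0):
    """Return (synthetic_points, g) following the abstract steps above."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority_label]
    if G is None:
        # Assumption: two classes, oversample until both have equal counts.
        G = len(X) - 2 * len(min_idx)

    # r_i = fraction of majority samples among the k nearest neighbors,
    # searched in the full dataset.
    r = []
    for i in min_idx:
        neighbors = k_nearest(i, X, k)
        r.append(sum(y[j] != minority_label for j in neighbors) / k)

    total = sum(r)
    if total == 0:
        # No minority sample has a majority neighbor: weights are undefined.
        raise ValueError("all r_i are zero; cannot normalize")

    # g_i = round(r_hat_i * G); rounding can make sum(g) differ slightly from G.
    g = [round(r_i / total * G) for r_i in r]

    # SMOTE-style interpolation toward a randomly chosen minority neighbor.
    minority = [X[i] for i in min_idx]
    synthetic = []
    for pos, g_i in enumerate(g):
        if g_i == 0:
            continue
        nn = k_nearest(pos, minority, min(k, len(minority) - 1))
        for _ in range(g_i):
            target = minority[rng.choice(nn)]
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(
                [a + lam * (b - a) for a, b in zip(minority[pos], target)]
            )
    return synthetic, g


# Toy data: 8 majority points (label 0) and 3 minority points (label 1).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1], [0, 2], [1, 2],
     [1.5, 1.5], [3, 3], [3.2, 3.1]]
y = [0] * 8 + [1] * 3

synthetic, g = adasyn_sketch(X, y, minority_label=1, k=3)
print(g)
print(len(synthetic))
```

In this toy set, the minority point at (1.5, 1.5) sits inside the majority cluster, so it receives the largest share of the synthetic samples, while the two minority points far from the majority class receive fewer.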

Related Pages

Implemented By

Implementation:Scikit_learn_contrib_Imbalanced_learn_ADASYN
