Implementation:Scikit learn contrib Imbalanced learn SMOTEENN
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Data_Preprocessing, Imbalanced_Learning |
| Last Updated | 2026-02-09 03:00 GMT |
Overview
Concrete tool for combined SMOTE oversampling and Edited Nearest Neighbours cleaning provided by the imbalanced-learn library.
Description
The SMOTEENN class combines SMOTE over-sampling with Edited Nearest Neighbours (ENN) under-sampling in a two-step process. It first applies SMOTE to balance the class distribution, then applies ENN, which removes samples whose class label disagrees with the labels of their nearest neighbours, cleaning noisy or ambiguous regions. Users can supply their own SMOTE and EditedNearestNeighbours instances through the smote and enn parameters.
Usage
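Import this class when SMOTE alone introduces noise near the decision boundary and a cleaning step is desired. SMOTEENN typically removes more samples than SMOTETomek: ENN discards samples that disagree with their neighbourhood, whereas Tomek link removal only targets pairs of mutually nearest samples from opposite classes.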
Code Reference
Source Location
- Repository: imbalanced-learn
- File: imblearn/combine/_smote_enn.py
- Lines: L25-160
Signature
class SMOTEENN(BaseSampler):
    def __init__(
        self,
        *,
        sampling_strategy="auto",
        random_state=None,
        smote=None,
        enn=None,
        n_jobs=None,
    ):
        """
        Args:
            sampling_strategy: str, dict, or callable - Resampling ratio.
            random_state: int, RandomState, or None - Seed.
            smote: SMOTE or None - SMOTE oversampler (default: SMOTE()).
            enn: EditedNearestNeighbours or None - ENN cleaner
                (default: EditedNearestNeighbours(sampling_strategy="all")).
            n_jobs: int or None - Parallel jobs.
        """
Import
from imblearn.combine import SMOTEENN
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| X | {array-like, sparse matrix} of shape (n_samples, n_features) | Yes | Feature matrix |
| y | array-like of shape (n_samples,) | Yes | Target labels |
| smote | SMOTE or None | No | Custom SMOTE instance (default: SMOTE()) |
| enn | EditedNearestNeighbours or None | No | Custom ENN instance (default: EditedNearestNeighbours(sampling_strategy="all")) |
Outputs
| Name | Type | Description |
|---|---|---|
| X_resampled | {ndarray, sparse matrix} of shape (n_samples_new, n_features) | Oversampled and cleaned feature matrix |
| y_resampled | ndarray of shape (n_samples_new,) | Target labels corresponding to X_resampled |
Usage Examples
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.combine import SMOTEENN

# Build an imbalanced two-class dataset (about 10% / 90%).
X, y = make_classification(
    n_classes=2, weights=[0.1, 0.9], n_samples=1000, random_state=10
)

# Oversample with SMOTE, then clean with ENN.
smote_enn = SMOTEENN(random_state=42)
X_res, y_res = smote_enn.fit_resample(X, y)
print(f"Resampled: {Counter(y_res)}")