Principle:Scikit learn contrib Imbalanced learn Borderline Oversampling

Knowledge Sources	Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning
Domains	Machine_Learning, Data_Preprocessing, Imbalanced_Learning
Last Updated	2026-02-09 03:00 GMT

Overview

A focused oversampling technique that generates synthetic samples only from minority instances near the decision boundary (borderline samples), where classification is most uncertain.

Description

Borderline-SMOTE refines standard SMOTE by first identifying minority samples that lie on the borderline between classes. A minority sample is classified as borderline (or "in danger") if at least half, but not all, of its m nearest neighbors belong to the majority class. Only these borderline samples are used as seeds for synthetic generation.

Two variants exist:

Borderline-1: Generates synthetic samples only between the borderline minority sample and its minority nearest neighbors.
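Borderline-2: Additionally allows interpolation toward majority-class nearest neighbors, which extends the minority region toward the majority class. In the original paper the step toward a majority neighbor is limited to at most half the distance, so the synthetic sample stays closer to the minority sample.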

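This targeted approach can be more effective than uniform SMOTE because it concentrates synthetic samples near the decision boundary, where misclassification is most likely, rather than in interior regions that the classifier already handles well.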
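The difference between the two variants can be illustrated with a minimal sketch of a single interpolation step. The helper `interpolate` and the sample points are hypothetical names introduced here for illustration, and the 0.5 cap for majority neighbors follows the original paper's description rather than any particular library's internals.

```python
import random


def interpolate(x, neighbor, max_gap, rng):
    """Return a point on the segment from x toward neighbor.

    The step fraction is drawn uniformly from [0, max_gap).
    """
    gap = rng.random() * max_gap
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


rng = random.Random(0)
borderline = [1.0, 1.0]    # a borderline ("danger") minority sample
minority_nn = [2.0, 1.0]   # one of its minority-class neighbors
majority_nn = [1.0, 3.0]   # one of its majority-class neighbors

# Borderline-1: interpolate toward a minority neighbor over the full segment.
s1 = interpolate(borderline, minority_nn, 1.0, rng)

# Borderline-2 (additional case): interpolate toward a majority neighbor,
# but only up to half the distance, so the point stays nearer the minority sample.
s2 = interpolate(borderline, majority_nn, 0.5, rng)

print(s1, s2)
```

In this sketch `s1` always lies between the two minority points, while `s2` never moves more than halfway toward the majority point.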

Usage

Use this principle when:

The minority class has a clear borderline region with the majority class
Standard SMOTE generates too many samples in safe, interior regions
The goal is to strengthen the decision boundary specifically
A choice between variants is needed: Borderline-1 is preferred for conservative expansion, Borderline-2 for more aggressive expansion toward the majority class

Theoretical Basis

The algorithm operates in two phases:

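Phase 1 - Borderline Detection: For each minority sample x_i, find its m nearest neighbors across all classes and count how many belong to the majority class. If that count is at least m/2 but less than m, mark x_i as a borderline ("danger") sample. A sample whose m neighbors all belong to the majority class is treated as noise and skipped; a sample with fewer than m/2 majority neighbors is considered safe and is also skipped.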
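The detection rule can be sketched in plain Python as follows. The function name `classify_minority`, the brute-force neighbor search, and the toy points are assumptions made for illustration; a real implementation would normally use an indexed nearest-neighbor search.

```python
import math


def classify_minority(X, y, minority_label, m):
    """Label each minority sample as 'noise', 'danger', or 'safe'."""
    labels = {}
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != minority_label:
            continue
        # m nearest neighbors among all other samples, regardless of class.
        others = [j for j in range(len(X)) if j != i]
        others.sort(key=lambda j: math.dist(xi, X[j]))
        majority_count = sum(y[j] != minority_label for j in others[:m])
        if majority_count == m:
            labels[i] = "noise"    # surrounded entirely by the majority class
        elif majority_count >= m / 2:
            labels[i] = "danger"   # borderline: at least half, but not all
        else:
            labels[i] = "safe"
    return labels


# Toy data: indices 0-5 are minority (label 1), indices 6-11 are majority (label 0).
X = [
    (0, 0), (0, 1), (1, 0), (1, 1),  # interior minority cluster
    (2.8, 0),                        # minority point near the majority cluster
    (6, 0),                          # minority point inside the majority cluster
    (4, 0), (4, 1), (5, 0), (5, 1), (6, 1), (7, 0),
]
y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

result = classify_minority(X, y, minority_label=1, m=3)
print(result)
```

With m = 3, the four clustered minority points come out as safe, the point at (2.8, 0) has two majority neighbors out of three and is marked as danger, and the point at (6, 0) has only majority neighbors and is marked as noise.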

Phase 2 - Synthetic Generation: Apply SMOTE interpolation only to the set of borderline minority samples.
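For reference, the SMOTE interpolation step can be written compactly. The symbols \hat{x} (the chosen neighbor) and \lambda (the random step fraction) are notation introduced here, and the reduced range for majority neighbors in Borderline-2 follows the original paper's description.

```latex
x_{\text{new}} = x_i + \lambda \, (\hat{x} - x_i),
\qquad
\lambda \sim
\begin{cases}
U(0, 1)   & \text{if } \hat{x} \text{ is a minority neighbor of } x_i, \\
U(0, 0.5) & \text{if } \hat{x} \text{ is a majority neighbor of } x_i \text{ (Borderline-2 only).}
\end{cases}
```

Each synthetic point therefore lies on the line segment between a borderline sample and one of its neighbors.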

# Abstract Borderline-SMOTE algorithm (NOT real implementation)
DANGER = set()
for x_i in minority_samples:
    # m nearest neighbors drawn from all classes
    m_neighbors = m_nearest_neighbors(x_i, m, all_classes=True)
    majority_count = count_majority(m_neighbors)
    # majority_count == m   -> noise, skipped
    # majority_count <  m/2 -> safe, skipped
    if m/2 <= majority_count < m:
        DANGER.add(x_i)

# Only oversample borderline samples, using their k nearest neighbors
for x_i in DANGER:
    apply_smote_interpolation(x_i, k_neighbors)

Related Pages

Implemented By

Implementation:Scikit_learn_contrib_Imbalanced_learn_BorderlineSMOTE
