Implementation:Scikit learn Scikit learn SequentialFeatureSelector
| Knowledge Sources | |
|---|---|
| Domains | Feature Selection, Model Evaluation |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool for performing sequential forward or backward feature selection provided by scikit-learn.
Description
SequentialFeatureSelector adds (forward selection) or removes (backward selection) features to form a feature subset in a greedy fashion. At each stage, it chooses the best feature to add or remove based on the cross-validation score of an estimator. Forward selection starts with no features and adds the best one at each step, while backward selection starts with all features and removes the worst one. It also supports unsupervised learning where only features (X) are considered.
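The greedy forward procedure described above can be sketched in a few lines. This is an illustrative approximation, not scikit-learn's actual implementation: `greedy_forward_selection` is a hypothetical helper, and the iris dataset and k-nearest-neighbors estimator are chosen only for demonstration.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def greedy_forward_selection(estimator, X, y, n_select, cv=5):
    """Hypothetical helper: add, one at a time, the feature that
    yields the highest mean cross-validation score."""
    selected = np.zeros(X.shape[1], dtype=bool)
    for _ in range(n_select):
        best_score, best_idx = -np.inf, None
        # Try each feature that has not been selected yet
        for idx in np.flatnonzero(~selected):
            candidate = selected.copy()
            candidate[idx] = True
            score = cross_val_score(
                clone(estimator), X[:, candidate], y, cv=cv
            ).mean()
            if score > best_score:
                best_score, best_idx = score, idx
        # Commit the best candidate; earlier choices are never revisited
        selected[best_idx] = True
    return selected


X, y = load_iris(return_X_y=True)
mask = greedy_forward_selection(KNeighborsClassifier(n_neighbors=3), X, y, n_select=2)
print(mask)
```

Because each step commits to the locally best feature, the result is not guaranteed to be the globally best subset of that size. Backward selection follows the same loop, starting from a full mask and removing one feature per step.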
Usage
Use SequentialFeatureSelector when you want a simple, greedy approach to feature selection that evaluates feature subsets using cross-validation. Forward selection is computationally cheaper when you want few features; backward selection is cheaper when you want to remove only a few features. Set n_features_to_select="auto" together with the tol parameter to stop automatically when score improvements become negligible.
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/feature_selection/_sequential.py
Signature
class SequentialFeatureSelector(SelectorMixin, MetaEstimatorMixin, BaseEstimator):
    def __init__(
        self,
        estimator,
        *,
        n_features_to_select="auto",
        tol=None,
        direction="forward",
        scoring=None,
        cv=5,
        n_jobs=None,
    ):
Import
from sklearn.feature_selection import SequentialFeatureSelector
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| estimator | estimator instance | Yes | An unfitted estimator used for evaluating feature subsets. |
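| n_features_to_select | "auto", int, or float | No | Number of features to select. An int is an absolute count; a float between 0 and 1 is a fraction of the total number of features. With "auto", half of the features are selected when tol is None; otherwise features are added or removed until the score change falls below tol. Default is "auto". |
| tol | float | No | Minimum score improvement required between two consecutive additions or removals; selection stops when it is not reached. Must be positive for forward selection and may be negative for backward selection. Only used when n_features_to_select is "auto". Default is None. |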
| direction | str | No | Direction of selection: "forward" or "backward". Default is "forward". |
| scoring | str or callable | No | Scoring metric for cross-validation. Default is None, which uses the estimator's score method. |
| cv | int, cross-validation generator, or iterable | No | Cross-validation splitting strategy. An int sets the number of folds. Default is 5. |
| n_jobs | int | No | Number of parallel jobs; cross-validation is parallelized over the folds. Default is None. |
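The following sketch exercises several of these parameters together: a float for n_features_to_select, a named scorer, and an explicit splitter object for cv. The specific values are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# A splitter object gives control over shuffling and reproducibility
splitter = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

sfs_params = SequentialFeatureSelector(
    KNeighborsClassifier(n_neighbors=3),
    n_features_to_select=0.5,  # fraction: half of the 4 iris features
    scoring="accuracy",        # named scorer instead of estimator.score
    cv=splitter,
    n_jobs=1,
)
sfs_params.fit(X, y)
print(f"Number of features selected: {sfs_params.n_features_to_select_}")
```

Passing a splitter rather than an int is useful when the folds must be shuffled or seeded.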
Outputs
| Name | Type | Description |
|---|---|---|
| X_transformed | ndarray of shape (n_samples, n_features_to_select_) | Returned by transform: the input data with only the selected features. |
| support_ | ndarray of shape (n_features,), dtype bool | Fitted attribute: Boolean mask of selected features. |
| n_features_to_select_ | int | Fitted attribute: the number of features that were selected. |
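A short sketch of how these outputs relate to each other after fitting. The variable names (`sfs_out`, `selected_names`) are illustrative, and feature names are taken from the iris dataset bundle.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target

sfs_out = SequentialFeatureSelector(
    KNeighborsClassifier(n_neighbors=3), n_features_to_select=2
)
sfs_out.fit(X, y)

# Boolean mask with one entry per original feature
print(sfs_out.support_)

# The same selection as integer column indices
selected_idx = sfs_out.get_support(indices=True)
print(selected_idx)

# Map the mask back to human-readable feature names
selected_names = np.array(iris.feature_names)[sfs_out.support_]
print(selected_names)

# transform keeps exactly the columns flagged in support_
X_transformed = sfs_out.transform(X)
print(X_transformed.shape)
```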
Usage Examples
Basic Usage
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

# Load the iris dataset (150 samples, 4 features)
X, y = load_iris(return_X_y=True)

# Greedily pick 2 features by forward selection, scored with 5-fold CV
knn = KNeighborsClassifier(n_neighbors=3)
sfs = SequentialFeatureSelector(knn, n_features_to_select=2, direction="forward")
sfs.fit(X, y)

# Boolean mask over the original features
print(f"Selected features: {sfs.get_support()}")

# Keep only the selected columns
X_selected = sfs.transform(X)
print(f"Shape after selection: {X_selected.shape}")
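For comparison, here is a hedged sketch of backward selection on the same data. Keeping 3 of the 4 iris features needs only one removal round, whereas forward selection would need three addition rounds to reach the same count. The target of 3 features is an arbitrary choice for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=3)

# Start from all 4 features and drop the one whose removal hurts least
sfs_backward = SequentialFeatureSelector(
    knn, n_features_to_select=3, direction="backward"
)
sfs_backward.fit(X, y)

print(f"Selected features: {sfs_backward.get_support()}")
X_reduced = sfs_backward.transform(X)
print(f"Shape after selection: {X_reduced.shape}")
```

Forward and backward selection are not guaranteed to agree on the same subset, since each follows a different greedy path.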