Implementation:Scikit learn Scikit learn RFE
| Knowledge Sources | |
|---|---|
| Domains | Feature Selection, Model Evaluation |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool for feature ranking with recursive feature elimination provided by scikit-learn.
Description
RFE (Recursive Feature Elimination) selects features by recursively considering smaller and smaller sets of features. First, the estimator is trained on the initial set of features and the importance of each feature is obtained. Then, the least important features are pruned from the current set. This process is repeated recursively until the desired number of features is reached. The module also provides RFECV, which performs RFE with cross-validation to automatically select the optimal number of features.
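The elimination loop described above can be sketched by hand. This is a simplified illustration, not scikit-learn's implementation: `manual_rfe` is a hypothetical helper, and using squared logistic-regression coefficients as the importance score is an assumption made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def manual_rfe(X, y, n_features_to_select, step=1):
    """Hypothetical helper: return a boolean mask of the surviving features."""
    support = np.ones(X.shape[1], dtype=bool)
    while support.sum() > n_features_to_select:
        active = np.flatnonzero(support)  # column indices still in play
        model = LogisticRegression(max_iter=1000).fit(X[:, active], y)
        # Assumed importance score: squared coefficients, summed over classes
        importances = (model.coef_ ** 2).sum(axis=0)
        # Never remove more features than needed to reach the target count
        n_remove = min(step, support.sum() - n_features_to_select)
        weakest = np.argsort(importances)[:n_remove]
        support[active[weakest]] = False
    return support


X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, random_state=0
)
mask = manual_rfe(X, y, n_features_to_select=4)
print(mask)
```

Each pass refits the model on the remaining columns only, so importances are always computed relative to the current feature set rather than the original one.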
Usage
Use RFE when you want to select a fixed number of features by recursively removing the least important ones, as judged by an external estimator's feature weights. Use RFECV when you also want the number of features chosen automatically through cross-validation. Both need an estimator that exposes feature importances through a coef_ or feature_importances_ attribute after fitting, unless a custom importance_getter is supplied.
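A minimal sketch of the RFECV case follows. The dataset, the choice of LogisticRegression, and the parameter values are illustrative assumptions, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic data (assumed for illustration): 15 features, 4 informative
X, y = make_classification(
    n_samples=200, n_features=15, n_informative=4, random_state=0
)

selector = RFECV(
    estimator=LogisticRegression(max_iter=1000),
    step=1,                      # drop one feature per iteration
    min_features_to_select=2,    # never go below two features
    cv=StratifiedKFold(n_splits=5),
    scoring="accuracy",
)
selector.fit(X, y)

# The number of features is chosen by cross-validated score, not fixed up front
print(f"Features kept: {selector.n_features_}")
print(f"Support mask: {selector.support_}")
```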
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/feature_selection/_rfe.py
Signature
class RFE(SelectorMixin, MetaEstimatorMixin, BaseEstimator):
    def __init__(
        self,
        estimator,
        *,
        n_features_to_select=None,
        step=1,
        verbose=0,
        importance_getter="auto",
    ):

class RFECV(RFE):
    def __init__(
        self,
        estimator,
        *,
        step=1,
        min_features_to_select=1,
        cv=None,
        scoring=None,
        verbose=0,
        n_jobs=None,
        importance_getter="auto",
    ):
Import
from sklearn.feature_selection import RFE
from sklearn.feature_selection import RFECV
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| estimator | estimator instance | Yes | A supervised learning estimator with a fit method that provides feature importance via coef_ or feature_importances_. |
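| n_features_to_select | int or float | No | Number of features to select (RFE only). An int is an absolute count; a float between 0 and 1 is the fraction of features to keep. If None, half of the features are selected. Default is None. |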
| step | int or float | No | Number (or fraction) of features to remove at each iteration. Default is 1. |
| verbose | int | No | Controls verbosity of output. Default is 0. |
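| importance_getter | str or callable | No | How to get feature importances. With "auto", importances are read from the estimator's coef_ or feature_importances_ attribute. A string names an attribute to read instead, and a callable receives the fitted estimator and returns one importance per feature. Default is "auto". |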
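| min_features_to_select | int | No | Minimum number of features to keep (RFECV only). Default is 1. |
| cv | int, cross-validation generator, or iterable | No | Cross-validation splitting strategy (RFECV only). Default is None (5-fold). |
| scoring | str or callable | No | Scoring strategy (RFECV only). Default is None (estimator's default scorer). |
| n_jobs | int or None | No | Number of cores used to fit across folds in parallel (RFECV only). Default is None. |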
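The float forms of n_features_to_select and step, and a string importance_getter, can be combined as in the sketch below. The pipeline, its step name `clf`, and the dataset are assumptions chosen for illustration. Because a Pipeline does not expose coef_ directly, the getter string points at the attribute on the final step.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumed for illustration): 12 features
X, y = make_classification(
    n_samples=200, n_features=12, n_informative=4, random_state=0
)

# The step name "clf" is an arbitrary choice for this example
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

selector = RFE(
    pipe,
    n_features_to_select=0.5,   # keep half of the 12 features
    step=0.25,                  # remove a quarter of the original features per iteration
    importance_getter="named_steps.clf.coef_",  # read coef_ from the last step
)
selector.fit(X, y)
print(f"Features kept: {selector.n_features_}")
```

Scaling inside the wrapped pipeline keeps the coefficients comparable across features, which matters when coefficient magnitude is used as the importance score.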
Outputs
| Name | Type | Description |
|---|---|---|
| X_transformed | ndarray of shape (n_samples, n_features_) | Returned by transform: the input data with only the selected features retained. |
| support_ | ndarray of shape (n_features,) | Boolean mask of selected features. |
| ranking_ | ndarray of shape (n_features,) | Feature ranking. Selected features have rank 1; larger ranks mark features that were eliminated earlier. |
| n_features_ | int | The number of selected features. |
| estimator_ | estimator instance | The estimator fitted on the selected features only. |
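The fitted attributes listed above can be inspected as in this short sketch. The regression dataset and LinearRegression are assumptions for illustration; get_support comes from the selector interface that RFE inherits.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Synthetic data (assumed for illustration): 8 features, keep 3
X, y = make_regression(
    n_samples=100, n_features=8, n_informative=3, random_state=0
)
selector = RFE(LinearRegression(), n_features_to_select=3).fit(X, y)

# Column indices of the selected features (equivalent to np.flatnonzero(support_))
selected_idx = selector.get_support(indices=True)
X_transformed = selector.transform(X)

print(f"Selected columns: {selected_idx}")
print(f"Ranking: {selector.ranking_}")
print(f"Transformed shape: {X_transformed.shape}")
# estimator_ was refit on the reduced data, so it has one coefficient per kept feature
print(f"Final coefficients: {selector.estimator_.coef_}")
```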
Usage Examples
Basic Usage
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Synthetic data: 20 features, 5 of them informative
X, y = make_classification(n_samples=200, n_features=20, n_informative=5, random_state=42)

# A linear kernel is needed so that the fitted SVC exposes coef_
estimator = SVC(kernel="linear")
selector = RFE(estimator, n_features_to_select=5, step=1)
selector = selector.fit(X, y)

print(f"Selected features: {selector.support_}")
print(f"Feature ranking: {selector.ranking_}")

# Keep only the 5 selected columns
X_selected = selector.transform(X)
print(f"Shape after selection: {X_selected.shape}")
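When the selected features feed a downstream model, one common pattern is to place RFE inside a Pipeline so that selection is refit on each training fold during evaluation. The sketch below assumes LogisticRegression for both the selector and the final classifier; the step names are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=42
)

# Step names "select" and "clf" are arbitrary choices for this example
pipe = Pipeline([
    ("select", RFE(LogisticRegression(max_iter=1000), n_features_to_select=5)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Feature selection happens inside each fold, so held-out data never influences it
scores = cross_val_score(pipe, X, y, cv=5)
print(f"Mean accuracy: {scores.mean():.3f}")
```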