Implementation: DistrictDataLabs Yellowbrick RFECV Visualizer
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Model_Selection, Visualization |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool for performing recursive feature elimination with cross-validation and visualizing the optimal number of features, provided by the Yellowbrick library.
Description
The RFECV visualizer performs recursive feature elimination with cross-validation to determine the optimal number of features for a given estimator. It wraps scikit-learn's sklearn.feature_selection.RFE and sklearn.model_selection.cross_val_score internally (note: it does not wrap sklearn.feature_selection.RFECV because it needs access to the internals of both the CV and RFE processes for visualization).
The class extends ModelVisualizer from the Yellowbrick base module. When fit(X, y) is called, the visualizer creates feature subset sizes based on the total number of features and the step parameter. For each subset size, it configures an RFE instance, performs cross-validation, and collects the scores. The subset with the highest mean cross-validated score is selected as optimal. A final RFE model is fit with that optimal number of features and stored as rfe_estimator_, which is also set as the wrapped model so the visualizer can be used directly for predictions.
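The fit procedure described above can be sketched with plain scikit-learn pieces. This is an illustrative approximation, not Yellowbrick's actual source: it evaluates an RFE instance at each subset size with cross_val_score and picks the size with the highest mean score (the dataset, estimator, and 3-fold CV here are assumptions for the example).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy data standing in for X, y
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
n_features = X.shape[1]
step = 1
subset_sizes = np.arange(1, n_features + 1, step)

# For each candidate subset size, score an RFE-reduced model via CV
scores = []
for n in subset_sizes:
    rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=int(n))
    scores.append(cross_val_score(rfe, X, y, cv=3))
scores = np.array(scores)  # shape (n_subsets, n_splits), like cv_scores_

# The subset size with the highest mean CV score is "optimal"
optimal_n = int(subset_sizes[scores.mean(axis=1).argmax()])
```

A final `RFE(estimator, n_features_to_select=optimal_n).fit(X, y)` would then play the role of `rfe_estimator_`.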
The visualization plots the mean cross-validated score against the number of features selected, with a shaded band for one standard deviation. A vertical dashed line marks the optimal number of features.
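The plot itself can be reproduced with a few matplotlib calls. A minimal sketch, using made-up scores (the subset sizes and score matrix below are assumptions, not real results):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for the sketch
import matplotlib.pyplot as plt

# Fabricated CV scores: rows are subset sizes, columns are CV splits
sizes = np.arange(1, 11)
scores = np.random.RandomState(0).uniform(0.6, 0.9, size=(10, 3))
means, stds = scores.mean(axis=1), scores.std(axis=1)

fig, ax = plt.subplots()
ax.plot(sizes, means, marker="o", label="mean CV score")
ax.fill_between(sizes, means - stds, means + stds, alpha=0.25)  # one-std band
ax.axvline(sizes[means.argmax()], ls="--", c="k")  # optimal feature count
ax.set(xlabel="Number of Features Selected", ylabel="Score")
```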
Usage
Use this visualizer when you need to determine the optimal number of features for a model that exposes coef_ or feature_importances_ after fitting. The fitted visualizer can also serve as a predictor since it wraps the final RFE estimator.
Code Reference
Source Location
- Repository: yellowbrick
- File: yellowbrick/model_selection/rfecv.py
- Class Lines: L35-261 (class), L140-142 (__init__), L153-219 (fit)
- Quick Method Lines: L268-365
Signature
class RFECV(ModelVisualizer):
def __init__(
self, estimator, ax=None, step=1, groups=None, cv=None, scoring=None, **kwargs
):
Import
from yellowbrick.model_selection import RFECV
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| estimator | scikit-learn estimator | Yes | A model with coef_ or feature_importances_ after fitting. Cloned for each validation. |
| ax | matplotlib.Axes | No | The axes object to plot on. Default: None (current axes). |
| step | int or float | No | Number of features to remove per iteration (int >= 1) or fraction to remove (0.0 < float < 1.0). Default: 1. |
| groups | array-like, shape (n_samples,) | No | Group labels for train/test splitting. Default: None. |
| cv | int, CV generator, or iterable | No | Cross-validation splitting strategy. Default: None (3-fold). |
| scoring | string, callable, or None | No | Scoring metric. Default: None (estimator's default scorer). |
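To make the step semantics concrete, here is a hypothetical helper (`subset_sizes` is not a Yellowbrick function) that maps the step parameter to the list of feature-subset sizes that would be evaluated, treating a fraction as a per-pass share of the total feature count:

```python
def subset_sizes(n_features, step):
    """Illustrative sketch: sizes evaluated for a given step (not library code)."""
    if isinstance(step, float) and 0.0 < step < 1.0:
        step = max(1, int(step * n_features))  # fraction -> feature count
    if step < 1:
        raise ValueError("step must be an int >= 1 or a float in (0, 1)")
    # Always include the full feature set as the largest subset
    return list(range(1, n_features, step)) + [n_features]

print(subset_sizes(10, 1))     # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(subset_sizes(10, 3))     # [1, 4, 7, 10]
print(subset_sizes(10, 0.25))  # [1, 3, 5, 7, 9, 10]
```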
The fit(X, y) method accepts:
| Name | Type | Required | Description |
|---|---|---|---|
| X | array-like, shape (n_samples, n_features) | Yes | Training feature matrix. |
| y | array-like, shape (n_samples,) | No | Target values for classification or regression. |
Outputs
| Name | Type | Description |
|---|---|---|
| n_features_ | int | The number of features in the selected optimal subset. |
| support_ | array, shape (n_features,) | Boolean mask of selected features. |
| ranking_ | array, shape (n_features,) | Feature ranking where rank 1 indicates a selected feature. |
| cv_scores_ | array, shape (n_subsets, n_splits) | Cross-validation scores for each feature subset and CV split. |
| rfe_estimator_ | sklearn.feature_selection.RFE | The fitted RFE estimator wrapping the original model. |
| n_feature_subsets_ | array, shape (n_subsets,) | The number of features evaluated at each RFE iteration. |
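Because rfe_estimator_ is a fitted sklearn.feature_selection.RFE, the support_ and ranking_ attributes follow RFE's conventions. A minimal sketch with plain RFE (the dataset and estimator here are assumptions for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=6, n_informative=3,
                           random_state=0)
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3).fit(X, y)

print(rfe.support_)   # boolean mask over the 6 features
print(rfe.ranking_)   # rank 1 == selected feature
X_reduced = X[:, rfe.support_]  # keep only the selected subset
print(X_reduced.shape)          # (100, 3)
```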
Usage Examples
Basic Usage
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from yellowbrick.model_selection import RFECV
# Example data; any supervised dataset with a feature matrix and target works
X, y = make_classification(n_samples=500, n_features=15, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
# Create and fit the visualizer
viz = RFECV(RandomForestClassifier(n_estimators=100), cv=5, scoring="f1_weighted")
viz.fit(X_train, y_train)
viz.show()
# The fitted visualizer wraps the final RFE model, so it can predict directly
y_pred = viz.predict(X_test)
Quick Method
from sklearn.ensemble import RandomForestClassifier
from yellowbrick.model_selection import rfecv
# One call fits, draws, and returns the fitted RFECV visualizer
# (reusing X_train and y_train from the example above)
viz = rfecv(RandomForestClassifier(n_estimators=100), X_train, y_train, cv=5)