Principle: DistrictDataLabs Yellowbrick Feature Importance Ranking
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Model_Selection, Hyperparameter_Tuning |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Feature importance ranking is a model interpretation technique that quantifies the contribution of each input feature to a model's predictions, enabling practitioners to identify the most informative variables and guide feature engineering decisions.
Description
In machine learning, not all input features contribute equally to a model's predictive power. Some features carry strong signal, while others may be noisy, redundant, or irrelevant. Feature importance ranking provides a quantitative measure of each feature's contribution, allowing practitioners to understand which variables drive the model's decisions.
There are two primary mechanisms by which models expose feature importance. Tree-based models such as Random Forests and Gradient Boosting Machines compute `feature_importances_` based on criteria such as the total reduction in impurity (e.g. Gini impurity or entropy) that each feature provides across all splits in all trees. Linear models such as Logistic Regression and Support Vector Machines expose `coef_` (coefficient) arrays that represent the weight assigned to each feature in the decision function. The magnitude (and optionally the sign) of these coefficients indicates how strongly each feature influences predictions.
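Both mechanisms feed the same ranking workflow. The sketch below uses Yellowbrick's FeatureImportances visualizer with a tree-based model; the `yellowbrick.model_selection` import path, the `labels` keyword, and the iris dataset are assumptions based on recent Yellowbrick releases and are illustrative only.

```python
# Minimal sketch: ranking impurity-based importances with Yellowbrick.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from yellowbrick.model_selection import FeatureImportances

data = load_iris()
X, y = data.data, data.target

# The visualizer fits the wrapped estimator, reads its feature_importances_
# attribute, and renders a ranked horizontal bar chart.
model = RandomForestClassifier(n_estimators=100, random_state=42)
viz = FeatureImportances(model, labels=data.feature_names)
viz.fit(X, y)
viz.show()
```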
For multi-class classifiers, coefficient arrays may be multi-dimensional with shape (n_classes, n_features). In such cases, importances can be aggregated by computing the mean coefficient magnitude across classes, or they can be visualized per-class using stacked representations. Feature importance rankings can be displayed as absolute values for easier comparison, or in relative terms normalized to the strongest feature. Practitioners commonly examine the top-N or bottom-N features to focus on the most or least influential variables.
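For the multi-class, coefficient-based case, a hedged sketch is shown below. The `stack`, `relative`, and `topn` keywords are assumed from recent Yellowbrick releases and may not exist in older versions; the wine dataset and scaling step are illustrative choices.

```python
# Sketch: per-class coefficient importances for a multi-class linear model.
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import scale
from yellowbrick.model_selection import FeatureImportances

data = load_wine()
X, y = scale(data.data), data.target  # scaling helps the solver converge

# coef_ has shape (n_classes, n_features); stack=True draws one stacked bar
# segment per class instead of averaging coefficient magnitudes across classes.
model = LogisticRegression(max_iter=1000)
viz = FeatureImportances(
    model,
    labels=data.feature_names,
    stack=True,      # per-class stacked bars rather than the mean magnitude
    relative=False,  # plot raw magnitudes instead of percent of the maximum
    topn=5,          # focus on the five most influential features
)
viz.fit(X, y)
viz.show()
```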
Usage
Feature importance ranking should be used when:
- You want to understand which features are driving your model's predictions.
- You need to perform feature selection by identifying and removing low-importance features (a sketch follows this list).
- You want to communicate model interpretability to stakeholders.
- You are debugging a model that may be relying on spurious or unexpected features.
- You need to reduce model complexity by dropping uninformative features.
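As a sketch of the feature-selection use case above, importances can drive automated pruning of uninformative features. The example below uses scikit-learn's SelectFromModel; the breast cancer dataset and the `"mean"` threshold are illustrative assumptions, not a prescribed workflow.

```python
# Sketch: drop features whose impurity-based importance falls below the mean.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_breast_cancer(return_X_y=True)

# Fit a forest, then keep only features whose importance exceeds the mean
# importance across all features.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=0),
    threshold="mean",
)
X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)  # fewer columns after selection
```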
Theoretical Basis
Tree-Based Importance
For tree-based ensembles, the importance of a feature $f$ is computed as the total weighted reduction in the splitting criterion across all nodes where $f$ is used:

$$I(f) = \sum_{n \in N_f} w_n \, \Delta i(n)$$

where $N_f$ is the set of all tree nodes that split on feature $f$, $w_n$ is the fraction of samples reaching node $n$, and $\Delta i(n)$ is the impurity decrease at that node.
For Gini impurity:

$$i(n) = 1 - \sum_{k=1}^{K} p_k(n)^2$$

where $p_k(n)$ is the proportion of class $k$ samples at node $n$.
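To make the formulas concrete, here is a small self-contained sketch that computes the Gini impurity of a node and the weighted impurity decrease for a single split; all names and the toy data are illustrative.

```python
import numpy as np

def gini(labels):
    """Gini impurity i(n) = 1 - sum_k p_k(n)^2 for the labels at a node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def weighted_impurity_decrease(parent, left, right, n_total):
    """w_n * delta_i(n): the node's sample fraction times its impurity drop."""
    w = len(parent) / n_total
    drop = gini(parent) - (
        len(left) / len(parent) * gini(left)
        + len(right) / len(parent) * gini(right)
    )
    return w * drop

# Toy example: a perfectly separating split of a balanced binary node.
parent = np.array([0, 0, 1, 1])
print(weighted_impurity_decrease(parent, parent[:2], parent[2:], n_total=4))
# 1.0 * (0.5 - 0.0) = 0.5
```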
Coefficient-Based Importance
For linear models, the importance of feature $j$ is derived from the model coefficient:

$$I(j) = |\beta_j|$$

For multi-class models with coefficient matrix $B \in \mathbb{R}^{C \times d}$ (one row of $d$ coefficients per class), the aggregated importance is:

$$I(j) = \frac{1}{C} \sum_{c=1}^{C} |B_{cj}|$$
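The aggregation is a one-line reduction over the coefficient matrix; the sketch below uses a made-up 3-class, 3-feature matrix purely for illustration.

```python
# Sketch: collapse a (n_classes, n_features) coefficient matrix into a single
# importance per feature by averaging absolute values across classes.
import numpy as np

coef = np.array([
    [ 0.8, -0.1,  0.0],   # class 0 coefficients
    [-0.4,  0.5,  0.1],   # class 1 coefficients
    [ 0.2, -0.3,  0.9],   # class 2 coefficients
])

importances = np.abs(coef).mean(axis=0)   # mean |B_cj| over classes c
print(importances)                        # [0.4667 0.3    0.3333]
```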
Relative Importance
Relative importance normalizes all values to the maximum, expressing each feature as a percentage of the strongest one:

$$I_{\text{rel}}(j) = \frac{I(j)}{\max_k I(k)} \times 100$$
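A minimal sketch of this normalization, using made-up importance values:

```python
# Sketch: convert raw importances to percent of the strongest feature.
import numpy as np

importances = np.array([0.05, 0.40, 0.15, 0.20])
relative = 100.0 * importances / importances.max()
print(relative)   # [ 12.5 100.   37.5  50. ]
```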