Principle:DistrictDataLabs Yellowbrick Feature Ranking

Knowledge Sources	Yellowbrick Docs; Yellowbrick; Shapiro & Wilk, 1965
Domains	Machine_Learning, Feature_Analysis, Visualization
Last Updated	2026-02-08 00:00 GMT

Overview

Feature ranking is the process of scoring and ordering individual features or feature pairs according to a statistical measure of their quality, relevance, or distributional characteristics.

Description

Feature ranking assigns a numeric score to each feature (in the univariate case) or to each pair of features (in the bivariate case) using a chosen statistical algorithm, then presents those scores visually so that analysts can quickly identify the most informative or problematic dimensions in a dataset.

In the univariate (1D) case, each feature is evaluated independently. A common algorithm is the Shapiro-Wilk test, which measures how closely a feature's distribution resembles a normal distribution. Features whose Shapiro-Wilk statistic is close to 1 are approximately Gaussian, which can be important for algorithms that assume normality.

In the bivariate (2D) case, every pair of features is compared to produce a symmetric matrix of scores. Common algorithms include Pearson correlation (linear relationship), Spearman rank correlation (monotonic relationship), Kendall tau (ordinal association), and covariance (joint variability). These pairwise scores reveal redundancy between features, potential multicollinearity, and clusters of correlated variables.
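As a minimal sketch of how such a symmetric matrix can be assembled, the following pure-Python example scores every pair of feature columns with the sample covariance. The helper names `covariance` and `pairwise_scores` and the toy data are hypothetical and chosen for illustration only; this is not Yellowbrick's implementation.

```python
from itertools import combinations


def covariance(xs, ys):
    """Sample covariance (dividing by n - 1) of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pairwise_scores(columns, score):
    """Build a symmetric matrix of scores for every pair of feature columns."""
    p = len(columns)
    matrix = [[0.0] * p for _ in range(p)]
    for j in range(p):
        # Diagonal: each feature scored against itself.
        matrix[j][j] = score(columns[j], columns[j])
    for j, k in combinations(range(p), 2):
        # Each unordered pair is scored once and mirrored across the diagonal.
        matrix[j][k] = matrix[k][j] = score(columns[j], columns[k])
    return matrix


# Hypothetical toy data: three features observed on four samples.
features = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
cov_matrix = pairwise_scores(features, covariance)
```

Because the score function is passed in as an argument, the same loop could be reused with any symmetric pairwise measure.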

Usage

Feature ranking is used during exploratory data analysis and feature selection to:

Identify uninformative features that have low variance or unusual distributions.
Detect multicollinearity by finding pairs of features with very high correlation.
Guide feature selection by revealing which features carry independent information.
Validate preprocessing by confirming that normalization or scaling has produced the expected distributional properties.

It is especially useful when the number of features $p$ is moderate (up to a few hundred), since the pairwise comparison produces a $p \times p$ matrix that becomes unwieldy for very high-dimensional data.

Theoretical Basis

Univariate Ranking: Shapiro-Wilk Test

The Shapiro-Wilk test statistic $W$ is defined as:

$W = \frac{{(\sum_{i = 1}^{n} a_{i} x_{(i)})}^{2}}{\sum_{i = 1}^{n} (x_{i} - \bar{x})^{2}}$

where $x_{(i)}$ are the ordered sample values (the order statistics), $\bar{x}$ is the sample mean, and $a_{i}$ are constants derived from the expected values and covariance matrix of the order statistics of a standard normal sample of size $n$. $W$ is at most 1, and a value close to 1 is consistent with normality.
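For general $n$ the coefficients $a_{i}$ must be tabulated or approximated numerically, so in practice a library routine such as `scipy.stats.shapiro` is used. For $n = 3$, however, symmetry and normalization fix them at $a = (-1/\sqrt{2},\ 0,\ 1/\sqrt{2})$, which allows a small worked sketch of the formula. The function name `shapiro_wilk_n3` is hypothetical and the code handles only this special case.

```python
import math


def shapiro_wilk_n3(sample):
    """Shapiro-Wilk W for exactly three observations.

    Assumes the closed-form n = 3 coefficients a = (-1/sqrt(2), 0, 1/sqrt(2)).
    The result is undefined (division by zero) if all three values are equal.
    """
    if len(sample) != 3:
        raise ValueError("this sketch only handles n = 3")
    ordered = sorted(sample)  # the order statistics x_(1) <= x_(2) <= x_(3)
    coeffs = (-1 / math.sqrt(2), 0.0, 1 / math.sqrt(2))
    mean = sum(ordered) / 3
    numerator = sum(a * x for a, x in zip(coeffs, ordered)) ** 2
    denominator = sum((x - mean) ** 2 for x in ordered)
    return numerator / denominator


w_even = shapiro_wilk_n3([2.0, 0.0, 1.0])    # evenly spaced values
w_skewed = shapiro_wilk_n3([0.0, 0.0, 1.0])  # two tied values, one far away
```

Evenly spaced values give $W = 1$, while the lopsided sample gives $W = 0.75$, the smallest value attainable with three observations.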

Bivariate Ranking: Pearson Correlation

The Pearson correlation coefficient between features $X_{j}$ and $X_{k}$ is:

$r_{j k} = \frac{\sum_{i = 1}^{n} (x_{i j} - {\bar{x}}_{j}) (x_{i k} - {\bar{x}}_{k})}{\sqrt{\sum_{i = 1}^{n} (x_{i j} - {\bar{x}}_{j})^{2} \sum_{i = 1}^{n} (x_{i k} - {\bar{x}}_{k})^{2}}}$

This yields values in $[-1, 1]$, where $|r| = 1$ denotes perfect linear dependence.
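The formula can be transcribed term by term into a short function. This is an illustrative sketch rather than a library implementation; the name `pearson_r` and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences, following the formula above."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations.
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r_linear = pearson_r([1, 2, 3, 4], [3, 5, 7, 9])    # y = 2x + 1
r_inverse = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # y = 10 - 2x
r_curved = pearson_r([1, 2, 3, 4], [1, 4, 9, 16])   # y = x^2
```

The two exactly linear relationships give $r = 1$ and $r = -1$ (up to floating-point rounding), while the quadratic relationship, although perfectly monotonic on this range, gives a value slightly below 1.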

Bivariate Ranking: Spearman Rank Correlation

Spearman's $\rho$ applies the Pearson formula to the rank-transformed data, so it measures the strength of any monotonic relationship, linear or not, and is less sensitive to outliers than Pearson's $r$.

Bivariate Ranking: Kendall Tau

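Kendall's $\tau$ compares the number of concordant and discordant pairs of observations. Two observations form a concordant pair when they are ordered the same way on both features, and a discordant pair when they are ordered in opposite ways:

$\tau = \frac{(\text{number of concordant pairs}) - (\text{number of discordant pairs})}{\binom{n}{2}}$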
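The pair counting can be sketched directly with a quadratic-time loop over all pairs of observations. This implements the formula exactly as written above (often called tau-a), in which tied pairs count as neither concordant nor discordant; library routines such as `scipy.stats.kendalltau` default to the tau-b variant, which adjusts the denominator for ties. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations
from math import comb


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / C(n, 2)."""
    concordant = 0
    discordant = 0
    for (x_i, y_i), (x_j, y_j) in combinations(zip(xs, ys), 2):
        direction = (x_i - x_j) * (y_i - y_j)
        if direction > 0:
            concordant += 1   # both features order the pair the same way
        elif direction < 0:
            discordant += 1   # the features order the pair in opposite ways
        # direction == 0 means a tie in at least one feature: counted as neither
    return (concordant - discordant) / comb(len(xs), 2)


# One adjacent swap in an otherwise identical ordering.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

With four observations there are six pairs; five are concordant and one is discordant, so $\tau = (5 - 1) / 6 = 2/3$.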

Related Pages

Implemented By

Implementation:DistrictDataLabs_Yellowbrick_Rank2D_Visualizer
