Principle:DistrictDataLabs Yellowbrick PCA Projection
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Feature_Analysis, Visualization |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Principal Component Analysis (PCA) projection is a linear dimensionality reduction technique that projects high-dimensional data onto its largest variance directions, enabling visualization of feature structure in two or three dimensions.
Description
PCA finds the orthogonal directions (principal components) along which the data varies the most. By projecting the data onto the first two or three principal components, the technique produces a low-dimensional representation that preserves the maximum amount of variance from the original feature space. This projection is commonly used for visualization because it provides the most informative linear view of the data in the fewest dimensions.
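As a minimal sketch of this projection step, the following example (assuming scikit-learn and the bundled Iris dataset, neither of which is prescribed above) reduces the data to two components and reports how much of the original variance the projection retains.

```python
# Project a dataset onto its first two principal components and report
# the fraction of total variance the 2D projection preserves.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)

pca = PCA(n_components=2)
Z = pca.fit_transform(X)  # shape: (n_samples, 2)

print("Projected shape:", Z.shape)
print("Variance preserved:", pca.explained_variance_ratio_.sum())
```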
Before applying PCA, it is standard practice to center the data (and optionally scale it) so that each feature contributes proportionally to the variance computation. When features have different units or magnitudes, scaling each feature by its standard deviation ensures that no single feature dominates the principal components simply because of its scale.
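To illustrate why scaling matters, the sketch below compares the explained-variance ratios with and without standardization; the Wine dataset is used here only as an example of features with very different magnitudes.

```python
# Compare PCA on raw versus standardized features. Without scaling,
# large-magnitude features dominate the leading principal component.
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_wine(return_X_y=True)

raw = PCA(n_components=2).fit(X)
scaled = PCA(n_components=2).fit(StandardScaler().fit_transform(X))

print("Unscaled explained variance ratio:", raw.explained_variance_ratio_)
print("Scaled explained variance ratio:  ", scaled.explained_variance_ratio_)
```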
The resulting scatter plot in 2D or 3D can be colored by target class or regression value, revealing how well the classes separate in the principal component space. Additionally, biplots can be produced by projecting the original feature axes into the principal component space, showing which features contribute most to each component.
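A sketch of this workflow with the Yellowbrick PCA visualizer follows; the class name and keyword arguments (scale, projection, proj_features) reflect the Yellowbrick 1.x API as documented and should be checked against the installed version.

```python
# Yellowbrick PCA visualizer: scatter the data in PC space, color points by
# class, and overlay biplot arrows for the original features.
from sklearn.datasets import load_iris
from yellowbrick.features import PCA as PCAVisualizer

X, y = load_iris(return_X_y=True)

# scale=True standardizes the features before projecting; proj_features=True
# draws the feature loadings as arrows (a biplot).
viz = PCAVisualizer(scale=True, projection=2, proj_features=True)
viz.fit_transform(X, y)  # points are colored by the target y
viz.show()
```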
Usage
PCA projection is used to:
- Visualize high-dimensional data in 2D or 3D for exploratory analysis.
- Assess class separability in the most variance-preserving linear subspace.
- Identify dominant features through biplot arrows showing feature contributions.
- Detect outliers that appear far from the main data cloud in the projected space (see the sketch after this list).
- Diagnose preprocessing by verifying that scaling produces well-distributed components.
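The outlier-detection use above can be sketched as follows; the 2D projection, the Wine dataset, and the three-standard-deviation threshold are all illustrative assumptions rather than fixed choices.

```python
# Flag points that lie unusually far from the centroid of the data cloud
# in the 2D principal-component space.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_wine(return_X_y=True)
Z = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))

dist = np.linalg.norm(Z - Z.mean(axis=0), axis=1)
outliers = np.flatnonzero(dist > dist.mean() + 3 * dist.std())
print("Candidate outlier row indices:", outliers)
```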
Theoretical Basis
Eigenvalue Decomposition
Given a centered data matrix $X \in \mathbb{R}^{n \times d}$ (rows are samples, columns are features), PCA computes the covariance matrix:

$$C = \frac{1}{n-1} X^{\top} X$$

and finds its eigendecomposition:

$$C = V \Lambda V^{\top}$$

where $V$ contains the eigenvectors (principal components) as columns and $\Lambda$ is a diagonal matrix of eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d \ge 0$.
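The decomposition can be reproduced directly with NumPy; the sketch below centers the data, forms the covariance matrix, and sorts the eigenpairs by decreasing eigenvalue (the Iris dataset is an illustrative assumption).

```python
# Compute the covariance matrix C of the centered data and its eigendecomposition.
import numpy as np
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)
Xc = X - X.mean(axis=0)                # center each feature
C = (Xc.T @ Xc) / (Xc.shape[0] - 1)    # covariance matrix

eigvals, eigvecs = np.linalg.eigh(C)   # eigh: C is symmetric
order = np.argsort(eigvals)[::-1]      # sort by decreasing eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print("Eigenvalues (variance along each component):", eigvals)
```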
Projection
The projection onto the first $k$ principal components is:

$$Z = X V_k$$

where $V_k$ contains the first $k$ columns of $V$. The fraction of total variance explained by the first $k$ components is:

$$\frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{d} \lambda_i}$$
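Continuing in the same notation, this self-contained sketch (repeating the setup from the previous example so it runs on its own) forms $V_k$, projects the centered data, and computes the explained-variance fraction from the eigenvalues.

```python
# Project centered data onto the first k eigenvectors and compute the
# fraction of total variance those k components explain.
import numpy as np
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)
Xc = X - X.mean(axis=0)
C = (Xc.T @ Xc) / (Xc.shape[0] - 1)
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 2
V_k = eigvecs[:, :k]               # first k principal components
Z = Xc @ V_k                       # projected data, shape (n_samples, k)
explained = eigvals[:k].sum() / eigvals.sum()
print(f"Variance explained by first {k} components: {explained:.3f}")
```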
Biplots
In a biplot, each original feature's loadings (its row of $V_k$) are drawn as an arrow from the origin. The direction and length of the arrow indicate how much that feature contributes to the displayed principal components.
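A hand-rolled biplot can be sketched with matplotlib as below; the arrow scaling factor is an arbitrary choice for readability, and scikit-learn's components_ matrix is transposed so that each row holds one feature's loadings.

```python
# Scatter the projected points and draw each feature's loadings as an arrow.
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

data = load_iris()
X = StandardScaler().fit_transform(data.data)

pca = PCA(n_components=2)
Z = pca.fit_transform(X)
loadings = pca.components_.T           # shape (n_features, 2)

fig, ax = plt.subplots()
ax.scatter(Z[:, 0], Z[:, 1], c=data.target, alpha=0.5)
for name, (lx, ly) in zip(data.feature_names, loadings):
    ax.arrow(0, 0, 3 * lx, 3 * ly, color="red", head_width=0.05)
    ax.annotate(name, (3 * lx, 3 * ly), color="red")
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
plt.show()
```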