Implementation:DistrictDataLabs Yellowbrick PCA Visualizer
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Feature_Analysis, Visualization |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool for PCA-based dimensionality reduction and scatter plot visualization provided by the Yellowbrick library.
Description
The PCA visualizer (aliased as PCADecomposition) produces a two- or three-dimensional scatter plot of the data projected onto its leading principal components, ordered by explained variance. Internally, it constructs a scikit-learn Pipeline containing a StandardScaler (controlled by the scale parameter) followed by a sklearn.decomposition.PCA transformer. The resulting scatter plot can be colored by discrete class labels or continuous target values. Optional features include biplot arrows (showing each feature's contribution to the plotted components) and a heatmap below the scatter plot displaying the magnitude of each feature in the principal components.
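The scale-then-project step described above can be approximated with plain scikit-learn. This is a minimal sketch, not Yellowbrick's actual source: the helper name build_projection_pipeline and the step names "scale" and "pca" are hypothetical, and mapping the scale flag onto StandardScaler's with_std argument is an assumption about how the parameter is wired.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_projection_pipeline(scale=True, projection=2, random_state=None):
    """Hypothetical helper: centre (and optionally scale) the data, then apply PCA."""
    return Pipeline(
        [
            # Assumption: scale toggles unit-variance scaling; centring always happens.
            ("scale", StandardScaler(with_std=scale)),
            ("pca", PCA(n_components=projection, random_state=random_state)),
        ]
    )


X, y = load_iris(return_X_y=True)
pipeline = build_projection_pipeline(scale=True, projection=2, random_state=0)

# Xp holds one row per sample and one column per retained component.
Xp = pipeline.fit_transform(X)

# The fitted PCA step exposes the loadings as (n_components, n_features).
components = pipeline.named_steps["pca"].components_

print(Xp.shape, components.shape)
```

The projected matrix Xp is what ends up on the scatter plot axes, and components is the kind of matrix that biplot arrows and the heatmap are derived from.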
Usage
Use the PCA visualizer when you want to reduce a high-dimensional feature space to 2D or 3D for visual exploration. It is compatible with both classification (discrete target) and regression (continuous target) problems. Enable proj_features=True for biplot arrows or heatmap=True for the component contribution heatmap. Note that the heatmap is not compatible with 3D projections.
Code Reference
Source Location
- Repository: yellowbrick
- File: yellowbrick/features/pca.py
- Lines: PCA class at L43-458, quick method at L466-627
Signature
class PCA(ProjectionVisualizer):
    def __init__(
        self,
        ax=None,
        features=None,
        classes=None,
        scale=True,
        projection=2,
        proj_features=False,
        colors=None,
        colormap=None,
        alpha=0.75,
        random_state=None,
        colorbar=True,
        heatmap=False,
        **kwargs
    ):
Import
from yellowbrick.features import PCA
# or equivalently:
from yellowbrick.features.pca import PCADecomposition
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| ax | matplotlib Axes | No | The axes to plot on. If None, current axes are used or generated. |
| features | list | No | Feature names. Inferred from DataFrame columns if not provided. |
| classes | list | No | Class labels for the legend (discrete target only). Inferred from y if not provided. |
| scale | bool | No | Whether to scale data with StandardScaler before PCA. Default True. |
| projection | int | No | Number of dimensions to project into: 2 (default) or 3. |
| proj_features | bool | No | If True, draw biplot arrows showing feature contributions. Default False. |
| colors | list or tuple | No | Colors for each class or a single color for all points. |
| colormap | str or cmap | No | Matplotlib colormap for coloring points. |
| alpha | float | No | Transparency of scatter points. Default 0.75. |
| random_state | int, RandomState, or None | No | Random state for the PCA solver (used when randomized solver is enabled). Default None. |
| colorbar | bool | No | If True and target is continuous, draw a colorbar. Default True. |
| heatmap | bool | No | If True, draw a heatmap of feature contributions below the scatter plot. Not compatible with 3D. Default False. |
Outputs
| Name | Type | Description |
|---|---|---|
| pca_components_ | ndarray, shape (n_components, n_features) | The principal component loadings matrix from the fitted PCA transformer. |
| classes_ | ndarray, shape (n_classes,) | Class labels (discrete target only). |
| features_ | ndarray, shape (n_features,) | Feature names discovered or provided during fit. |
| Xp (returned by transform) | ndarray, shape (n_samples, projection) | The PCA-transformed feature matrix. |
| ax | matplotlib Axes | The axes object containing the rendered scatter plot. |
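To make the loadings matrix in the table above more concrete, the sketch below fits scikit-learn's PCA directly and reports which feature has the largest absolute loading on each component. It is illustrative only: it reads components_ from a standalone PCA rather than pca_components_ from a fitted visualizer, on the assumption that the two hold the same kind of matrix.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

data = load_iris()

# Standardise first, mirroring the scale=True default described above.
X_scaled = StandardScaler().fit_transform(data.data)

pca = PCA(n_components=2).fit(X_scaled)

# Rows are components, columns are features: shape (n_components, n_features).
loadings = pca.components_

for index, row in enumerate(loadings):
    # The feature with the largest absolute weight dominates this component.
    dominant = int(np.argmax(np.abs(row)))
    print(f"PC{index + 1}: strongest feature is {data.feature_names[dominant]}")
```

Absolute loadings are the natural quantity to compare here, since the sign of a principal component is arbitrary.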
Usage Examples
Basic Usage
from yellowbrick.features import PCA as PCAVisualizer
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
visualizer = PCAVisualizer(
    scale=True,
    projection=2,
    classes=["setosa", "versicolor", "virginica"],
    proj_features=True,
)
visualizer.fit(X, y)
visualizer.transform(X, y)
visualizer.show()
3D Projection
from yellowbrick.features import PCA as PCAVisualizer
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers the 3D projection on older matplotlib)
X, y = load_iris(return_X_y=True)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
visualizer = PCAVisualizer(ax=ax, projection=3, colors=['r', 'g', 'b'])
visualizer.fit(X, y)
visualizer.transform(X, y)
visualizer.show()
Quick Method
from yellowbrick.features import pca_decomposition
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
pca_decomposition(X, y, scale=True, projection=2, heatmap=True)