Principle:DistrictDataLabs Yellowbrick Class Prediction Error Analysis

Knowledge Sources	Yellowbrick Docs Yellowbrick
Domains	Machine_Learning, Classification, Model_Evaluation
Last Updated	2026-02-08 00:00 GMT

Overview

Class prediction error analysis is a diagnostic technique that visualizes how instances of each true class are distributed across predicted classes, using a stacked bar chart to reveal the nature and magnitude of misclassifications.

Description

Class prediction error analysis provides an alternative view of classifier performance that complements the confusion matrix. Instead of a grid of cells, it presents one bar per true class, with each bar segmented (stacked) by the predicted class labels. The height of the entire bar represents the support (total number of true instances) for that class, while each colored segment shows how many instances were predicted as a particular class. A perfectly accurate model would show bars consisting entirely of the correctly-predicted segment, while misclassifications appear as additional colored segments in the stack.

This visualization is particularly effective at revealing class imbalance and its interaction with prediction errors. When one class dominates the dataset, its bar will be much taller, making the scale of misclassification across classes immediately visible. The stacked nature of the chart also makes it easy to see which specific classes absorb the misclassified instances: if instances of class A are frequently misclassified as class B, the segment for class B will be prominently visible in class A's bar.

Class prediction error analysis fits within the model evaluation stage of a classification workflow. It is typically applied after training and scoring a classifier on a test set, and serves as a visual complement to accuracy scores, classification reports, and confusion matrices. It is especially useful when communicating results to non-technical stakeholders, as the bar chart format is generally more intuitive than a confusion matrix heatmap.

Usage

Use class prediction error analysis when you want an intuitive, bar-chart-based view of how well a classifier handles each class. It is especially useful for identifying which classes are problematic, understanding how misclassified instances are distributed across classes, and communicating classification performance to audiences unfamiliar with confusion matrices.

Theoretical Basis

Given true labels $\mathbf{y}$ and predicted labels $\hat{\mathbf{y}}$ over a set of $n$ instances with $|C|$ classes, class prediction error analysis constructs a count matrix $P$ of size $|C| \times |C|$ where:

$P_{i, j} = \left| \{ k : y_{k} = c_{i} \land \hat{y}_{k} = c_{j} \} \right|$
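The definition of $P$ can be sketched in plain Python as follows. The function name `prediction_counts` and the toy labels are hypothetical and chosen only to make the counting explicit; this is not Yellowbrick's implementation.

```python
def prediction_counts(y_true, y_pred, classes):
    """Return P, where P[i][j] counts instances of true class
    classes[i] that were predicted as classes[j]."""
    index = {label: i for i, label in enumerate(classes)}
    P = [[0] * len(classes) for _ in classes]
    for true_label, pred_label in zip(y_true, y_pred):
        P[index[true_label]][index[pred_label]] += 1
    return P


# Toy labels, invented for illustration.
classes = ["cat", "dog", "bird"]
y_true = ["cat", "cat", "cat", "dog", "dog", "bird"]
y_pred = ["cat", "dog", "cat", "dog", "dog", "cat"]

P = prediction_counts(y_true, y_pred, classes)
# Row 0 (true "cat"):  2 predicted cat, 1 predicted dog
# Row 1 (true "dog"):  2 predicted dog
# Row 2 (true "bird"): 1 predicted cat
```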

The matrix $P$ is identical to the confusion matrix with rows indexed by true class, but the visualization renders it differently. Each row $i$ of the matrix becomes a stacked bar, where the bar segments correspond to the columns $j$:

The segment for $j = i$ (correct predictions) represents the true positive count for class $c_{i}$
The segments for $j \neq i$ represent misclassifications of class $c_{i}$ into other classes

The total height of bar $i$ is the support for class $c_{i}$:

$\mathrm{Support}_{i} = \sum_{j} P_{i, j} = \left| \{ k : y_{k} = c_{i} \} \right|$

The fraction of bar $i$ occupied by segment $j$ is:

$\frac{P_{i, j}}{\mathrm{Support}_{i}}$

This provides a direct visual encoding of per-class recall: the proportion of the bar occupied by the correct-prediction segment equals the recall for that class.
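The relationship between support, segment fractions, and recall can be checked with a short sketch. The count matrix below is made up for illustration (rows are true classes, columns are predicted classes), and the code assumes every class has at least one true instance so that no support is zero.

```python
# Hypothetical 3x3 count matrix: rows = true class, columns = predicted class.
P = [
    [8, 2, 0],
    [1, 4, 0],
    [3, 0, 2],
]

# Total bar height for each true class (row sums).
support = [sum(row) for row in P]

# Share of each bar taken up by each predicted-class segment.
fractions = [[count / s for count in row] for row, s in zip(P, support)]

# The diagonal share of each bar is that class's recall.
recall = [fractions[i][i] for i in range(len(P))]
```

For this matrix the supports are 10, 5, and 5, and the third class has the lowest recall because three of its five instances land in the first predicted class.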

Related Pages

Implemented By

Implementation:DistrictDataLabs_Yellowbrick_ClassPredictionError_Visualizer
