Principle:DistrictDataLabs Yellowbrick Confusion Matrix Visualization

From Leeroopedia


Knowledge Sources
Domains Machine_Learning, Classification, Model_Evaluation
Last Updated 2026-02-08 00:00 GMT

Overview

Confusion matrix visualization is the practice of rendering the cross-tabulation of true versus predicted class labels as a color-coded heatmap to expose classification errors and their distribution across classes.

Description

A confusion matrix is a square matrix of size |C|×|C| where C is the set of classes. Each cell (i,j) records the count of instances whose true class is i and whose predicted class is j. The diagonal entries represent correct predictions (true positives for each class), while off-diagonal entries represent misclassifications. Visualizing this matrix as a heatmap, with color intensity proportional to cell values, transforms a dense numeric table into an immediately interpretable diagnostic tool.
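The construction described above can be sketched in a few lines; this is a minimal illustration assuming integer-encoded labels, not a specific library's API:

```python
import numpy as np

def build_confusion_matrix(y_true, y_pred, n_classes):
    """Cell (i, j) counts instances whose true class is i and predicted class is j."""
    M = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        M[t, p] += 1
    return M

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
M = build_confusion_matrix(y_true, y_pred, n_classes=3)
# Diagonal entries are correct predictions; off-diagonal entries are misclassifications.
```

Rendering `M` with a heatmap function (color intensity proportional to cell value) then yields the visualization this page describes.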

Confusion matrix visualization solves the problem of quickly identifying systematic misclassification patterns. For example, if two classes are frequently confused with each other, the corresponding off-diagonal cells will be prominently colored. The visualization can display either raw counts or percentages of true class totals, where percentage mode normalizes each row by the total number of instances belonging to that true class, making it easier to compare error rates across classes of different sizes.
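Percentage mode corresponds to dividing each row by its total; a minimal numpy sketch with illustrative values:

```python
import numpy as np

M = np.array([[8, 2, 0],
              [1, 6, 3],
              [0, 4, 6]])

# Normalize each row by its sum so cell (i, j) becomes the fraction of
# true-class-i instances that were predicted as class j.
row_totals = M.sum(axis=1, keepdims=True)
M_percent = M / row_totals
# Every row of M_percent sums to 1, so per-class error rates are
# directly comparable even when class sizes differ.
```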

This technique is a fundamental part of the classification evaluation workflow, typically applied after model training and prediction on a test set. It provides complementary information to aggregate metrics like accuracy, precision, and recall by revealing the pairwise structure of errors between classes.

Usage

Use confusion matrix visualization whenever you evaluate a classifier and need to understand not just whether errors occur, but which specific classes are being confused. It is valuable for both binary and multiclass problems, and is especially informative when classes have similar characteristics that may cause systematic misclassification.

Theoretical Basis

Given a set of true labels $\mathbf{y}$ and predicted labels $\hat{\mathbf{y}}$, the confusion matrix $M$ is defined as:

$$M_{i,j} = \left|\{\,k : y_k = c_i \wedge \hat{y}_k = c_j\,\}\right|$$

where $c_i, c_j \in C$ are class labels. The diagonal elements $M_{i,i}$ represent correct classifications for class $c_i$.

When displaying as percentages of true class, each row is normalized by the row sum:

$$M^{\%}_{i,j} = \frac{M_{i,j}}{\sum_{j'} M_{i,j'}}$$

The global accuracy can be derived from the confusion matrix as:

$$\text{Accuracy} = \frac{\sum_i M_{i,i}}{\sum_i \sum_j M_{i,j}} = \frac{\operatorname{trace}(M)}{\|M\|_1}$$
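As a quick numerical check of this identity (numpy, illustrative values):

```python
import numpy as np

M = np.array([[8, 2, 0],
              [1, 6, 3],
              [0, 4, 6]])

# Accuracy is the sum of the diagonal (correct predictions)
# over the sum of all entries (total predictions).
accuracy = np.trace(M) / M.sum()
```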

Individual class metrics can also be extracted. For class $c_i$:

  • True Positives: $TP_i = M_{i,i}$
  • False Positives: $FP_i = \sum_{j \neq i} M_{j,i}$
  • False Negatives: $FN_i = \sum_{j \neq i} M_{i,j}$
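These per-class counts fall out of simple row and column sums; a sketch under the same conventions (rows are true classes, columns are predicted classes):

```python
import numpy as np

M = np.array([[8, 2, 0],
              [1, 6, 3],
              [0, 4, 6]])

tp = np.diag(M)           # TP_i = M[i, i]
fp = M.sum(axis=0) - tp   # column sum minus diagonal: predicted as i, true class != i
fn = M.sum(axis=1) - tp   # row sum minus diagonal: true class i, predicted != i

# Per-class precision and recall follow directly from these counts.
precision = tp / (tp + fp)
recall = tp / (tp + fn)
```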

Related Pages

Implemented By
