Implementation:DistrictDataLabs Yellowbrick ConfusionMatrix Visualizer
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Classification, Visualization |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool for rendering a confusion matrix as a color-coded heatmap of true versus predicted class labels, provided by the Yellowbrick library.
Description
The ConfusionMatrix class is a classification score visualizer that computes the scikit-learn confusion matrix and renders it as a heatmap. Each cell shows the count (or percentage) of instances for a given true-class/predicted-class pair. The diagonal is highlighted to distinguish correct predictions from misclassifications. The visualizer supports displaying raw counts or percentages of true class totals, configurable colormaps (defaulting to "YlOrRd"), sample weighting, and custom font sizes. When percentage mode is enabled, cells with 100% accuracy are highlighted in a distinct green color.
The companion quick method confusion_matrix() provides a one-call interface that instantiates, fits, scores, and renders the visualizer.
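The difference between raw counts and percentage mode can be illustrated with a small NumPy sketch. This is not Yellowbrick's code: the helper names `count_matrix` and `to_row_percent` are hypothetical, and the sketch only assumes that percentages are taken over each true class (row) total, as described above.

```python
import numpy as np

def count_matrix(y_true, y_pred, n_classes):
    """Count matrix: rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def to_row_percent(cm):
    """Express each cell as a rounded percentage of its true-class (row) total."""
    row_totals = cm.sum(axis=1, keepdims=True)
    # Rows with no true instances stay at 0 instead of dividing by zero.
    pct = np.divide(
        cm * 100.0,
        row_totals,
        out=np.zeros(cm.shape, dtype=float),
        where=row_totals != 0,
    )
    return np.round(pct)

# Toy integer-encoded labels for three classes (illustrative data only).
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 0, 1]

cm = count_matrix(y_true, y_pred, n_classes=3)
pct = to_row_percent(cm)
print(cm)   # raw counts per (true, predicted) pair
print(pct)  # each row sums to roughly 100 when the class has instances
```

In this toy data the second class is always predicted correctly, so its diagonal cell reaches 100%, which is the kind of cell that percentage mode highlights.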
Usage
Use ConfusionMatrix after training a classifier to diagnose which classes are being confused. Import it when building model evaluation pipelines, comparing classifier performance across class boundaries, or preparing visual reports.
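Diagnosing which classes are being confused amounts to looking for the largest off-diagonal cells. The sketch below shows one way to do that numerically on a count matrix such as the one stored in `confusion_matrix_`. The function name `most_confused`, the matrix values, and the labels are all hypothetical and serve only as an illustration.

```python
import numpy as np

def most_confused(cm, labels, top=3):
    """Return (true_label, predicted_label, count) for the largest off-diagonal cells."""
    off_diag = np.array(cm, dtype=int)
    np.fill_diagonal(off_diag, 0)  # ignore correct predictions
    order = np.argsort(off_diag, axis=None)[::-1]  # flat indices, largest first
    pairs = []
    for flat in order[:top]:
        i, j = np.unravel_index(flat, off_diag.shape)
        if off_diag[i, j] == 0:
            break  # no further misclassifications to report
        pairs.append((labels[i], labels[j], int(off_diag[i, j])))
    return pairs

# Illustrative matrix: rows are true classes, columns are predicted classes.
cm = [
    [50, 2, 0],
    [4, 40, 9],
    [1, 3, 45],
]
labels = ["cat", "dog", "fox"]
print(most_confused(cm, labels, top=2))
```

Here the most frequent mistake is a true "dog" predicted as "fox", followed by a true "dog" predicted as "cat".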
Code Reference
Source Location
- Repository: yellowbrick
- File: yellowbrick/classifier/confusion_matrix.py
- Class Lines: L136-224 (ConfusionMatrix class)
- Quick Method Lines: L341-488 (confusion_matrix function)
Signature
class ConfusionMatrix(ClassificationScoreVisualizer):
    def __init__(
        self,
        estimator,
        ax=None,
        sample_weight=None,
        percent=False,
        classes=None,
        encoder=None,
        cmap="YlOrRd",
        fontsize=None,
        is_fitted="auto",
        force_model=False,
        **kwargs
    )
    def score(self, X, y)
def confusion_matrix(
    estimator,
    X_train,
    y_train,
    X_test=None,
    y_test=None,
    ax=None,
    sample_weight=None,
    percent=False,
    classes=None,
    encoder=None,
    cmap="YlOrRd",
    fontsize=None,
    is_fitted="auto",
    force_model=False,
    show=True,
    **kwargs
)
Import
from yellowbrick.classifier import ConfusionMatrix
from yellowbrick.classifier.confusion_matrix import confusion_matrix
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| estimator | sklearn classifier | Yes | A scikit-learn classifier to evaluate |
| ax | matplotlib Axes | No | Axes object on which to draw the heatmap; uses current axes if not provided |
| sample_weight | array-like of shape [n_samples] | No | Weights applied to each sample when computing the confusion matrix |
| percent | bool | No | If True, display each cell as a percentage of its true class total; defaults to False (raw counts) |
| classes | list of str | No | Human-readable class labels for both axes of the matrix |
| encoder | dict or LabelEncoder | No | Mapping from target values to human-readable labels |
| cmap | str | No | Matplotlib colormap name; defaults to "YlOrRd" |
| fontsize | int or None | No | Font size for cell text and axis labels |
| is_fitted | bool or str | No | Whether the estimator is already fitted; defaults to "auto" |
| force_model | bool | No | If True, skip the classifier type check on the estimator |
Outputs
| Name | Type | Description |
|---|---|---|
| score_ | float | Global accuracy score from the underlying estimator |
| confusion_matrix_ | ndarray | The computed confusion matrix of shape (n_classes, n_classes) |
| class_counts_ | ndarray | Array of true instance counts per class |
| ax | matplotlib Axes | The axes with the rendered confusion matrix heatmap |
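The outputs are related to one another in a simple way when no sample weights are used and every class appears in the matrix: the row sums of the count matrix are the true instance counts per class, and the diagonal total divided by the grand total is the overall accuracy. The sketch below illustrates that relationship with plain NumPy on made-up numbers; it does not reproduce how Yellowbrick itself derives `score_` or `class_counts_`.

```python
import numpy as np

# Illustrative count matrix: rows are true classes, columns are predicted classes.
cm = np.array([
    [3, 1, 0],
    [0, 2, 0],
    [1, 1, 2],
])

class_counts = cm.sum(axis=1)        # true instances per class (row totals)
accuracy = np.trace(cm) / cm.sum()   # correct predictions over all predictions
recall = np.diag(cm) / class_counts  # per-class share of correct predictions

print(class_counts)
print(accuracy)
print(recall)
```

The per-class recall values are the diagonal entries that percentage mode displays, expressed here as fractions rather than percentages.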
Usage Examples
Basic Usage
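from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from yellowbrick.classifier import ConfusionMatrix
from yellowbrick.datasets import load_occupancy
# Load the example dataset and hold out a test split
X, y = load_occupancy()
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
# Wrap the classifier; percent=True shows percentages of each true class
viz = ConfusionMatrix(LogisticRegression(), percent=True)
viz.fit(X_train, y_train)     # train the underlying estimator
viz.score(X_test, y_test)     # compute the confusion matrix on the test split
viz.show()                    # render the heatmap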
Quick Method
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from yellowbrick.classifier.confusion_matrix import confusion_matrix
from yellowbrick.datasets import load_occupancy
X, y = load_occupancy()
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
# Instantiate, fit, score, and render in a single call
confusion_matrix(LogisticRegression(), X_train, y_train, X_test, y_test, percent=True)