Implementation: DistrictDataLabs Yellowbrick ResidualsPlot Visualizer
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Regression, Visualization |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool for visualizing regression residuals provided by the Yellowbrick library.
Description
The ResidualsPlot visualizer plots the residuals (actual value minus predicted value, y − ŷ) on the vertical axis against the predicted values on the horizontal axis for a fitted regression model. It supports separate coloring and opacity for training and test data splits, allowing direct visual comparison of residual behavior. A horizontal zero line is drawn to serve as the baseline for residual evaluation.
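The quantity being plotted is easy to compute directly. Below is a minimal NumPy sketch of the residual calculation, using synthetic data and `np.polyfit` as a stand-in for any fitted regressor (the variable names are illustrative, not Yellowbrick internals):

```python
import numpy as np

# Synthetic regression data: y depends linearly on x plus noise
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 1.5 + rng.normal(0, 1.0, size=200)

# Fit a line by least squares (stand-in for any fitted regressor)
slope, intercept = np.polyfit(x, y, deg=1)
y_pred = slope * x + intercept

# Residuals as the plot defines them: actual minus predicted
residuals = y - y_pred

# The horizontal zero line is the baseline: for ordinary least
# squares with an intercept, residuals are centred on it by
# construction, so their mean is numerically zero
print(abs(residuals.mean()) < 1e-8)
```

A well-behaved model scatters these residuals randomly around the zero line; structure in the scatter is what the visualizer is designed to expose.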
Optionally, a histogram of the residuals can be appended to the right side of the scatter plot to inspect the distribution of errors. This histogram can display either raw frequency counts or a probability density estimate. As an alternative to the histogram, a Q-Q (quantile-quantile) plot can be shown instead, comparing the residual quantiles against a standard normal distribution. The histogram and the Q-Q plot are mutually exclusive; only one may be enabled at a time.
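The comparison behind the Q-Q option can be sketched with only the Python standard library: sort the standardized residuals and pair each with the corresponding standard-normal quantile. This is an illustration of the idea, not Yellowbrick's internal code:

```python
import random
import statistics

random.seed(0)
# Hypothetical residuals; standardise to zero mean, unit variance
resid = [random.gauss(0, 1) for _ in range(500)]
mu = statistics.fmean(resid)
sigma = statistics.pstdev(resid)
standardised = sorted((r - mu) / sigma for r in resid)

# Theoretical standard-normal quantiles at plotting positions
n = len(standardised)
norm = statistics.NormalDist()
theoretical = [norm.inv_cdf((i + 0.5) / n) for i in range(n)]

# For normally distributed residuals, the (theoretical, sample)
# pairs fall close to the 45-degree line y = x
pairs = list(zip(theoretical, standardised))
print(len(pairs))
```

If the paired points bow away from the diagonal, the residuals have heavier or lighter tails than a normal distribution, which the Q-Q panel makes visible at a glance.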
The visualizer wraps a Scikit-Learn regressor and extends RegressionScoreVisualizer. Its primary entry points are the fit() method (which fits the estimator and draws training residuals) and the score() method (which generates predictions on test data and draws test residuals). Both train and test scores are displayed in the legend.
Usage
Use ResidualsPlot when you need to:
- Visually diagnose whether a linear regression model is appropriate for the data
- Check for heteroscedasticity or non-random patterns in the residuals
- Compare training versus test residual distributions
- Inspect the normality of residuals via a histogram or Q-Q plot
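As a rough numeric counterpart to the visual heteroscedasticity check, one can compare residual spread across the prediction range. The sketch below uses synthetic, deliberately heteroscedastic data; the split-in-half variance comparison is an illustration of what the funnel shape in a residuals plot means, not Yellowbrick's method:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, size=1000)
# Noise scale grows with x: deliberately heteroscedastic data
y = 2.0 * x + rng.normal(0, 0.2 + 0.3 * x)

slope, intercept = np.polyfit(x, y, deg=1)
y_pred = slope * x + intercept
residuals = y - y_pred

# Compare residual spread for low versus high predicted values;
# a large ratio is the numeric signature of the funnel shape
# seen in a residuals plot
order = np.argsort(y_pred)
low, high = np.array_split(residuals[order], 2)
ratio = high.std() / low.std()
print(ratio > 1.5)
```

For homoscedastic residuals this ratio stays near 1; here the noise scale triples across the range, so the high-prediction half is visibly wider.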
Code Reference
Source Location
- Repository: yellowbrick
- File: yellowbrick/regressor/residuals.py
  - Class: lines 47-401
  - Quick Method: lines 408-556
Signature
class ResidualsPlot(RegressionScoreVisualizer):
def __init__(
self,
estimator,
ax=None,
hist=True,
qqplot=False,
train_color="b",
test_color="g",
line_color=LINE_COLOR,
train_alpha=0.75,
test_alpha=0.75,
is_fitted="auto",
**kwargs
)
Import
from yellowbrick.regressor import ResidualsPlot
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| estimator | Scikit-Learn regressor | Yes | A regression estimator instance to wrap. Must be a regressor or a YellowbrickTypeError is raised. |
| ax | matplotlib Axes | No | The axes to plot on. If None, the current axes are used or created. |
| hist | bool, str, or None | No | Controls the residuals histogram. True or 'frequency' shows frequency counts; 'density' shows probability density; False or None disables it. Default: True. Requires Matplotlib >= 2.0.2. |
| qqplot | bool | No | If True, draws a Q-Q plot instead of a histogram. Cannot be True while hist is enabled. Default: False. |
| train_color | color | No | Color for training data residual points. Default: 'b' (blue). |
| test_color | color | No | Color for test data residual points. Default: 'g' (green). |
| line_color | color | No | Color for the zero error line. Default: dark grey. |
| train_alpha | float | No | Transparency for training data points (0 = transparent, 1 = opaque). Default: 0.75. |
| test_alpha | float | No | Transparency for test data points (0 = transparent, 1 = opaque). Default: 0.75. |
| is_fitted | bool or str | No | Whether the estimator is already fitted. False means it will be fit during visualizer.fit(); 'auto' checks automatically. Default: 'auto'. |
Outputs
| Name | Type | Description |
|---|---|---|
| train_score_ | float | The score on the training data. |
| test_score_ | float | The score on the test data. |
| ax | matplotlib Axes | The axes containing the residuals scatter plot with optional histogram or Q-Q plot. |
Usage Examples
Basic Usage
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from yellowbrick.regressor import ResidualsPlot
from yellowbrick.datasets import load_concrete
# Load dataset
X, y = load_concrete()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit the visualizer
viz = ResidualsPlot(Ridge())
viz.fit(X_train, y_train)
viz.score(X_test, y_test)
viz.show()
Quick Method
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from yellowbrick.regressor import residuals_plot
from yellowbrick.datasets import load_concrete
X, y = load_concrete()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
viz = residuals_plot(Ridge(), X_train, y_train, X_test, y_test)