Principle:DistrictDataLabs Yellowbrick Bestfit Regression Overlay
| Knowledge Sources | |
|---|---|
| Domains | Regression, Visualization |
| Last Updated | 2026-02-08 05:00 GMT |
Overview
A technique for overlaying regression trend lines and identity reference lines on scatter plots to aid visual assessment of model fit and of the relationship between variables.
Description
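Best-fit overlays add a fitted regression curve (linear or quadratic) to a scatter plot so that the relationship between two variables can be assessed at a glance. The identity line (y = x) serves as a reference in prediction error plots, where actual values are plotted against predicted values: a perfect prediction falls exactly on this line, and a point's vertical distance from it is the size of the prediction error. The choice between the linear and quadratic fit can be automated by fitting both and keeping the one with the lower mean squared error (MSE).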
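The identity-line reference can be sketched as follows. This is a minimal illustration, not Yellowbrick's own code: the helper names `identity_endpoints` and `plot_identity_line` are hypothetical, and `ax` is assumed to be a matplotlib Axes that already holds the scatter of actual versus predicted values.

```python
def identity_endpoints(xlim, ylim):
    """Return (lo, hi) so that the segment (lo, lo)-(hi, hi) lies inside both axis ranges."""
    lo = max(min(xlim), min(ylim))  # largest of the two lower bounds
    hi = min(max(xlim), max(ylim))  # smallest of the two upper bounds
    return lo, hi


def plot_identity_line(ax, **line_kwargs):
    """Draw y = x on `ax` (assumed: a matplotlib Axes) across the visible region."""
    lo, hi = identity_endpoints(ax.get_xlim(), ax.get_ylim())
    style = {"color": "gray", "linestyle": "--", "label": "identity"}
    style.update(line_kwargs)  # caller-supplied styling wins over the defaults
    # scalex/scaley=False keeps the reference line from changing the axis limits.
    return ax.plot([lo, hi], [lo, hi], scalex=False, scaley=False, **style)


# Hypothetical axis limits taken from a prediction error plot.
print(identity_endpoints((0.0, 10.0), (2.0, 12.0)))
```

Keeping the endpoint computation separate from the drawing call makes it easy to recompute the line if the axis limits change later, for example after more points are added to the plot.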
Usage
Use this principle when building visualizations that need to show trend lines or reference lines, such as residual plots and prediction error plots.
Theoretical Basis
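Linear: ŷ = β₀ + β₁x, fit by ordinary least squares (OLS)
Quadratic: ŷ = β₀ + β₁x + β₂x², fit by OLS on the polynomial features (x, x²)
Model Selection: Choose the model with the lowest mean squared error, MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)², where n is the number of points and ŷᵢ is the fitted value for point i.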
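The three steps above can be sketched with NumPy. This is an illustration under stated assumptions rather than the library's implementation: `fit_polynomial` and `select_best_fit` are hypothetical names, `numpy.polyfit` stands in for the OLS solver, and the MSE is computed on the same points used for fitting.

```python
import numpy as np


def fit_polynomial(x, y, degree):
    """Fit y ~ polynomial(x) by least squares; return (coefficients, MSE)."""
    coefs = np.polyfit(x, y, deg=degree)  # coefficients, highest power first
    residuals = y - np.polyval(coefs, x)
    return coefs, float(np.mean(residuals ** 2))


def select_best_fit(x, y):
    """Fit the linear and quadratic candidates and keep the one with the lowest MSE."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    candidates = {"linear": 1, "quadratic": 2}
    results = {name: fit_polynomial(x, y, deg) for name, deg in candidates.items()}
    best = min(results, key=lambda name: results[name][1])
    coefs, mse = results[best]
    return best, coefs, mse


# Hypothetical, noise-free quadratic data used only for demonstration.
x = np.linspace(-3.0, 3.0, 25)
y = 0.5 * x ** 2 - x + 2.0
name, coefs, mse = select_best_fit(x, y)
print(name, np.round(coefs, 3), mse)
```

Because the linear model is a special case of the quadratic one, the quadratic fit's in-sample MSE is never meaningfully higher than the linear fit's. Comparing MSE on held-out points is one way to make the selection less biased toward the more flexible curve.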