Principle:Scikit learn Scikit learn Linear Regression

From Leeroopedia


Knowledge Sources
Domains Supervised Learning, Regression
Last Updated 2026-02-08 15:00 GMT

Overview

Linear regression models the relationship between a dependent variable and one or more independent variables by fitting a linear equation to observed data.

Description

Linear regression is the foundational supervised learning technique for predicting a continuous target as a linear combination of input features. Regularized variants address overfitting and multicollinearity by adding penalty terms to the loss function. Ridge regression applies an $\ell_2$ penalty, Lasso applies an $\ell_1$ penalty (producing sparse solutions), and ElasticNet combines both. These methods form the backbone of predictive modeling and are often the first approach tried before more complex models.

Usage

Use ordinary linear regression when the relationship between features and target is approximately linear and the number of features is moderate relative to the sample size. Use Ridge when features are correlated (multicollinearity) and you want to shrink coefficients without eliminating them. Use Lasso when you suspect many features are irrelevant and want automatic feature selection via sparsity. Use ElasticNet when you need a balance between Ridge and Lasso, particularly when features are correlated and some should be zeroed out. Use LARS (Least Angle Regression) when you want an efficient path algorithm for Lasso-type problems.
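The guidance above can be illustrated with a minimal sketch. The synthetic data, the choice of three informative features, and the penalty strengths (`alpha=1.0`, `alpha=0.1`, `l1_ratio=0.5`) are assumptions made for this example, not recommended settings.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet

# Toy data (assumed for illustration): 10 features, only the first 3 matter.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 10
X = rng.normal(size=(n_samples, n_features))
true_coef = np.zeros(n_features)
true_coef[:3] = [3.0, -2.0, 1.5]
y = X @ true_coef + rng.normal(scale=0.5, size=n_samples)

models = {
    "ols": LinearRegression(),
    "ridge": Ridge(alpha=1.0),                     # shrinks, keeps all features
    "lasso": Lasso(alpha=0.1),                     # can zero out features
    "enet": ElasticNet(alpha=0.1, l1_ratio=0.5),   # mix of both penalties
}

# Count how many coefficients each fitted model leaves non-zero.
nonzero = {}
for name, model in models.items():
    model.fit(X, y)
    nonzero[name] = int(np.sum(model.coef_ != 0))
    print(f"{name}: {nonzero[name]} non-zero coefficients")
```

On data like this, OLS and Ridge typically keep every coefficient non-zero, while Lasso and ElasticNet tend to drop the uninformative ones.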

Theoretical Basis

Ordinary Least Squares (OLS) minimizes the residual sum of squares:

$$\hat{\beta} = \arg\min_{\beta} \|y - X\beta\|_2^2$$

The closed-form solution is $\hat{\beta} = (X^T X)^{-1} X^T y$.
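As a quick check of the closed-form expression, the sketch below solves the normal equations with NumPy and compares the result with scikit-learn's `LinearRegression`. The data is synthetic, and `fit_intercept=False` is set so that the estimator solves the same intercept-free problem as the formula.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data, assumed for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Normal equations: solve (X^T X) beta = X^T y instead of forming the inverse.
beta_closed = np.linalg.solve(X.T @ X, X.T @ y)

# No intercept, so the estimator matches the formula's problem statement.
model = LinearRegression(fit_intercept=False).fit(X, y)

print(np.allclose(beta_closed, model.coef_))
```

Solving the linear system is generally preferred over computing the matrix inverse explicitly, since it is cheaper and numerically more stable.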

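Ridge Regression adds an $\ell_2$ penalty:

$$\hat{\beta}_{\text{ridge}} = \arg\min_{\beta} \|y - X\beta\|_2^2 + \alpha \|\beta\|_2^2$$

The solution is $\hat{\beta}_{\text{ridge}} = (X^T X + \alpha I)^{-1} X^T y$. The regularization parameter $\alpha$ controls the trade-off between fit and coefficient magnitude.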
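The Ridge solution can be checked the same way. This is a minimal sketch on synthetic data with an assumed `alpha = 1.0`; `fit_intercept=False` keeps scikit-learn's `Ridge` aligned with the intercept-free formula.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data, assumed for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)
alpha = 1.0
n_features = X.shape[1]

# Closed form: (X^T X + alpha I) beta = X^T y
beta_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)
# Unpenalized solution, for comparing coefficient magnitude.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

model = Ridge(alpha=alpha, fit_intercept=False).fit(X, y)

print(np.allclose(beta_ridge, model.coef_))
print(np.linalg.norm(beta_ridge) < np.linalg.norm(beta_ols))
```

The second comparison shows the shrinkage effect: for a positive penalty, the Ridge coefficient vector has a smaller norm than the OLS one.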

Lasso Regression adds an $\ell_1$ penalty:

$$\hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \frac{1}{2n} \|y - X\beta\|_2^2 + \alpha \|\beta\|_1$$

Here $n$ is the number of samples. The $\ell_1$ penalty induces sparsity, setting some coefficients exactly to zero, effectively performing feature selection.
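The sparsity effect can be observed by varying the penalty strength. The sketch below uses synthetic data and no intercept. For this objective, every coefficient is zero once the penalty reaches `max|X^T y| / n`, so the penalties are expressed as fractions of that value; the fractions themselves are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data (assumed): 8 features, only the first 3 are informative.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 8
X = rng.normal(size=(n_samples, n_features))
true_coef = np.array([3.0, -2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_coef + rng.normal(scale=0.5, size=n_samples)

# Smallest penalty at which the all-zero vector is optimal (no intercept).
alpha_max = np.max(np.abs(X.T @ y)) / n_samples

# Count non-zero coefficients at several fractions of alpha_max.
n_nonzero = {}
for fraction in [0.01, 0.1, 0.5, 1.01]:
    lasso = Lasso(alpha=fraction * alpha_max, fit_intercept=False).fit(X, y)
    n_nonzero[fraction] = int(np.sum(lasso.coef_ != 0))
    print(f"alpha = {fraction:.2f} * alpha_max: {n_nonzero[fraction]} non-zero")
```

Just above `alpha_max` the model is empty, and features enter as the penalty is relaxed.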

ElasticNet combines both penalties:

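$$\hat{\beta}_{\text{enet}} = \arg\min_{\beta} \frac{1}{2n} \|y - X\beta\|_2^2 + \alpha \rho \|\beta\|_1 + \frac{\alpha (1 - \rho)}{2} \|\beta\|_2^2$$

where $\rho \in [0, 1]$ is the mixing ratio between $\ell_1$ and $\ell_2$.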
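In scikit-learn's `ElasticNet`, the mixing ratio is exposed as the `l1_ratio` parameter. The sketch below writes the objective out as a helper function (`enet_objective` is a hypothetical name introduced here, not a library function) and checks that a mixing ratio of 1 reproduces Lasso. The data and the values `alpha = 0.1`, `rho = 0.5` are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

def enet_objective(beta, X, y, alpha, rho):
    """ElasticNet objective as written above (hypothetical helper)."""
    n = X.shape[0]
    residual = y - X @ beta
    return (
        residual @ residual / (2 * n)
        + alpha * rho * np.abs(beta).sum()
        + 0.5 * alpha * (1 - rho) * (beta @ beta)
    )

# Synthetic data, assumed for illustration only.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 8
X = rng.normal(size=(n_samples, n_features))
true_coef = np.array([3.0, -2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_coef + rng.normal(scale=0.5, size=n_samples)

alpha, rho = 0.1, 0.5
enet = ElasticNet(alpha=alpha, l1_ratio=rho, fit_intercept=False).fit(X, y)

# With rho = 1 the l2 term vanishes and the problem reduces to Lasso.
lasso = Lasso(alpha=alpha, fit_intercept=False).fit(X, y)
enet_as_lasso = ElasticNet(alpha=alpha, l1_ratio=1.0, fit_intercept=False).fit(X, y)

obj_fit = enet_objective(enet.coef_, X, y, alpha, rho)
obj_zero = enet_objective(np.zeros(n_features), X, y, alpha, rho)
print(np.allclose(enet_as_lasso.coef_, lasso.coef_))
print(obj_fit < obj_zero)
```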

LARS (Least Angle Regression) is an efficient algorithm that computes the full regularization path for Lasso. It proceeds by identifying the feature most correlated with the current residual, then moving the coefficient in the direction of that feature until another feature becomes equally correlated, at which point both are adjusted simultaneously.
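The first step of this procedure can be observed with scikit-learn's `lars_path` function. This is a minimal sketch on synthetic data; `method="lasso"` requests the Lasso variant of the path. The check relies on the fact that the first feature to enter is the one with the largest absolute correlation with the target, since the initial residual is the target itself.

```python
import numpy as np
from sklearn.linear_model import lars_path

# Synthetic data (assumed): 8 features, only the first 3 are informative.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 8
X = rng.normal(size=(n_samples, n_features))
true_coef = np.array([3.0, -2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_coef + rng.normal(scale=0.5, size=n_samples)

# coefs has one column per breakpoint of the path, starting from all zeros.
alphas, active, coefs = lars_path(X, y, method="lasso")

# Feature most correlated with the initial residual (which is y itself).
first_feature = int(np.argmax(np.abs(X.T @ y)))
# Features that are non-zero after the first step of the path.
entered_first = np.flatnonzero(coefs[:, 1])

print("path breakpoints:", len(alphas))
print("first feature to enter:", entered_first, "expected:", first_feature)
```

Each later column of `coefs` corresponds to a point where the active set changes, so the whole path is described by a small number of breakpoints.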
