Implementation:Scikit learn Scikit learn HuberRegressor
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Robust Regression |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
A scikit-learn estimator for L2-regularized linear regression that uses the Huber loss function to stay robust to outliers.
Description
HuberRegressor optimizes the squared loss for samples where |(y - Xw - c) / sigma| < epsilon and the absolute loss for samples where |(y - Xw - c) / sigma| > epsilon. The parameters being optimized are the coefficients w, the intercept c, and the scale sigma. The epsilon parameter sets the threshold between inlier (squared loss) and outlier (absolute loss) treatment: outliers still contribute to the fit, but their influence grows only linearly with the residual instead of quadratically. The optimization is performed with SciPy's L-BFGS-B solver.
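The piecewise loss described above can be sketched in NumPy. This is an illustrative sketch, not the library's internal code: the helper names `huber_loss` and `huber_objective` are hypothetical, and the objective follows the form given in the scikit-learn user guide, where the per-sample term is sigma + H(z) * sigma and the penalty is alpha times the squared L2 norm of w.

```python
import numpy as np


def huber_loss(z, epsilon=1.35):
    """Huber function H(z): quadratic for |z| < epsilon, linear otherwise."""
    z = np.asarray(z, dtype=float)
    abs_z = np.abs(z)
    # The linear branch 2*eps*|z| - eps**2 meets z**2 continuously at |z| = eps.
    return np.where(abs_z < epsilon, z**2, 2.0 * epsilon * abs_z - epsilon**2)


def huber_objective(w, c, sigma, X, y, epsilon=1.35, alpha=1e-4):
    """Objective value for coefficients w, intercept c, and scale sigma (sketch)."""
    z = (X @ w + c - y) / sigma  # residuals rescaled by sigma
    return np.sum(sigma + huber_loss(z, epsilon) * sigma) + alpha * np.dot(w, w)


# A small residual is squared; a large one is penalized linearly.
print(huber_loss([0.5, 2.0]))
```

Because the linear branch grows more slowly than the quadratic one, a single extreme residual cannot dominate the objective the way it does under ordinary least squares.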
Usage
Use HuberRegressor when your dataset contains outliers that would disproportionately affect ordinary least squares regression. It is particularly useful when you want a compromise between fully robust methods (like RANSAC) and standard least squares, maintaining sensitivity to most data while limiting the influence of outliers.
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/linear_model/_huber.py
Signature
class HuberRegressor(LinearModel, RegressorMixin, BaseEstimator):
    def __init__(
        self,
        *,
        epsilon=1.35,
        max_iter=100,
        alpha=0.0001,
        warm_start=False,
        fit_intercept=True,
        tol=1e-05,
    ):
Import
from sklearn.linear_model import HuberRegressor
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| epsilon | float | No | Controls outlier classification threshold; must be >= 1 (default=1.35) |
| max_iter | int | No | Maximum number of L-BFGS-B iterations (default=100) |
| alpha | float | No | L2 regularization strength (default=0.0001) |
| warm_start | bool | No | Reuse previous solution as initialization (default=False) |
| fit_intercept | bool | No | Whether to fit the intercept (default=True) |
| tol | float | No | Convergence tolerance for the projected gradient (default=1e-05) |
Outputs
| Name | Type | Description |
|---|---|---|
| coef_ | ndarray of shape (n_features,) | Feature coefficients |
| intercept_ | float | Bias term in the linear model |
| scale_ | float | Estimated scale parameter (sigma) of the Huber function |
| outliers_ | ndarray of shape (n_samples,) | Boolean mask indicating samples classified as outliers |
| n_iter_ | int | Number of iterations performed by the optimizer |
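To show how the fitted attributes relate to each other, the sketch below rebuilds the outlier mask by hand. It assumes the mask comes from thresholding absolute residuals at epsilon * scale_, which is the inlier/outlier rule stated in the Description; the synthetic data and the corrupted targets are made up for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import HuberRegressor

# Hypothetical data: a noisy linear problem with three corrupted targets.
X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=0)
y[:3] += 500

model = HuberRegressor(epsilon=1.35).fit(X, y)

# Absolute residuals of the fitted linear model.
residual = np.abs(y - X @ model.coef_ - model.intercept_)

# Assumed rule: a sample is an outlier when |residual| / scale_ > epsilon.
mask = residual > model.scale_ * model.epsilon

print("scale_:", model.scale_)
print("iterations:", model.n_iter_)
print("mask matches outliers_:", np.array_equal(mask, model.outliers_))
print("samples flagged:", int(mask.sum()))
```

Note that the mask can flag more samples than the ones deliberately corrupted, since any sample whose scaled residual exceeds epsilon falls on the linear branch of the loss.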
Usage Examples
Basic Usage
from sklearn.linear_model import HuberRegressor
from sklearn.datasets import make_regression
# Synthetic regression problem with Gaussian noise
X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=42)
# Corrupt two targets so that they act as outliers
y[0] = 1000
y[1] = -1000
model = HuberRegressor(epsilon=1.35)
model.fit(X, y)
print("Number of outliers:", model.outliers_.sum())
print("Coefficients:", model.coef_)
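As a further illustration of the use case described under Usage, the following sketch compares HuberRegressor with ordinary least squares on contaminated data. The data-generating line (slope 3, intercept 1), the noise level, and the size of the outlier shift are all assumptions chosen for the example, not values from scikit-learn.

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(0)

# Hypothetical ground truth: y = 3x + 1 with mild Gaussian noise.
X = rng.uniform(0, 10, size=(200, 1))
y_clean = 3.0 * X[:, 0] + 1.0
y = y_clean + rng.normal(0.0, 0.5, size=200)

# Shift 5% of the targets far above the line to act as outliers.
y[:10] += 100.0

huber = HuberRegressor().fit(X, y)
ols = LinearRegression().fit(X, y)

# Mean absolute deviation of each fitted line from the noise-free line.
huber_err = np.mean(np.abs(huber.predict(X) - y_clean))
ols_err = np.mean(np.abs(ols.predict(X) - y_clean))

print("Huber error vs. true line:", huber_err)
print("OLS error vs. true line:", ols_err)
```

The least-squares fit has to absorb the full shift of the corrupted targets into its mean prediction, while the Huber fit treats them with the linear branch of the loss and stays close to the underlying line.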