
Principle:Scikit learn Scikit learn Isotonic Regression

From Leeroopedia


Knowledge Sources
Domains Supervised Learning, Non-Parametric Methods
Last Updated 2026-02-08 15:00 GMT

Overview

Isotonic regression fits a non-decreasing (or non-increasing) piecewise constant function to data, providing a non-parametric regression method with a monotonicity constraint.

Description

Isotonic regression finds the best-fitting monotonic function to observed data without assuming a specific parametric form. It solves the problem of fitting a regression function when the only assumption is that the relationship between the predictor and the response is monotonic. The resulting fit is a step function that minimizes the sum of squared errors subject to the ordering constraint. Isotonic regression is widely used in probability calibration (transforming classifier outputs into calibrated probabilities), dose-response modeling, and any setting where domain knowledge dictates a monotonic relationship.

Usage

Use isotonic regression when the relationship between the predictor and the response is known or expected to be monotonic but the exact functional form is unknown. It is commonly used as a calibration method for classifiers, where predicted scores should have a monotonic relationship with true probabilities. It is also useful in medical dose-response studies, pricing models, and quality control. Note that isotonic regression is limited to univariate problems (one predictor variable) and can overfit when the dataset is small, as it has high flexibility with no smoothness constraint.
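As a minimal sketch of the usage described above, the following fits scikit-learn's `IsotonicRegression` to a small toy dataset. The data values are made up for illustration; only the class and its `increasing` parameter come from scikit-learn.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Toy data (made up for illustration): a noisy but broadly increasing trend.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.0, 3.0, 2.0, 4.0, 3.5, 6.0])

# increasing=True requests a non-decreasing fit.
iso = IsotonicRegression(increasing=True)
y_fit = iso.fit_transform(x, y)

# Out-of-order neighbours (3, 2) and (4, 3.5) are pooled to their means.
print(y_fit)
```

The fitted values never decrease, and every run of out-of-order responses collapses to a single plateau.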

Theoretical Basis

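Problem Formulation: Given data pairs $(x_1, y_1), \ldots, (x_n, y_n)$ with $x_1 \le x_2 \le \cdots \le x_n$, isotonic regression solves:

$$\hat{y} = \arg\min_{z} \sum_{i=1}^{n} w_i (y_i - z_i)^2 \quad \text{s.t.} \quad z_1 \le z_2 \le \cdots \le z_n$$

where $w_i$ are optional sample weights.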

Pool Adjacent Violators (PAV) Algorithm:

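  1. Start with $z_i = y_i$, treating each point as its own block.
  2. Scan from left to right. If $z_i > z_{i+1}$ (violation of monotonicity):
    1. Merge the two adjacent blocks, replacing every value in the merged block with the weighted average of its points.
    2. Check backward: if the merged block is now lower than the preceding block, merge those as well, and repeat as needed.
  3. Continue until no violations remain.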

The PAV algorithm runs in O(n) time on data already sorted by x (sorting unsorted data first adds O(n log n)) and produces the exact global optimum.
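The PAV steps can be sketched in plain Python as follows. This is an illustrative implementation, not scikit-learn's own code; the function name `pav` and the block layout are assumptions made for this example, and the input is assumed to be already ordered by the predictor.

```python
def pav(y, w=None):
    """Weighted pool-adjacent-violators for a non-decreasing fit.

    Assumes y is already ordered by its predictor values.
    """
    n = len(y)
    if w is None:
        w = [1.0] * n
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backward while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand each block back to one fitted value per input point.
    z = []
    for mean, _, count in blocks:
        z.extend([mean] * count)
    return z


print(pav([1, 3, 2, 4, 3.5, 6]))  # [1.0, 2.5, 2.5, 3.75, 3.75, 6.0]
```

Each point is appended once and every merge removes one block, so the total work is linear in the number of points, which is where the O(n) bound comes from.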

Properties:

  • The solution is a piecewise constant (step) function.
  • It is the projection of the data onto the cone of monotone sequences in the weighted least-squares sense.
  • The number of steps (plateaus) in the solution is at most n and depends on the data.

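For prediction at new points: Linear interpolation (or the step function value) is used between the fitted values at training points. Extrapolation beyond the training range uses the boundary values when clipping is enabled (out_of_bounds="clip" in scikit-learn's IsotonicRegression); the default there, out_of_bounds="nan", returns NaN for such points.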
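A short sketch of prediction at unseen points, assuming scikit-learn's `IsotonicRegression` and made-up toy data. It contrasts `out_of_bounds="clip"` with the default `out_of_bounds="nan"`.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Toy data (made up); the isotonic fit at x is 1.0, 2.5, 2.5, 4.0.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 4.0])

x_new = np.array([0.0, 1.5, 3.5, 10.0])

# "clip" maps queries outside [1, 4] to the nearest boundary fit.
clipped = IsotonicRegression(out_of_bounds="clip").fit(x, y)
pred_clip = clipped.predict(x_new)
print(pred_clip)

# The default ("nan") returns NaN for out-of-range queries instead.
default = IsotonicRegression().fit(x, y)
pred_nan = default.predict(x_new)
print(pred_nan)
```

Inside the training range both models interpolate linearly between neighbouring fitted values; they differ only at the two out-of-range queries.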

Monotonicity direction: The constraint can be non-decreasing ($z_i \le z_{i+1}$) or non-increasing ($z_i \ge z_{i+1}$), depending on the known direction of the relationship.
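In scikit-learn the direction is selected with the `increasing` parameter of `IsotonicRegression`. The sketch below fits a non-increasing function to made-up data.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Toy data (made up): a broadly decreasing trend with one violation (3 -> 4).
x = np.arange(5, dtype=float)
y = np.array([5.0, 3.0, 4.0, 2.0, 1.0])

# increasing=False requests a non-increasing fit.
iso_dec = IsotonicRegression(increasing=False)
y_dec = iso_dec.fit_transform(x, y)

# The violating pair (3, 4) is pooled to its mean, 3.5.
print(y_dec)
```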

Application to calibration: When used for probability calibration, the predicted scores f(x) serve as the predictor and the true binary labels y serve as the response. The fitted isotonic function maps raw scores to calibrated probabilities while preserving the ranking up to ties: scores that fall in the same plateau receive the same calibrated probability.
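The calibration setup can be sketched by fitting `IsotonicRegression` directly on scores and labels. The scores and labels below are made up for illustration; in practice the calibrator would be fitted on held-out data rather than on the classifier's training set.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Hypothetical raw classifier scores and true binary labels (made up).
scores = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])
labels = np.array([0, 0, 1, 0, 1, 0, 1, 1])

# Bound outputs to [0, 1] and clip scores outside the fitted range.
calibrator = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
calibrator.fit(scores, labels)

calibrated = calibrator.predict(scores)
print(calibrated)
```

The mixed-label region in the middle collapses to a single plateau at 0.5, which shows how ties are introduced. scikit-learn also offers `CalibratedClassifierCV(method="isotonic")`, which wraps this step together with cross-validated fitting of the classifier.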
