Heuristic:Scikit learn Scikit learn Convergence Warning Handling
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Debugging |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Diagnostic patterns for handling ConvergenceWarning, FitFailedWarning, and UndefinedMetricWarning during model training and evaluation.
Description
Scikit-learn uses a structured warning system to communicate non-fatal issues during model training and evaluation. The three most important warnings are: ConvergenceWarning (solver did not converge within `max_iter`), FitFailedWarning (a fit call failed during cross-validation), and UndefinedMetricWarning (metric calculation encountered zero division). Each has specific diagnostic actions and solutions. Ignoring these warnings can lead to silently degraded model performance.
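Because all three are ordinary Python warning categories importable from `sklearn.exceptions`, they can be promoted to exceptions with the standard `warnings` filters so they cannot be missed during debugging. A minimal sketch on synthetic, deliberately unscaled data:

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

# Synthetic data whose second feature is ~1e4 times larger than the first,
# which conditions the loss landscape poorly for the lbfgs solver.
rng = np.random.RandomState(0)
X = rng.randn(200, 2) * np.array([1.0, 1e4])
y = (X[:, 0] + X[:, 1] / 1e4 > 0).astype(int)

converged = True
with warnings.catch_warnings():
    # Promote ConvergenceWarning to an exception instead of a console message.
    warnings.simplefilter("error", category=ConvergenceWarning)
    try:
        LogisticRegression(max_iter=5).fit(X, y)  # far too few iterations
    except ConvergenceWarning:
        converged = False
```

The same filter works for FitFailedWarning and UndefinedMetricWarning; in test suites, an `"error"` filter is a common way to make silent degradation fail loudly.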
Usage
Apply this heuristic when LogisticRegression_Fit or BaseSearchCV_Fit emits warnings during execution. It is particularly important during Grid_Search (hyperparameter tuning) where some parameter combinations may not converge, and during Metric_Evaluation where class imbalance can trigger undefined metrics.
The Insight (Rule of Thumb)
- Action (ConvergenceWarning): Increase `max_iter` parameter, scale features with StandardScaler, or try a different solver.
- Action (FitFailedWarning): Set `error_score='raise'` in GridSearchCV to see full traceback instead of silently continuing.
- Action (UndefinedMetricWarning): Check for empty classes in test folds; use `zero_division` parameter to control behavior.
- Value: Default `max_iter=100` for LogisticRegression is often insufficient for unscaled data; typical fix is `max_iter=1000` or feature scaling.
- Trade-off: Increasing `max_iter` costs computation time but ensures convergence.
Reasoning
ConvergenceWarning is the most commonly encountered warning in scikit-learn. It typically indicates that the optimization algorithm ran out of iterations before meeting its convergence tolerance (`tol`). This can happen because: (1) features are not scaled, causing the loss landscape to be poorly conditioned, (2) the regularization parameter `C` is too large, requiring more iterations, or (3) the solver is inappropriate for the problem structure. The recommended first step is always to scale features using StandardScaler, which dramatically reduces the number of iterations needed.
FitFailedWarning in cross-validation contexts means some train/test fold combinations caused errors. By default, GridSearchCV sets `error_score=np.nan` and continues, which can mask important errors.
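A sketch of both behaviors, using a deliberately invalid `solver` value so one grid candidate always fails (synthetic data; the grid values are illustrative):

```python
import warnings

import numpy as np
from sklearn.exceptions import FitFailedWarning
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.RandomState(0)
X, y = rng.randn(60, 4), rng.randint(0, 2, 60)

# "bogus" is not a valid solver, so every fold for that candidate fails.
grid = {"solver": ["lbfgs", "bogus"]}

# Default error_score=np.nan: failed fits become NaN scores, a
# FitFailedWarning is emitted, and the search silently continues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    search = GridSearchCV(LogisticRegression(max_iter=1000), grid, cv=3)
    search.fit(X, y)
n_fit_failed = sum(issubclass(w.category, FitFailedWarning) for w in caught)

# error_score='raise': the underlying exception propagates with its full
# traceback instead of being masked as a NaN score.
try:
    GridSearchCV(
        LogisticRegression(max_iter=1000), grid, cv=3, error_score="raise"
    ).fit(X, y)
    raised = False
except ValueError:  # the invalid solver raises a ValueError inside fit
    raised = True
```

The first search still completes and selects the surviving candidate, which is exactly the silent-continuation behavior that `error_score='raise'` is meant to surface.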
Code Evidence
ConvergenceWarning definition from `sklearn/exceptions.py:68-73`:
class ConvergenceWarning(UserWarning):
    """Custom warning to capture convergence problems.

    .. versionchanged:: 0.18
        Moved from sklearn.utils.
    """
FitFailedWarning usage from `sklearn/model_selection/_validation.py:490`:
# FitFailedWarning raised when error_score is a numeric value
# in cross-validation fit failures
UndefinedMetricWarning in classification metrics from `sklearn/metrics/_classification.py:1010-1014`:
# Check denominator == 0 exactly (safe for sum of positive terms)
# Uses UndefinedMetricWarning with stacklevel=2
# Customizable via `replace_undefined_by` parameter
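The zero-division path above is easy to reproduce: with no positive predictions, precision's denominator (TP + FP) is zero. A small sketch of the `zero_division` control:

```python
import warnings

from sklearn.exceptions import UndefinedMetricWarning
from sklearn.metrics import precision_score

y_true = [0, 0, 1, 1]
y_pred = [0, 0, 0, 0]  # no positive predictions: TP + FP == 0

# Default zero_division="warn": emits UndefinedMetricWarning, returns 0.0.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_score = precision_score(y_true, y_pred)
warned = any(issubclass(w.category, UndefinedMetricWarning) for w in caught)

# Setting zero_division explicitly silences the warning and pins the value.
pinned = precision_score(y_true, y_pred, zero_division=1.0)
```

In imbalanced Metric_Evaluation, empty classes in test folds trigger exactly this path, so an explicit `zero_division` makes cross-validated scores deterministic and warning-free.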
Non-finite score warning from `sklearn/model_selection/_search.py:1137-1140`:
# Warns if train/test scores are non-finite (NaN/inf)
# during grid search or randomized search
Penalty scaling note from `sklearn/linear_model/_logistic.py:309-320`:
# All solvers relying on LinearModelLoss need to scale penalty with n_samples
# because the objective is:
# C * sum(pointwise_loss) + penalty
# NOT:
# mean(pointwise_loss) + 1/C * penalty
sw_sum = n_samples # if sample_weight is None
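One observable consequence of the sum-based objective: duplicating the training set doubles `C * sum(pointwise_loss)` while leaving the penalty term unchanged, which acts like doubling `C` and weakens regularization. A rough sketch under stated assumptions (l2 penalty, lbfgs solver, synthetic separable data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 2)
y = (X[:, 0] - X[:, 1] > 0).astype(int)

def coef_norm(X, y):
    # Small C => strong l2 penalty, which makes the effect easy to see.
    clf = LogisticRegression(C=0.01, max_iter=1000).fit(X, y)
    return float(np.linalg.norm(clf.coef_))

base = coef_norm(X, y)

# Tiling the data doubles sum(pointwise_loss) but not the penalty term,
# so the effective per-sample regularization is halved and the
# coefficients move toward the unregularized solution.
doubled = coef_norm(np.tile(X, (2, 1)), np.tile(y, 2))
```

This is why the solvers above must rescale the penalty by `sw_sum` internally: without it, the strength of regularization would silently depend on dataset size.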
Related Pages
- Implementation:Scikit_learn_Scikit_learn_LogisticRegression_Fit
- Implementation:Scikit_learn_Scikit_learn_BaseSearchCV_Fit
- Implementation:Scikit_learn_Scikit_learn_Accuracy_Score
- Implementation:Scikit_learn_Scikit_learn_Cross_Validate
- Principle:Scikit_learn_Scikit_learn_Model_Training
- Principle:Scikit_learn_Scikit_learn_Metric_Evaluation