Heuristic:Scikit learn Scikit learn Convergence Warning Handling
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Debugging |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Diagnostic patterns for handling ConvergenceWarning, FitFailedWarning, and UndefinedMetricWarning during model training and evaluation.
Description
Scikit-learn uses a structured warning system to communicate non-fatal issues during model training and evaluation. The three most important warnings are: ConvergenceWarning (solver did not converge within `max_iter`), FitFailedWarning (a fit call failed during cross-validation), and UndefinedMetricWarning (metric calculation encountered zero division). Each has specific diagnostic actions and solutions. Ignoring these warnings can lead to silently degraded model performance.
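Because all three are ordinary Python warning categories importable from `sklearn.exceptions`, they can be promoted to exceptions with the standard `warnings` filters so they cannot be missed during debugging. A minimal sketch on synthetic, deliberately unscaled data:

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

# Synthetic data whose second feature is ~1e4 times larger than the first,
# which conditions the loss landscape poorly for the lbfgs solver.
rng = np.random.RandomState(0)
X = rng.randn(200, 2) * np.array([1.0, 1e4])
y = (X[:, 0] + X[:, 1] / 1e4 > 0).astype(int)

converged = True
with warnings.catch_warnings():
    # Promote ConvergenceWarning to an exception instead of a console message.
    warnings.simplefilter("error", category=ConvergenceWarning)
    try:
        LogisticRegression(max_iter=5).fit(X, y)  # far too few iterations
    except ConvergenceWarning:
        converged = False
```

The same filter works for FitFailedWarning and UndefinedMetricWarning; in test suites, an `"error"` filter is a common way to make silent degradation fail loudly.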
Usage
Apply this heuristic when LogisticRegression_Fit or BaseSearchCV_Fit emits warnings during execution. It is particularly important during Grid_Search (hyperparameter tuning) where some parameter combinations may not converge, and during Metric_Evaluation where class imbalance can trigger undefined metrics.
The Insight (Rule of Thumb)
- Action (ConvergenceWarning): Increase `max_iter` parameter, scale features with StandardScaler, or try a different solver.
- Action (FitFailedWarning): Set `error_score='raise'` in GridSearchCV to see full traceback instead of silently continuing.
- Action (UndefinedMetricWarning): Check for empty classes in test folds; use `zero_division` parameter to control behavior.
- Value: Default `max_iter=100` for LogisticRegression is often insufficient for unscaled data; typical fix is `max_iter=1000` or feature scaling.
- Trade-off: Increasing `max_iter` costs computation time but ensures convergence.
Reasoning
ConvergenceWarning is the most commonly encountered warning in scikit-learn. It typically indicates that the optimization algorithm ran out of iterations before meeting its convergence tolerance (`tol`). This can happen because: (1) features are not scaled, causing the loss landscape to be poorly conditioned, (2) the regularization parameter `C` is too large, requiring more iterations, or (3) the solver is inappropriate for the problem structure. The recommended first step is always to scale features using StandardScaler, which dramatically reduces the number of iterations needed.
FitFailedWarning in cross-validation contexts means some train/test fold combinations caused errors. By default, GridSearchCV sets `error_score=np.nan` and continues, which can mask important errors.
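A sketch of both behaviors, using a deliberately invalid `solver` value so one grid candidate always fails (synthetic data; the grid values are illustrative):

```python
import warnings

import numpy as np
from sklearn.exceptions import FitFailedWarning
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.RandomState(0)
X, y = rng.randn(60, 4), rng.randint(0, 2, 60)

# "bogus" is not a valid solver, so every fold for that candidate fails.
grid = {"solver": ["lbfgs", "bogus"]}

# Default error_score=np.nan: failed fits become NaN scores, a
# FitFailedWarning is emitted, and the search silently continues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    search = GridSearchCV(LogisticRegression(max_iter=1000), grid, cv=3)
    search.fit(X, y)
n_fit_failed = sum(issubclass(w.category, FitFailedWarning) for w in caught)

# error_score='raise': the underlying exception propagates with its full
# traceback instead of being masked as a NaN score.
try:
    GridSearchCV(
        LogisticRegression(max_iter=1000), grid, cv=3, error_score="raise"
    ).fit(X, y)
    raised = False
except ValueError:  # the invalid solver raises a ValueError inside fit
    raised = True
```

The first search still completes and selects the surviving candidate, which is exactly the silent-continuation behavior that `error_score='raise'` is meant to surface.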
Code Evidence
ConvergenceWarning definition from `sklearn/exceptions.py:68-73`:
class ConvergenceWarning(UserWarning):
    """Custom warning to capture convergence problems.

    .. versionchanged:: 0.18
        Moved from sklearn.utils.
    """
FitFailedWarning usage from `sklearn/model_selection/_validation.py:490`:
# FitFailedWarning raised when error_score is a numeric value
# in cross-validation fit failures
UndefinedMetricWarning in classification metrics from `sklearn/metrics/_classification.py:1010-1014`:
# Check denominator == 0 exactly (safe for sum of positive terms)
# Uses UndefinedMetricWarning with stacklevel=2
# Customizable via `replace_undefined_by` parameter
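The zero-division path above is easy to reproduce: with no positive predictions, precision's denominator (TP + FP) is zero. A small sketch of the `zero_division` control:

```python
import warnings

from sklearn.exceptions import UndefinedMetricWarning
from sklearn.metrics import precision_score

y_true = [0, 0, 1, 1]
y_pred = [0, 0, 0, 0]  # no positive predictions: TP + FP == 0

# Default zero_division="warn": emits UndefinedMetricWarning, returns 0.0.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_score = precision_score(y_true, y_pred)
warned = any(issubclass(w.category, UndefinedMetricWarning) for w in caught)

# Setting zero_division explicitly silences the warning and pins the value.
pinned = precision_score(y_true, y_pred, zero_division=1.0)
```

In imbalanced Metric_Evaluation, empty classes in test folds trigger exactly this path, so an explicit `zero_division` makes cross-validated scores deterministic and warning-free.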
Non-finite score warning from `sklearn/model_selection/_search.py:1137-1140`:
# Warns if train/test scores are non-finite (NaN/inf)
# during grid search or randomized search
Penalty scaling note from `sklearn/linear_model/_logistic.py:309-320`:
# All solvers relying on LinearModelLoss need to scale penalty with n_samples
# because the objective is:
# C * sum(pointwise_loss) + penalty
# NOT:
# mean(pointwise_loss) + 1/C * penalty
sw_sum = n_samples # if sample_weight is None
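One observable consequence of the sum-based objective: duplicating the training set doubles `C * sum(pointwise_loss)` while leaving the penalty term unchanged, which acts like doubling `C` and weakens regularization. A rough sketch under stated assumptions (l2 penalty, lbfgs solver, synthetic separable data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 2)
y = (X[:, 0] - X[:, 1] > 0).astype(int)

def coef_norm(X, y):
    # Small C => strong l2 penalty, which makes the effect easy to see.
    clf = LogisticRegression(C=0.01, max_iter=1000).fit(X, y)
    return float(np.linalg.norm(clf.coef_))

base = coef_norm(X, y)

# Tiling the data doubles sum(pointwise_loss) but not the penalty term,
# so the effective per-sample regularization is halved and the
# coefficients move toward the unregularized solution.
doubled = coef_norm(np.tile(X, (2, 1)), np.tile(y, 2))
```

This is why the solvers above must rescale the penalty by `sw_sum` internally: without it, the strength of regularization would silently depend on dataset size.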
Related Pages
- Implementation:Scikit_learn_Scikit_learn_LogisticRegression_Fit
- Implementation:Scikit_learn_Scikit_learn_BaseSearchCV_Fit
- Implementation:Scikit_learn_Scikit_learn_Accuracy_Score
- Implementation:Scikit_learn_Scikit_learn_Cross_Validate
- Principle:Scikit_learn_Scikit_learn_Model_Training
- Principle:Scikit_learn_Scikit_learn_Metric_Evaluation