Implementation:Scikit learn Scikit learn ClassificationThreshold
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Classification, Model Selection |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete meta-estimators, provided by scikit-learn, for setting or tuning the decision threshold of a binary classifier.
Description
The _classification_threshold module provides BaseThresholdClassifier and two concrete subclasses that control the decision threshold of a binary classifier. FixedThresholdClassifier applies a threshold chosen by the user, while TunedThresholdClassifierCV searches, via cross-validation, for the threshold that maximizes a given scoring metric (e.g., F1 score, balanced accuracy). Both replace the default cut-off (0.5 for predict_proba, 0 for decision_function), which is particularly useful for imbalanced datasets.
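To make the idea of a decision threshold concrete, the sketch below applies a cut-off to predicted probabilities by hand. It is an illustration only, not the module's internal code: the helper `predict_with_threshold` is hypothetical, and the `>=` tie rule and the 0/1 label encoding are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Imbalanced toy data: roughly 10% positives.
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Probability of the positive class (column 1 corresponds to label 1).
proba_pos = model.predict_proba(X)[:, 1]


def predict_with_threshold(proba, threshold):
    """Hypothetical helper: label a sample 1 when its score reaches the threshold."""
    return (proba >= threshold).astype(int)


default_pred = predict_with_threshold(proba_pos, 0.5)  # the usual cut-off
lowered_pred = predict_with_threshold(proba_pos, 0.2)  # flags more samples as positive

n_default = int(default_pred.sum())
n_lowered = int(lowered_pred.sum())
print(f"Predicted positives at 0.5: {n_default}, at 0.2: {n_lowered}")
```

Lowering the threshold can only keep or increase the number of predicted positives, which is why the cut-off trades precision against recall.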
Usage
Use these meta-estimators when you need to optimize the decision threshold of a binary classifier for a specific metric, rather than relying on the default threshold. This is common in scenarios with class imbalance or asymmetric misclassification costs.
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/model_selection/_classification_threshold.py
Signature
class BaseThresholdClassifier(ClassifierMixin, MetaEstimatorMixin, BaseEstimator):
    _parameter_constraints: dict = {
        "estimator": [...],
        "response_method": [StrOptions({"auto", "predict_proba", "decision_function"})],
    }

    def __init__(self, estimator, *, response_method="auto"):
        ...

    def fit(self, X, y, **params):
        ...

    def predict(self, X):
        ...


class FixedThresholdClassifier(BaseThresholdClassifier):
    def __init__(self, estimator, *, threshold="auto", pos_label=None, response_method="auto"):
        ...


class TunedThresholdClassifierCV(BaseThresholdClassifier):
    def __init__(self, estimator, *, scoring="balanced_accuracy", response_method="auto",
                 thresholds=100, cv=None, refit=True, n_jobs=None, random_state=None,
                 store_cv_results=False):
        ...
Import
from sklearn.model_selection import FixedThresholdClassifier, TunedThresholdClassifierCV
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| estimator | estimator instance | Yes | A binary classifier with predict_proba or decision_function |
| X | array-like of shape (n_samples, n_features) | Yes | Training input samples |
| y | array-like of shape (n_samples,) | Yes | Binary target values |
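| scoring | str or callable | No | Scoring metric to optimize the threshold for (TunedThresholdClassifierCV only; defaults to "balanced_accuracy") |
| cv | int or cross-validator | No | Cross-validation strategy for threshold tuning (TunedThresholdClassifierCV only; None selects 5-fold stratified cross-validation) |
| threshold | float or "auto" | No | Fixed threshold value (FixedThresholdClassifier only); "auto" means 0.5 for predict_proba and 0 for decision_function |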
Outputs
| Name | Type | Description |
|---|---|---|
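| predictions | ndarray of shape (n_samples,) | Class labels returned by predict, obtained by applying the fixed or tuned threshold |
| best_threshold_ | float | The tuned decision threshold (fitted attribute of TunedThresholdClassifierCV) |
| best_score_ | float | The cross-validated score achieved at the tuned threshold (fitted attribute of TunedThresholdClassifierCV) |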
Usage Examples
Basic Usage
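from sklearn.model_selection import TunedThresholdClassifierCV
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Imbalanced binary problem: about 90% negatives, 10% positives.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)

# Search for the threshold that maximizes F1 using 5-fold cross-validation.
clf = TunedThresholdClassifierCV(
    LogisticRegression(),
    scoring="f1",
    cv=5,
)
clf.fit(X, y)
print(f"Best threshold: {clf.best_threshold_:.3f}")

# predict applies the tuned threshold instead of the default cut-off.
predictions = clf.predict(X)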