Principle:Scikit learn Scikit learn Classification Prediction
| Field | Value |
|---|---|
| sources | Bishop, C.M. (2006). Pattern Recognition and Machine Learning, Springer; Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning, 2nd ed., Springer |
| domains | Machine_Learning, Statistics |
| last_updated | 2026-02-08 15:00 GMT |
Overview
Classification prediction maps input feature vectors to discrete class labels using a trained model.
Description
Classification prediction is the process of applying a trained model to new, unseen feature vectors to produce discrete class label assignments. Once a classifier has been fitted (i.e., its parameters have been estimated from training data), prediction is a purely deterministic computation that maps each input sample to one of the known classes.
For linear classifiers, prediction involves two stages:
- Computing a decision function -- A linear combination of the input features and the learned weights produces a raw score (or scores, in the multiclass case) for each sample.
- Applying a decision rule -- The raw scores are converted into class labels. In binary classification, a threshold (typically zero) determines the class. In multiclass classification, the class with the highest score (argmax) is selected.
In scikit-learn, the predict(X) method encapsulates both stages and returns an array of class labels. The separate decision_function(X) method provides access to the raw scores, and predict_proba(X) (where available) returns per-class probability estimates. These estimates are not guaranteed to be well calibrated; how reliable they are depends on the estimator.
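The three methods can be compared side by side. The following is a minimal sketch, assuming LogisticRegression as the linear classifier and the bundled iris dataset as stand-in data; neither choice comes from the text above, and the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Assumption: iris (3 classes, 4 features) stands in for real data.
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

X_new = X[:5]  # pretend these are new, unseen samples

labels = clf.predict(X_new)            # class labels, shape (5,)
scores = clf.decision_function(X_new)  # raw scores, shape (5, 3)
proba = clf.predict_proba(X_new)       # probability estimates, shape (5, 3)

print(labels.shape, scores.shape, proba.shape)
print(np.allclose(proba.sum(axis=1), 1.0))  # each row sums to 1
```

Not every estimator exposes all three methods. LinearSVC, for example, provides decision_function but not predict_proba.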
Usage
Use classification prediction when:
- Generating predictions on test data -- After training, call predict(X_test) to obtain class labels for evaluation.
- Deploying a model in production -- The predict method is the primary inference interface for serving predictions to downstream systems.
- Analyzing decision boundaries -- Use decision_function to understand the model's confidence and visualize how the feature space is partitioned.
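As a rough illustration of the first and third use cases, the sketch below evaluates predictions on a held-out split and then uses the magnitude of decision_function as a confidence proxy. The dataset, the scaling pipeline, and the split parameters are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a binary dataset bundled with scikit-learn stands in for real data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Use case 1: class labels on test data, scored against the true labels.
y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(f"test accuracy: {acc:.3f}")

# Use case 3: signed scores; values near zero lie close to the decision boundary.
margins = model.decision_function(X_test)        # shape (n_test,) in the binary case
least_confident = np.argsort(np.abs(margins))[:5]
print("samples closest to the boundary:", least_confident)
```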
Theoretical Basis
Decision Function
For a linear classifier with weight matrix $W$ (shape: n_classes x n_features) and intercept vector $b$ (shape: n_classes), the decision function for a sample $x$ is:
$$f(x) = W x + b$$
This produces a score vector of length n_classes (or a single scalar in the binary case, where only one set of weights is stored).
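The decision function can be reproduced by hand from a fitted model's coef_ and intercept_ attributes. This is a sketch assuming LogisticRegression and the iris dataset; the names W, b, manual_scores, and manual_bin are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Multiclass: one row of weights and one intercept per class.
clf = LogisticRegression(max_iter=1000).fit(X, y)
W, b = clf.coef_, clf.intercept_      # shapes (3, 4) and (3,)
manual_scores = X @ W.T + b           # shape (n_samples, 3)
print(np.allclose(manual_scores, clf.decision_function(X)))

# Binary: only one set of weights is stored, so each sample gets a scalar score.
mask = y < 2                          # keep classes 0 and 1 only
clf_bin = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
manual_bin = (X[mask] @ clf_bin.coef_.T + clf_bin.intercept_).ravel()
print(clf_bin.coef_.shape, manual_bin.shape)
```

Because samples are stored as rows of X, the per-sample expression is applied to the whole batch as X @ W.T + b.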
Threshold-Based Classification
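Binary case: A single score $f(x)$ is computed. The predicted class index is:
$$\hat{k} = \begin{cases} 1 & \text{if } f(x) > 0 \\ 0 & \text{otherwise} \end{cases}$$
Multiclass case: A score vector $f(x) = (f_1(x), \dots, f_K(x))$ is computed, one entry per class, and the predicted class index is the one with the highest score:
$$\hat{k} = \arg\max_{k} f_k(x)$$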
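Both decision rules can be checked against predict directly. The sketch below assumes LogisticRegression on the iris dataset and a zero threshold in the binary case; the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Multiclass rule: pick the class whose score is highest.
multi = LogisticRegression(max_iter=1000).fit(X, y)
scores = multi.decision_function(X)                   # shape (n_samples, 3)
manual_multi = multi.classes_[np.argmax(scores, axis=1)]
print(np.array_equal(manual_multi, multi.predict(X)))

# Binary rule: threshold the single score at zero.
mask = y != 0                                         # keep classes 1 and 2 only
binary = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
score = binary.decision_function(X[mask])             # shape (n_samples,)
manual_binary = binary.classes_[(score > 0).astype(int)]
print(np.array_equal(manual_binary, binary.predict(X[mask])))
```

A positive score selects classes_[1] and a non-positive score selects classes_[0], which is why the binary labels here come out as 1 and 2 rather than 0 and 1.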
Class Label Mapping
The predicted index is mapped back to the original class label through the classes_ attribute, which stores the unique sorted class labels encountered during training. This ensures that predictions are returned in the same label space as the original training targets, regardless of whether those labels are integers, strings, or other types.
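The label mapping is easiest to see with string targets. The following sketch uses a tiny made-up dataset (the values and label names are hypothetical) and assumes LogisticRegression as the classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one feature, three string-valued classes.
X = np.array([[0.0], [0.2], [0.9], [1.1], [2.0], [2.2]])
y = np.array(["low", "low", "mid", "mid", "high", "high"])

clf = LogisticRegression().fit(X, y)

# classes_ holds the unique labels in sorted order, not order of appearance.
print(clf.classes_)  # ['high' 'low' 'mid']

# Index from the decision rule, then a lookup into classes_.
idx = np.argmax(clf.decision_function(X), axis=1)
labels = clf.classes_[idx]
print(np.array_equal(labels, clf.predict(X)))
```

The columns of decision_function and predict_proba follow the same order as classes_, so column 0 here corresponds to "high".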