Implementation:Scikit learn Scikit learn DummyClassifier
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Model Evaluation |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool, provided by scikit-learn, for building baseline dummy classifiers and regressors that make predictions without using the input features.
Description
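DummyClassifier makes predictions that ignore the input features, serving as a simple baseline to compare against more complex classifiers. It supports the following strategies:
- 'most_frequent': always predicts the most frequent class in the training labels; predict_proba returns a one-hot vector for that class.
- 'prior': also always predicts the most frequent class, but predict_proba returns the empirical class distribution of the training labels.
- 'stratified': draws each prediction at random according to the class distribution of the training labels.
- 'uniform': draws each prediction uniformly at random from the classes seen during fit.
- 'constant': always predicts a constant label supplied by the user.
The module also includes DummyRegressor, which provides analogous baseline strategies for regression tasks.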
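The difference between the strategies is easiest to see on a small imbalanced label set. The following is a minimal sketch; the toy arrays `X`, `y`, and `X_new` are made up for illustration, and the features are all zeros or ones precisely because the estimators never look at them.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Toy data (illustrative only): features are placeholders that get ignored.
X = np.zeros((10, 2))
y = np.array([0] * 7 + [1] * 3)  # class 0 makes up 70% of the labels
X_new = np.ones((4, 2))

most_frequent = DummyClassifier(strategy="most_frequent").fit(X, y)
prior = DummyClassifier(strategy="prior").fit(X, y)
constant = DummyClassifier(strategy="constant", constant=1).fit(X, y)
stratified = DummyClassifier(strategy="stratified", random_state=0).fit(X, y)
uniform = DummyClassifier(strategy="uniform", random_state=0).fit(X, y)

# Deterministic strategies: the predicted label never depends on X_new.
pred_most_frequent = most_frequent.predict(X_new)
pred_prior = prior.predict(X_new)
pred_constant = constant.predict(X_new)

# 'most_frequent' and 'prior' agree on predict but differ on predict_proba.
proba_most_frequent = most_frequent.predict_proba(X_new)
proba_prior = prior.predict_proba(X_new)

# Random strategies: labels are sampled, so results depend on random_state.
pred_stratified = stratified.predict(X_new)
pred_uniform = uniform.predict(X_new)

print("most_frequent:", pred_most_frequent, proba_most_frequent[0])
print("prior:        ", pred_prior, proba_prior[0])
print("constant:     ", pred_constant)
print("stratified:   ", pred_stratified)
print("uniform:      ", pred_uniform)
```

When a probabilistic metric such as log loss is part of the evaluation, 'prior' is usually the more informative baseline of the two deterministic options, because its predicted probabilities reflect the class balance rather than a hard one-hot guess.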
Usage
Use DummyClassifier and DummyRegressor as baselines to verify that your actual models perform better than simple heuristics. A model that cannot beat the dummy baseline on the same data and metric is most likely not learning useful patterns from the features.
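A baseline comparison along these lines can be sketched as follows. The choice of the iris dataset, `DecisionTreeClassifier`, and 5-fold cross-validation is an assumption made for illustration; any estimator and metric can be substituted, as long as both models are scored in the same way.

```python
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Baseline and candidate model are evaluated with identical folds and metric.
baseline = DummyClassifier(strategy="most_frequent")
model = DecisionTreeClassifier(random_state=0)

baseline_acc = cross_val_score(baseline, X, y, cv=5).mean()
model_acc = cross_val_score(model, X, y, cv=5).mean()

print(f"Baseline accuracy: {baseline_acc:.3f}")
print(f"Model accuracy:    {model_acc:.3f}")
print(f"Improvement:       {model_acc - baseline_acc:.3f}")
```

On imbalanced data the baseline accuracy can be deceptively high, so the gap between the two scores is generally more meaningful than the model score on its own.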
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/dummy.py
Signature
class DummyClassifier(MultiOutputMixin, ClassifierMixin, BaseEstimator):
    def __init__(self, *, strategy="prior", random_state=None, constant=None):

class DummyRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
    def __init__(self, *, strategy="mean", constant=None, quantile=None):
Import
from sklearn.dummy import DummyClassifier, DummyRegressor
I/O Contract
Inputs (DummyClassifier)
| Name | Type | Required | Description |
|---|---|---|---|
| strategy | str | No | Prediction strategy: 'most_frequent', 'prior', 'stratified', 'uniform', 'constant' (default='prior') |
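| random_state | int, RandomState instance, or None | No | Controls the randomness of predictions for the 'stratified' and 'uniform' strategies; pass an int for reproducible output (default=None) |
| constant | str, int, or array-like | No | Label to predict when strategy='constant'; required for that strategy and ignored otherwise (default=None) |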
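The effect of `random_state` and the constraint on `constant` can be illustrated with a short sketch. The toy data is made up, and the claim that fitting with an unseen constant label raises `ValueError` reflects scikit-learn's behavior as understood here, so it is worth confirming against the installed version.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Toy data (illustrative only).
X = np.zeros((20, 1))
y = np.array([0] * 15 + [1] * 5)
X_new = np.zeros((50, 1))

# Same integer seed: two independent estimators produce identical predictions.
run_a = DummyClassifier(strategy="stratified", random_state=42).fit(X, y).predict(X_new)
run_b = DummyClassifier(strategy="stratified", random_state=42).fit(X, y).predict(X_new)
same_predictions = bool(np.array_equal(run_a, run_b))
print("Reproducible with fixed seed:", same_predictions)

# A constant label that never occurs in y is rejected at fit time.
unseen_constant_rejected = False
try:
    DummyClassifier(strategy="constant", constant=2).fit(X, y)
except ValueError as exc:
    unseen_constant_rejected = True
    print("Rejected unseen constant:", exc)
```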
Inputs (DummyRegressor)
| Name | Type | Required | Description |
|---|---|---|---|
| strategy | str | No | Prediction strategy: 'mean', 'median', 'quantile', 'constant' (default='mean') |
| constant | float or array-like | No | Constant value to predict when strategy='constant' |
| quantile | float | No | Quantile to predict when strategy='quantile', in the range [0.0, 1.0]; 0.5 gives the median, 0.0 the minimum, and 1.0 the maximum (default=None) |
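The regressor strategies can be compared on a small target vector with one outlier. This is a minimal sketch with made-up data; the target values are chosen so that the mean and the median differ visibly.

```python
import numpy as np
from sklearn.dummy import DummyRegressor

# Toy data (illustrative only): features are ignored, y contains an outlier.
X = np.zeros((5, 1))
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
X_new = np.zeros((2, 1))

mean_reg = DummyRegressor(strategy="mean").fit(X, y)
median_reg = DummyRegressor(strategy="median").fit(X, y)
max_reg = DummyRegressor(strategy="quantile", quantile=1.0).fit(X, y)
const_reg = DummyRegressor(strategy="constant", constant=10.0).fit(X, y)

# Every row of X_new receives the same value, whatever its features are.
pred_mean = mean_reg.predict(X_new)
pred_median = median_reg.predict(X_new)
pred_max = max_reg.predict(X_new)
pred_const = const_reg.predict(X_new)

print("mean:         ", pred_mean)
print("median:       ", pred_median)
print("quantile=1.0: ", pred_max)
print("constant=10.0:", pred_const)
```

The median baseline is less sensitive to the outlier than the mean baseline, which can matter when the evaluation metric is mean absolute error rather than mean squared error.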
Outputs
| Name | Type | Description |
|---|---|---|
| classes_ | ndarray of shape (n_classes,) or list of such arrays | Unique class labels observed in y; a list with one array per output for multi-output targets (DummyClassifier only) |
| n_classes_ | int or list of int | Number of classes for each output (DummyClassifier only) |
| class_prior_ | ndarray of shape (n_classes,) or list of such arrays | Relative frequency of each class observed in y (DummyClassifier only) |
| n_outputs_ | int | Number of outputs |
| constant_ | ndarray of shape (1, n_outputs) | Mean, median, or quantile of the training targets, or the user-supplied constant (DummyRegressor only) |
| n_features_in_ | int | Number of features seen during fit |
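The fitted attributes listed above can be inspected directly after calling `fit`. The sketch below uses made-up single-output data with string labels; for multi-output targets the classifier attributes would be lists with one entry per output instead.

```python
import numpy as np
from sklearn.dummy import DummyClassifier, DummyRegressor

# Toy data (illustrative only).
X = np.zeros((4, 3))
y_class = np.array(["cat", "cat", "cat", "dog"])
y_reg = np.array([1.0, 2.0, 3.0, 6.0])

clf = DummyClassifier(strategy="prior").fit(X, y_class)
reg = DummyRegressor(strategy="mean").fit(X, y_reg)

print("classes_:    ", clf.classes_)       # unique labels seen in y_class
print("n_classes_:  ", clf.n_classes_)     # number of unique labels
print("class_prior_:", clf.class_prior_)   # relative frequency of each label
print("n_outputs_:  ", clf.n_outputs_)     # single-output target here
print("constant_:   ", reg.constant_)      # 2-D array holding the mean of y_reg
```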
Usage Examples
Basic Usage
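from sklearn.dummy import DummyClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Hold out a test split so the baseline is scored on unseen data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Always predict the most frequent class found in y_train.
dummy = DummyClassifier(strategy="most_frequent")
dummy.fit(X_train, y_train)

# score() reports mean accuracy; real models should exceed this value.
print(f"Baseline accuracy: {dummy.score(X_test, y_test):.3f}")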