Implementation:Scikit learn Scikit learn NearestCentroid

Knowledge Sources	Scikit_learn Scikit-learn Docs
Domains	Machine Learning, Classification
Last Updated	2026-02-08 15:00 GMT

Overview

Concrete tool for nearest centroid classification provided by scikit-learn.

Description

NearestCentroid is a simple classifier that represents each class by its centroid: the per-feature mean under the Euclidean metric, or the per-feature median under the Manhattan metric. A test sample is assigned to the class whose centroid is nearest. The estimator also supports optional centroid shrinkage (shrink_threshold), which pulls each class centroid toward the overall data centroid and can remove a feature's influence on the decision, and configurable class priors. It is computationally cheap, and its only settings are the distance metric, the shrink threshold, and the priors.
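The decision rule described above can be sketched in a few lines of NumPy. This is a minimal illustration of the idea for the Euclidean case only, not scikit-learn's implementation; the helper names fit_centroids and predict_nearest and the toy data are hypothetical.

```python
import numpy as np

def fit_centroids(X, y):
    """Return the sorted class labels and the per-class feature means."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_nearest(X, classes, centroids):
    """Assign each row of X to the class with the closest centroid (Euclidean)."""
    # Pairwise distances between samples and centroids, shape (n_samples, n_classes)
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Toy data (hypothetical): two well-separated classes in 2-D
X_toy = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]])
y_toy = np.array([0, 0, 1, 1])

classes, centroids = fit_centroids(X_toy, y_toy)
pred = predict_nearest(np.array([[1.0, 1.0], [9.0, 9.0]]), classes, centroids)
print(centroids)  # one row per class
print(pred)       # predicted label for each query point
```

Fitting amounts to one mean per class, and prediction to one distance computation per class, which is why the classifier is cheap in both phases.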

Usage

Use NearestCentroid when you need a fast, simple classifier with minimal tuning requirements. It works well when classes are well-separated and approximately convex, and is particularly effective for high-dimensional text classification with TF-IDF features.

Code Reference

Source Location

Repository: scikit-learn
File: sklearn/neighbors/_nearest_centroid.py

Signature

class NearestCentroid(
    DiscriminantAnalysisPredictionMixin, ClassifierMixin, BaseEstimator
):
    def __init__(
        self,
        metric="euclidean",
        *,
        shrink_threshold=None,
        priors="uniform",
    ):

Import

from sklearn.neighbors import NearestCentroid

I/O Contract

Inputs

Name	Type	Required	Description
metric	str	No	Distance metric: 'euclidean' or 'manhattan' (default='euclidean')
shrink_threshold	float or None	No	Threshold for shrinking centroids to remove features (default=None)
priors	str or array-like	No	Class prior probabilities: 'uniform', 'empirical', or array of shape (n_classes,) (default='uniform')
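As a hedged illustration of how metric and shrink_threshold affect the fitted centroids, the sketch below fits on the full iris dataset. The threshold value 0.5 is an arbitrary choice for demonstration, and priors is left at its default because that parameter may not be available in older scikit-learn releases.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import NearestCentroid

X, y = load_iris(return_X_y=True)

# Manhattan metric: each centroid is the per-feature median of its class.
clf_l1 = NearestCentroid(metric="manhattan").fit(X, y)
medians = np.stack([np.median(X[y == c], axis=0) for c in clf_l1.classes_])
matches_median = np.allclose(clf_l1.centroids_, medians)
print("Manhattan centroids equal class medians:", matches_median)

# Euclidean metric, with and without shrinkage (0.5 is an arbitrary value).
plain = NearestCentroid().fit(X, y)
shrunk = NearestCentroid(shrink_threshold=0.5).fit(X, y)

# Shrinkage moves every centroid coordinate toward the overall data mean.
overall = X.mean(axis=0)
pulled_in = (
    np.abs(shrunk.centroids_ - overall)
    <= np.abs(plain.centroids_ - overall) + 1e-12
)
print("All coordinates pulled toward overall mean:", pulled_in.all())
```

In practice a shrink threshold would be chosen by cross-validation rather than fixed by hand.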

Outputs

Name	Type	Description
centroids_	ndarray of shape (n_classes, n_features)	Centroid of each class
classes_	ndarray of shape (n_classes,)	The unique class labels
n_features_in_	int	Number of features seen during fit
feature_names_in_	ndarray of shape (n_features_in_,)	Names of features seen during fit; defined only when X has feature names that are all strings

Usage Examples

Basic Usage

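from sklearn.neighbors import NearestCentroid
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the iris data and hold out a test split (default test_size of 25%)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Fit one centroid per class using the Euclidean metric
clf = NearestCentroid(metric="euclidean")
clf.fit(X_train, y_train)

# Report held-out accuracy and the (n_classes, n_features) centroid array shape
print(f"Accuracy: {clf.score(X_test, y_test):.3f}")
print(f"Centroids shape: {clf.centroids_.shape}")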

Related Pages

Principle:Scikit_learn_Scikit_learn_Nearest_Neighbors
