Implementation:Scikit learn Scikit learn NearestCentroid

From Leeroopedia


Knowledge Sources
Domains Machine Learning, Classification
Last Updated 2026-02-08 15:00 GMT

Overview

Concrete tool for nearest centroid classification provided by scikit-learn.

Description

NearestCentroid is a simple classification algorithm in which each class is represented by its centroid: the per-feature mean when metric='euclidean', or the per-feature median when metric='manhattan'. A test sample is assigned to the class whose centroid is nearest. Optional centroid shrinkage (shrink_threshold) pulls each class centroid toward the overall data centroid, which removes the influence of features that barely differ between classes. Class priors are also configurable. The classifier is computationally efficient and has little to tune: the distance metric, the optional shrink threshold, and the priors.
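The classification rule can be sketched in plain Python. This is an illustrative sketch only, not scikit-learn's implementation: the helper names fit_centroids, distance, and predict are hypothetical, and shrinkage and priors are left out.

```python
from statistics import mean, median


def fit_centroids(X, y, metric="euclidean"):
    """Return {label: centroid}: per-feature mean (euclidean) or median (manhattan)."""
    reduce = mean if metric == "euclidean" else median
    rows_by_class = {}
    for row, label in zip(X, y):
        rows_by_class.setdefault(label, []).append(row)
    # zip(*rows) iterates over feature columns within one class.
    return {
        label: [reduce(column) for column in zip(*rows)]
        for label, rows in rows_by_class.items()
    }


def distance(a, b, metric="euclidean"):
    """Euclidean or Manhattan distance between two equal-length vectors."""
    if metric == "euclidean":
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
    return sum(abs(p - q) for p, q in zip(a, b))


def predict(centroids, sample, metric="euclidean"):
    """Return the label whose centroid is closest to the sample."""
    return min(centroids, key=lambda label: distance(centroids[label], sample, metric))


# Toy data (hypothetical): two well-separated classes in two dimensions.
X = [[0.0, 0.0], [1.0, 1.0], [9.0, 9.0], [10.0, 10.0]]
y = ["a", "a", "b", "b"]

centroids = fit_centroids(X, y)
print(centroids)                       # {'a': [0.5, 0.5], 'b': [9.5, 9.5]}
print(predict(centroids, [2.0, 3.0]))  # a
```

Fitting only stores one vector per class, so prediction cost grows with the number of classes rather than the number of training samples.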

Usage

Use NearestCentroid when you need a fast, simple classifier with minimal tuning. It works best when classes are well separated, roughly convex, and of similar variance. It is also a common baseline for high-dimensional text classification with TF-IDF features, where it is known as the Rocchio classifier.
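For the text-classification case, a minimal sketch might chain TfidfVectorizer and NearestCentroid in a pipeline. The four documents and their labels below are made-up illustration data, far too small for a realistic evaluation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus: two documents per class.
docs = [
    "the striker scored a late goal",
    "the goalkeeper saved the penalty",
    "the central bank raised interest rates",
    "stock markets fell after the rate decision",
]
labels = ["sports", "sports", "finance", "finance"]

# TF-IDF produces a sparse matrix, which NearestCentroid accepts directly.
text_clf = make_pipeline(TfidfVectorizer(), NearestCentroid())
text_clf.fit(docs, labels)

predicted = text_clf.predict(["the bank cut interest rates"])
print(predicted[0])
```

Wrapping both steps in one pipeline keeps the vocabulary fitted on the training documents only, which matters once the model is cross-validated.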

Code Reference

Source Location

Signature

class NearestCentroid(
    DiscriminantAnalysisPredictionMixin, ClassifierMixin, BaseEstimator
):
    def __init__(
        self,
        metric="euclidean",
        *,
        shrink_threshold=None,
        priors="uniform",
    ):

Import

from sklearn.neighbors import NearestCentroid

I/O Contract

Inputs

Name | Type | Required | Description
metric | str | No | Distance metric: 'euclidean' or 'manhattan' (default='euclidean')
shrink_threshold | float or None | No | Threshold for shrinking centroids to remove features (default=None)
priors | str or array-like | No | Class prior probabilities: 'uniform', 'empirical', or array of shape (n_classes,) (default='uniform')
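Because shrink_threshold is the main tuning knob, one way to choose it is to compare cross-validated accuracy over a few candidate values. The sketch below assumes the iris dataset and an arbitrary candidate list; neither comes from the page itself.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import NearestCentroid

X, y = load_iris(return_X_y=True)

# Candidate thresholds are arbitrary; None disables shrinkage.
scores = {}
for threshold in (None, 0.1, 0.5, 1.0):
    clf = NearestCentroid(shrink_threshold=threshold)
    scores[threshold] = cross_val_score(clf, X, y, cv=5).mean()

for threshold, score in scores.items():
    print(f"shrink_threshold={threshold}: mean CV accuracy {score:.3f}")
```

On a dataset with many uninformative features, larger thresholds tend to matter more. With only four informative features, as here, differences may be small.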

Outputs

Name | Type | Description
centroids_ | ndarray of shape (n_classes, n_features) | Centroid of each class
classes_ | ndarray of shape (n_classes,) | The unique class labels
n_features_in_ | int | Number of features seen during fit
feature_names_in_ | ndarray of shape (n_features_in_,) | Names of features seen during fit; defined only when X has feature names that are all strings
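The fitted attributes can be inspected directly. The following sketch, again assuming the iris dataset, checks that without shrinkage the stored centroids match the per-class mean (Euclidean) or median (Manhattan) computed by hand.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import NearestCentroid

X, y = load_iris(return_X_y=True)

euclid = NearestCentroid(metric="euclidean").fit(X, y)
manhattan = NearestCentroid(metric="manhattan").fit(X, y)

# Recompute the centroids by hand for comparison.
class_means = np.array([X[y == c].mean(axis=0) for c in euclid.classes_])
class_medians = np.array([np.median(X[y == c], axis=0) for c in manhattan.classes_])

print(euclid.classes_)           # [0 1 2]
print(euclid.centroids_.shape)   # (3, 4)
print(euclid.n_features_in_)     # 4
print(np.allclose(euclid.centroids_, class_means))       # True
print(np.allclose(manhattan.centroids_, class_medians))  # True
# X is a plain NumPy array here, so feature_names_in_ is not set.
print(hasattr(euclid, "feature_names_in_"))              # False
```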

Usage Examples

Basic Usage

from sklearn.neighbors import NearestCentroid
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = NearestCentroid(metric="euclidean")
clf.fit(X_train, y_train)
print(f"Accuracy: {clf.score(X_test, y_test):.3f}")
print(f"Centroids shape: {clf.centroids_.shape}")
