Implementation:Scikit learn Scikit learn NeighborhoodComponentsAnalysis

Knowledge Sources	Scikit_learn Scikit-learn Docs
Domains	Machine Learning, Metric Learning
Last Updated	2026-02-08 15:00 GMT

Overview

A concrete scikit-learn tool for supervised metric learning via Neighborhood Components Analysis.

Description

NeighborhoodComponentsAnalysis (NCA) is a supervised dimensionality reduction and metric learning algorithm. It learns a linear transformation that maximizes the expected leave-one-out classification accuracy of a stochastic nearest neighbors rule in the transformed space. The optimization is performed with SciPy's L-BFGS-B solver. The estimator supports several initialization strategies: PCA, LDA, identity, random, or a user-supplied array.
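The stochastic nearest neighbors rule can be made concrete with a small NumPy sketch. It assumes the standard NCA formulation: each point picks a neighbor with probability given by a softmax over negative squared distances in the transformed space, and never picks itself. The function name nca_objective and the toy data are illustrative and not part of scikit-learn, whose internal implementation differs.

```python
import numpy as np


def nca_objective(L, X, y):
    """Expected number of points classified correctly under the stochastic
    nearest neighbors rule, for a linear map L of shape (n_components, n_features)."""
    Z = X @ L.T  # project the data into the transformed space
    # Pairwise squared Euclidean distances between projected points.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point never selects itself (p_ii = 0)
    # Row-wise softmax over negative distances, shifted for numerical stability.
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    # Keep only the probability mass assigned to same-class neighbors.
    same_class = y[:, None] == y[None, :]
    return (p * same_class).sum()


# Toy data (illustrative): two tight, well-separated clusters.
X_toy = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
y_toy = np.array([0, 0, 1, 1])

print(nca_objective(np.eye(2), X_toy, y_toy))         # identity map
print(nca_objective(np.zeros((1, 2)), X_toy, y_toy))  # map that collapses all points
```

With the identity map the value is close to the number of samples, because each point almost surely picks its same-class neighbor. A map that collapses every point to the origin makes all neighbors equally likely, so the value drops. NCA searches for the L that maximizes this quantity.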

Usage

Use NeighborhoodComponentsAnalysis when you want to learn a distance metric or projection that improves nearest neighbor classification performance. It is particularly useful as a preprocessing step before applying k-nearest neighbors classifiers.

Code Reference

Source Location

Repository: scikit-learn
File: sklearn/neighbors/_nca.py

Signature

class NeighborhoodComponentsAnalysis(
    ClassNamePrefixFeaturesOutMixin, TransformerMixin, BaseEstimator
):
    def __init__(
        self,
        n_components=None,
        *,
        init="auto",
        warm_start=False,
        max_iter=50,
        tol=1e-5,
        callback=None,
        verbose=0,
        random_state=None,
    ):

Import

from sklearn.neighbors import NeighborhoodComponentsAnalysis

I/O Contract

Inputs

Name	Type	Required	Description
n_components	int or None	No	Preferred dimensionality of the projected space (default=None, uses n_features)
init	str or ndarray	No	Initialization method: 'auto', 'pca', 'lda', 'identity', 'random', or ndarray (default='auto')
warm_start	bool	No	Whether to reuse solution of previous fit as initialization (default=False)
max_iter	int	No	Maximum number of iterations in the optimization (default=50)
tol	float	No	Convergence tolerance for the optimization (default=1e-5)
callback	callable or None	No	Called after each iteration of the optimizer with the current flattened transformation and the iteration count (default=None)
verbose	int	No	Verbosity level (default=0)
random_state	int, RandomState, or None	No	Random state for reproducibility (default=None)
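As a hedged illustration of the init, max_iter, and callback parameters, the sketch below records what the optimizer passes to the callback. The history list and the record function are hypothetical names introduced for this example. The run may emit a ConvergenceWarning if the optimizer stops at max_iter.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import NeighborhoodComponentsAnalysis

X, y = load_iris(return_X_y=True)

history = []  # hypothetical container, not part of scikit-learn


def record(transformation, n_iter):
    # The callback receives the current solution as a flattened array
    # and the iteration count.
    history.append((n_iter, transformation.size))


nca = NeighborhoodComponentsAnalysis(
    n_components=2,   # project the 4 iris features down to 2
    init="pca",       # start from a PCA projection instead of "auto"
    max_iter=20,
    callback=record,
    random_state=0,
)
nca.fit(X, y)

print(len(history))  # number of callback invocations
print(history[0])    # (iteration count, size of the flattened transformation)
```

Each recorded size equals n_components * n_features (8 here), which reflects that the solver works on the flattened matrix rather than on the 2-D components_ array.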

Outputs

Name	Type	Description
components_	ndarray of shape (n_components, n_features)	The learned linear transformation
n_features_in_	int	Number of features seen during fit
n_iter_	int	Number of iterations run by the optimizer
random_state_	RandomState	Pseudo random number generator object used during fitting
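The fitted attributes listed above can be inspected directly. The sketch below is a minimal example on the iris data; it also checks that transform applies the learned linear map, i.e. that the embedding equals X multiplied by the transpose of components_.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import NeighborhoodComponentsAnalysis

X, y = load_iris(return_X_y=True)
nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=0).fit(X, y)

print(nca.components_.shape)  # (2, 4): (n_components, n_features)
print(nca.n_features_in_)     # 4
print(nca.n_iter_)            # iterations run by the optimizer

# transform applies the learned linear map to the input.
X_embedded = nca.transform(X)
print(np.allclose(X_embedded, X @ nca.components_.T))
```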

Usage Examples

Basic Usage

from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=42)
knn = KNeighborsClassifier(n_neighbors=3)

pipe = Pipeline([("nca", nca), ("knn", knn)])
pipe.fit(X_train, y_train)
print(pipe.score(X_test, y_test))

Related Pages

Principle:Scikit_learn_Scikit_learn_Nearest_Neighbors
