Implementation:Scikit learn Scikit learn NeighborhoodComponentsAnalysis


Knowledge Sources
Domains: Machine Learning, Metric Learning
Last Updated: 2026-02-08 15:00 GMT

Overview

A concrete tool for supervised metric learning via Neighborhood Components Analysis, provided by scikit-learn.

Description

NeighborhoodComponentsAnalysis (NCA) is a supervised dimensionality reduction and metric learning algorithm. It learns a linear transformation that maximizes the expected leave-one-out classification accuracy of a stochastic nearest neighbors rule in the transformed space. The transformation is optimized with SciPy's L-BFGS-B solver and can be initialized from PCA, LDA, the identity, a random projection, or a user-supplied array.
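To make the objective in the description more concrete, the following is a minimal NumPy sketch of the quantity NCA tries to maximize: the expected number of points whose stochastically chosen neighbor shares their label. The function name `nca_objective` and the toy data are illustrative assumptions. This is not scikit-learn's internal code, which also computes gradients and hands the problem to the optimizer.

```python
import numpy as np


def nca_objective(A, X, y):
    """Expected number of correctly classified points under a stochastic
    nearest neighbors rule, for a linear map A of shape
    (n_components, n_features). Illustrative helper, not a library function."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    Z = X @ np.asarray(A, dtype=float).T  # project the data

    # Pairwise squared Euclidean distances in the projected space.
    sq_dists = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    # Leave-one-out: a point is never allowed to pick itself.
    np.fill_diagonal(sq_dists, np.inf)

    # Row-wise softmax over negative distances:
    # p[i, j] is the probability that point i picks point j as its neighbor.
    logits = -sq_dists
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)

    # Sum the probability mass that lands on same-class neighbors.
    same_class = y[:, None] == y[None, :]
    return float((p * same_class).sum())


# Toy data (assumed for illustration): two well-separated classes.
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
y_toy = np.array([0, 0, 1, 1])

print(nca_objective(np.eye(2), X_toy, y_toy))         # identity map
print(nca_objective(np.zeros((1, 2)), X_toy, y_toy))  # map that collapses all points
```

A transformation that keeps the classes apart scores close to the number of samples, while one that collapses every point onto the same location scores much lower, which is the signal the optimizer follows.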

Usage

Use NeighborhoodComponentsAnalysis when you want to learn a distance metric or projection that improves nearest neighbor classification performance. It is particularly useful as a preprocessing step before applying k-nearest neighbors classifiers.

Code Reference

Signature

class NeighborhoodComponentsAnalysis(
    ClassNamePrefixFeaturesOutMixin, TransformerMixin, BaseEstimator
):
    def __init__(
        self,
        n_components=None,
        *,
        init="auto",
        warm_start=False,
        max_iter=50,
        tol=1e-5,
        callback=None,
        verbose=0,
        random_state=None,
    ):

Import

from sklearn.neighbors import NeighborhoodComponentsAnalysis

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| n_components | int or None | No | Preferred dimensionality of the projected space (default=None, uses n_features) |
| init | str or ndarray | No | Initialization method: 'auto', 'pca', 'lda', 'identity', 'random', or ndarray (default='auto') |
| warm_start | bool | No | Whether to reuse solution of previous fit as initialization (default=False) |
| max_iter | int | No | Maximum number of iterations in the optimization (default=50) |
| tol | float | No | Convergence tolerance for the optimization (default=1e-5) |
| callback | callable or None | No | Called after each iteration of the optimizer |
| verbose | int | No | Verbosity level (default=0) |
| random_state | int, RandomState, or None | No | Random state for reproducibility |
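As a hedged illustration of how several of these parameters fit together, the sketch below picks an explicit initialization and uses `callback` to record optimizer progress. The helper name `record_iteration` and the choice of the iris dataset are assumptions made for this example. The callback is written to accept the current flattened transformation and an iteration counter, which is how scikit-learn documents the hook.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import NeighborhoodComponentsAnalysis

X, y = load_iris(return_X_y=True)

seen_iterations = []      # iteration counters passed to the callback
transformation_sizes = []  # size of the flattened transformation at each call


def record_iteration(transformation, n_iter):
    # Hypothetical helper: store what the optimizer reports at each iteration.
    seen_iterations.append(n_iter)
    transformation_sizes.append(transformation.size)


nca = NeighborhoodComponentsAnalysis(
    n_components=2,
    init="pca",            # start from a PCA projection instead of 'auto'
    max_iter=50,
    tol=1e-5,
    callback=record_iteration,
    random_state=0,
)
nca.fit(X, y)

print(len(seen_iterations), "callback calls")
print(nca.components_.shape)
```

Recording progress this way can help when deciding whether `max_iter` or `tol` needs adjusting for a given dataset.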

Outputs

| Name | Type | Description |
|---|---|---|
| components_ | ndarray of shape (n_components, n_features) | The learned linear transformation |
| n_features_in_ | int | Number of features seen during fit |
| n_iter_ | int | Number of iterations run by the optimizer |
| random_state_ | RandomState | Pseudo random number generator object used during fitting |
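The fitted attributes can be inspected directly after `fit`. The short sketch below, which assumes the iris dataset purely for illustration, checks the shape of `components_` and shows that `transform` amounts to multiplying the input by the transpose of the learned matrix.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import NeighborhoodComponentsAnalysis

X, y = load_iris(return_X_y=True)

nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=42)
nca.fit(X, y)

# components_ has one row per output dimension and one column per input feature.
print(nca.components_.shape)
print(nca.n_features_in_, nca.n_iter_)

# Applying the learned linear map by hand matches transform().
X_embedded = nca.transform(X)
X_manual = X @ nca.components_.T
print(np.allclose(X_embedded, X_manual))
```

Because the learned map is linear, the embedding of new samples needs only `components_`, so it is cheap to apply at prediction time.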

Usage Examples

Basic Usage

from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

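# Load the data and hold out a test split
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# NCA learns a 2-D projection; k-NN then classifies in the projected space
nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=42)
knn = KNeighborsClassifier(n_neighbors=3)

# Chaining both steps ensures NCA is fit on the training data only
pipe = Pipeline([("nca", nca), ("knn", knn)])
pipe.fit(X_train, y_train)
print(pipe.score(X_test, y_test))  # mean accuracy on the held-out split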
