Implementation:Scikit learn Scikit learn NeighborhoodComponentsAnalysis
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Metric Learning |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
A concrete tool for supervised metric learning via Neighborhood Components Analysis, as provided by scikit-learn.
Description
NeighborhoodComponentsAnalysis (NCA) is a supervised dimensionality reduction and metric learning algorithm. It learns a linear transformation that maximizes the expected leave-one-out classification accuracy of a stochastic nearest neighbors rule in the transformed space. The transformation is optimized with SciPy's L-BFGS-B solver, and the estimator supports several initialization strategies: PCA, LDA, identity, random projections, or a user-supplied array.
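To make the objective concrete, the following is a minimal NumPy sketch of the quantity described above: the expected number of points correctly classified by a stochastic nearest neighbor rule after projecting with a matrix `A`. The function name `nca_objective` and the toy data are illustrative assumptions; this is not scikit-learn's internal implementation, which also computes the gradient for the solver.

```python
import numpy as np


def nca_objective(A, X, y):
    """Expected number of correctly classified points under a stochastic 1-NN rule.

    Hypothetical helper for illustration only (not part of scikit-learn).
    A has shape (n_components, n_features).
    """
    Z = X @ A.T                                   # project the data
    diff = Z[:, None, :] - Z[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)               # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                  # a point never selects itself
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)   # stabilize the softmax
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)             # p[i, j]: prob. that i picks j
    same_class = y[:, None] == y[None, :]
    return float(np.sum(p * same_class))


# Toy data (assumed for illustration): two well-separated classes.
X_toy = np.array([[0.0, 0.0], [0.0, 0.1], [10.0, 10.0], [10.0, 10.1]])
y_toy = np.array([0, 0, 1, 1])

score_identity = nca_objective(np.eye(2), X_toy, y_toy)
score_collapsed = nca_objective(np.zeros((1, 2)), X_toy, y_toy)
print(score_identity, score_collapsed)
```

With the identity projection the classes stay separated, so the objective approaches the number of samples. A projection that collapses all points to one location makes every neighbor equally likely, and the objective drops accordingly.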
Usage
Use NeighborhoodComponentsAnalysis when you want to learn a distance metric or projection that improves nearest neighbor classification performance. It is particularly useful as a preprocessing step before applying k-nearest neighbors classifiers.
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/neighbors/_nca.py
Signature
class NeighborhoodComponentsAnalysis(
    ClassNamePrefixFeaturesOutMixin, TransformerMixin, BaseEstimator
):
    def __init__(
        self,
        n_components=None,
        *,
        init="auto",
        warm_start=False,
        max_iter=50,
        tol=1e-5,
        callback=None,
        verbose=0,
        random_state=None,
    ):
Import
from sklearn.neighbors import NeighborhoodComponentsAnalysis
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| n_components | int or None | No | Preferred dimensionality of the projected space (default=None, uses n_features) |
| init | str or ndarray | No | Initialization method: 'auto', 'pca', 'lda', 'identity', 'random', or ndarray (default='auto') |
| warm_start | bool | No | Whether to reuse solution of previous fit as initialization (default=False) |
| max_iter | int | No | Maximum number of iterations in the optimization (default=50) |
| tol | float | No | Convergence tolerance for the optimization (default=1e-5) |
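| callback | callable or None | No | Called after each iteration of the optimizer with the current solution (the flattened transformation matrix) and the iteration count (default=None) |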
| verbose | int | No | Verbosity level (default=0) |
| random_state | int, RandomState, or None | No | Random state for reproducibility of the randomized initializations (default=None) |
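As an illustration of how several of these parameters combine, here is a short sketch that sets `init`, `max_iter`, and `callback` explicitly. The `record` function and the `history` list are hypothetical names introduced for this example; the sketch assumes the callback receives the flattened transformation and the iteration number, as described in the table above.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import NeighborhoodComponentsAnalysis

X, y = load_iris(return_X_y=True)

history = []  # hypothetical container for per-iteration records


def record(flat_transformation, n_iter):
    # Store the iteration number and the size of the flattened matrix.
    history.append((n_iter, flat_transformation.size))


nca = NeighborhoodComponentsAnalysis(
    n_components=2,
    init="pca",        # start from the PCA projection
    max_iter=20,       # cap the number of optimizer iterations
    callback=record,
    random_state=0,
)
nca.fit(X, y)

print(len(history), "callback calls recorded")
```

Because the callback sees the flattened matrix, each recorded size here should equal `n_components * n_features`. A low `max_iter` may trigger a convergence warning, which is expected when the cap is reached before `tol` is satisfied.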
Outputs
| Name | Type | Description |
|---|---|---|
| components_ | ndarray of shape (n_components, n_features) | The learned linear transformation |
| n_features_in_ | int | Number of features seen during fit |
| n_iter_ | int | Number of iterations run by the optimizer |
| random_state_ | RandomState | Pseudo random number generator object used during fitting |
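The fitted attributes can be inspected directly after `fit`. The sketch below, using the iris data as an assumed example, prints them and checks that `transform` is equivalent to multiplying the input by the transpose of `components_`.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import NeighborhoodComponentsAnalysis

X, y = load_iris(return_X_y=True)
nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=0).fit(X, y)

print(nca.components_.shape)   # (n_components, n_features)
print(nca.n_features_in_)      # number of input features seen during fit
print(nca.n_iter_)             # iterations actually run by the optimizer

# transform applies the learned linear map to each sample.
X_embedded = nca.transform(X)
manual = X @ nca.components_.T
print(np.allclose(X_embedded, manual))
```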
Usage Examples
Basic Usage
from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Load the data and hold out a test split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Learn a 2-D projection with NCA, then classify with k-nearest neighbors.
nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=42)
knn = KNeighborsClassifier(n_neighbors=3)
pipe = Pipeline([("nca", nca), ("knn", knn)])
pipe.fit(X_train, y_train)

# Mean accuracy on the held-out test set.
print(pipe.score(X_test, y_test))
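To judge whether the learned projection actually helps on a given dataset, one option is to compare the pipeline against a plain k-nearest neighbors baseline under cross-validation. The sketch below is one such comparison on iris; the choice of 5 folds and the variable names are assumptions for illustration, and the resulting scores depend on the data and scikit-learn version.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)

# Baseline: k-NN on the raw features.
baseline = KNeighborsClassifier(n_neighbors=3)

# Candidate: NCA projection followed by the same k-NN classifier.
with_nca = Pipeline([
    ("nca", NeighborhoodComponentsAnalysis(n_components=2, random_state=42)),
    ("knn", KNeighborsClassifier(n_neighbors=3)),
])

baseline_scores = cross_val_score(baseline, X, y, cv=5)
nca_scores = cross_val_score(with_nca, X, y, cv=5)

print("k-NN only :", baseline_scores.mean())
print("NCA + k-NN:", nca_scores.mean())
```

Fitting NCA inside the pipeline keeps the projection from seeing the validation fold, so the comparison stays fair.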