Implementation:Scikit learn Scikit learn PairwiseDistancesReduction
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Distance Computation |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool for optimized pairwise distance computation and reduction dispatching provided by scikit-learn.
Description
The pairwise distances reduction dispatcher module provides high-performance implementations for computing pairwise distances with simultaneous reduction operations. It includes the BaseDistancesReductionDispatcher abstract base class and concrete dispatchers: ArgKmin (finds k nearest neighbors), RadiusNeighbors (finds neighbors within a radius), ArgKminClassMode (k-nearest neighbors with class-based voting), and RadiusNeighborsClassMode (radius neighbors with class-based voting). These dispatchers automatically select between float32 and float64 implementations and validate whether optimized computation paths are available.
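To make "pairwise distances with simultaneous reduction" concrete, the following is a minimal NumPy sketch of the idea, not the library's Cython implementation. The function name `argkmin_chunked` and its default chunk size are hypothetical. It processes X in chunks and keeps only the k smallest distances per row, so the full (n_samples_X, n_samples_Y) distance matrix is never stored.

```python
import numpy as np

def argkmin_chunked(X, Y, k, chunk_size=256):
    """Hypothetical reference: distances and indices of the k nearest rows of Y for each row of X."""
    n_X = X.shape[0]
    indices = np.empty((n_X, k), dtype=np.intp)
    distances = np.empty((n_X, k), dtype=X.dtype)
    for start in range(0, n_X, chunk_size):
        stop = min(start + chunk_size, n_X)
        # Euclidean distances for one chunk of X against all of Y: shape (chunk, n_Y)
        diff = X[start:stop, None, :] - Y[None, :, :]
        dist = np.sqrt(np.einsum("ijk,ijk->ij", diff, diff))
        # Reduce immediately: keep only the k smallest per row, in ascending order
        idx = np.argsort(dist, axis=1, kind="stable")[:, :k]
        indices[start:stop] = idx
        distances[start:stop] = np.take_along_axis(dist, idx, axis=1)
    return distances, indices

rng = np.random.default_rng(0)
X = rng.random((100, 10))
Y = rng.random((50, 10))
distances, indices = argkmin_chunked(X, Y, k=5, chunk_size=32)
print(distances.shape, indices.shape)  # (100, 5) (100, 5)
```

Peak memory in this sketch scales with the chunk size rather than with n_samples_X, which is the main motivation for fusing the reduction into the distance computation.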
Usage
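These dispatchers are used internally by scikit-learn's nearest neighbor algorithms for efficient distance computation and reduction. They are the low-level building blocks that power KNeighborsClassifier, KNeighborsRegressor, RadiusNeighborsClassifier, and similar estimators. The module name starts with an underscore, so it is private and may change between releases; application code should normally go through the public estimators.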
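For comparison, here is a short sketch of the same kind of query through the public API. It assumes only `sklearn.neighbors.NearestNeighbors`. Whether a given call actually reaches the optimized dispatcher depends on the scikit-learn version, the metric, and the input types, so treat that routing as an assumption.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
Y = rng.random((50, 10))   # indexed ("training") samples
X = rng.random((100, 10))  # query samples

# algorithm="brute" requests brute-force search, the setting in which
# a pairwise-distance reduction back end can be considered.
nn = NearestNeighbors(n_neighbors=5, algorithm="brute", metric="euclidean").fit(Y)
distances, indices = nn.kneighbors(X)
print(distances.shape, indices.shape)  # (100, 5) (100, 5)
```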
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/metrics/_pairwise_distances_reduction/_dispatcher.py
Signature
def sqeuclidean_row_norms(X, num_threads)

class BaseDistancesReductionDispatcher:
    @classmethod
    def valid_metrics(cls) -> List[str]

    @classmethod
    def is_usable_for(cls, X, Y, metric) -> bool

    @classmethod
    def compute(cls, X, Y, ...)

class ArgKmin(BaseDistancesReductionDispatcher):
    @classmethod
    def compute(cls, X, Y, k, metric="euclidean", chunk_size=None,
                metric_kwargs=None, strategy=None, return_distance=False)

class RadiusNeighbors(BaseDistancesReductionDispatcher):
    @classmethod
    def compute(cls, X, Y, radius, metric="euclidean", chunk_size=None,
                metric_kwargs=None, strategy=None, return_distance=False,
                sort_results=False)

class ArgKminClassMode(BaseDistancesReductionDispatcher):
    @classmethod
    def compute(cls, X, Y, k, weights, class_membership, unique_labels, metric="euclidean",
                chunk_size=None, metric_kwargs=None, strategy=None)

class RadiusNeighborsClassMode(BaseDistancesReductionDispatcher):
    @classmethod
    def compute(cls, X, Y, radius, weights, class_membership, unique_labels,
                metric="euclidean", chunk_size=None, metric_kwargs=None,
                strategy=None, outlier_label=None)
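The name `sqeuclidean_row_norms` suggests a helper that returns the squared Euclidean norm of each row. A common reason to precompute such norms is the expansion ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2, which turns the pairwise computation into a matrix product. The NumPy sketch below illustrates that identity. The helper `row_sq_norms` is a hypothetical stand-in, not the library function, and it ignores the `num_threads` argument.

```python
import numpy as np

def row_sq_norms(A):
    """Hypothetical stand-in: squared Euclidean norm of each row of A."""
    return np.einsum("ij,ij->i", A, A)

rng = np.random.default_rng(0)
X = rng.random((6, 4))
Y = rng.random((3, 4))

# ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2, evaluated for all pairs at once
sq_dist = row_sq_norms(X)[:, None] - 2.0 * (X @ Y.T) + row_sq_norms(Y)[None, :]
# Rounding can leave tiny negative values, so clip before taking any square root
sq_dist = np.maximum(sq_dist, 0.0)

# Direct evaluation for comparison
direct = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
print(np.allclose(sq_dist, direct))  # True
```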
Import
from sklearn.metrics._pairwise_distances_reduction import ArgKmin
from sklearn.metrics._pairwise_distances_reduction import RadiusNeighbors
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| X | ndarray or CSR matrix of shape (n_samples_X, n_features) | Yes | First input data array (must be C-contiguous or CSR sparse) |
| Y | ndarray or CSR matrix of shape (n_samples_Y, n_features) | Yes | Second input data array |
| k | int | Yes (ArgKmin, ArgKminClassMode) | Number of nearest neighbors to find |
| radius | float | Yes (RadiusNeighbors, RadiusNeighborsClassMode) | Radius for neighbor search |
| metric | str | No | Distance metric (default euclidean) |
| chunk_size | int | No | Chunk size for parallel computation |
| strategy | str | No | Computation strategy (auto, parallel_on_X, parallel_on_Y) |
| weights | str | Yes (ClassMode) | Weight function for class voting |
| class_membership | ndarray | Yes (ClassMode) | Class labels for Y samples |
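The table asks for C-contiguous dense inputs, and the dispatchers choose between float32 and float64 code paths. The sketch below shows one way to normalize dense inputs with NumPy before handing them to such a routine. The helper `prepare_pair` is hypothetical, and the requirement that X and Y share a single dtype is an assumption about the dispatcher's checks, not something stated in the table.

```python
import numpy as np

def prepare_pair(X, Y, dtype=np.float64):
    """Hypothetical helper: return C-contiguous versions of X and Y in one float dtype."""
    X = np.ascontiguousarray(X, dtype=dtype)
    Y = np.ascontiguousarray(Y, dtype=dtype)
    if X.shape[1] != Y.shape[1]:
        raise ValueError("X and Y must have the same number of features")
    return X, Y

rng = np.random.default_rng(0)
X_raw = np.asfortranarray(rng.random((100, 10)))  # column-major, not C-contiguous
Y_raw = rng.random((50, 10)).astype(np.float32)   # different dtype from X_raw

X, Y = prepare_pair(X_raw, Y_raw)
print(X_raw.flags["C_CONTIGUOUS"], X.flags["C_CONTIGUOUS"])  # False True
print(X.dtype, Y.dtype)  # float64 float64
```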
Outputs
| Name | Type | Description |
|---|---|---|
| argkmin_indices | ndarray of shape (n_samples_X, k) | Indices of k nearest neighbors in Y |
| argkmin_distances | ndarray of shape (n_samples_X, k) | Distances to k nearest neighbors (if return_distance=True) |
| radius_neighbors | list of ndarrays | Indices of neighbors within radius for each sample |
| class_predictions | ndarray | Predicted class labels for classification modes |
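The output shapes above can be illustrated with a brute-force NumPy sketch. It is not the library's code. It assumes uniform vote weights, three classes, and ties broken toward the lowest label, and the variable names simply mirror the table.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((4, 2))                   # query samples
Y = rng.random((30, 2))                  # indexed samples
y_labels = rng.integers(0, 3, size=30)   # one class label per row of Y

dist = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)  # shape (4, 30)

# Radius reduction: a variable number of neighbors per query, hence a ragged result
radius = 0.3
radius_neighbors = [np.flatnonzero(row <= radius) for row in dist]

# k-nearest reduction followed by a uniform-weight majority vote
k = 5
knn = np.argsort(dist, axis=1)[:, :k]    # shape (4, 5)
votes = np.stack([np.bincount(y_labels[row], minlength=3) for row in knn])
class_predictions = votes.argmax(axis=1)  # shape (4,)

print([len(n) for n in radius_neighbors])
print(class_predictions.shape)  # (4,)
```

The radius result is ragged because each query can have a different number of neighbors, which is why it cannot be returned as a single rectangular array.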
Usage Examples
Basic Usage
import numpy as np
from sklearn.metrics._pairwise_distances_reduction import ArgKmin

# Dense inputs: 100 query samples and 50 indexed samples with 10 features each
X = np.random.rand(100, 10).astype(np.float64)
Y = np.random.rand(50, 10).astype(np.float64)

# Check if the optimized path is usable
if ArgKmin.is_usable_for(X, Y, metric="euclidean"):
    # Compute 5 nearest neighbors; with return_distance=True the result
    # is the pair (distances, indices), each of shape (100, 5)
    distances, indices = ArgKmin.compute(X, Y, k=5, metric="euclidean", return_distance=True)

# List supported metrics
supported = ArgKmin.valid_metrics()
print(f"Supported metrics: {supported}")