

Implementation:Scikit learn Scikit learn PairwiseMetrics

From Leeroopedia


Knowledge Sources
Domains: Machine Learning, Distance Computation
Last Updated: 2026-02-08 15:00 GMT

Overview

Concrete tool, provided by scikit-learn, for computing pairwise distances and kernel values between sets of samples.

Description

The pairwise metrics module provides implementations for computing distances and kernel values between pairs of samples. It includes distance functions (Euclidean, Manhattan, cosine, Haversine, NaN-aware Euclidean), kernel functions (linear, polynomial, RBF, sigmoid, Laplacian, chi-squared, additive chi-squared), and general-purpose functions for dispatching by metric name and for chunked distance computation. Most functions accept both dense and sparse input matrices, and the general-purpose functions can parallelize work across jobs through the n_jobs parameter.
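As a minimal sketch of the distance functions described above (the array values are illustrative and not taken from the source), the following compares dense and sparse input and shows how the NaN-aware Euclidean distance rescales over the coordinates that are present:

```python
import numpy as np
from scipy import sparse
from sklearn.metrics.pairwise import (
    euclidean_distances,
    manhattan_distances,
    nan_euclidean_distances,
)

# Illustrative data: two points forming a 3-4-5 triangle with the origin.
X = np.array([[0.0, 3.0], [4.0, 0.0]])

# Dense and sparse (CSR) inputs should give the same Euclidean distances.
dense_dist = euclidean_distances(X)
sparse_dist = euclidean_distances(sparse.csr_matrix(X))

# Manhattan (L1) distance between the same points.
l1_dist = manhattan_distances(X)

# NaN-aware Euclidean: only coordinates present in both rows are used,
# and the squared distance is scaled by (total coords / present coords).
X_missing = np.array([[3.0, np.nan], [0.0, 0.0]])
nan_dist = nan_euclidean_distances(X_missing)

print(dense_dist)
print(l1_dist)
print(nan_dist)
```

Here only the first coordinate is shared between the two rows of X_missing, so the squared difference of 9 is scaled by 2/1 before the square root is taken.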

Usage

Use this module to compute distance matrices for nearest-neighbor algorithms or kernel matrices for kernel-based methods (SVM, kernel PCA). For large datasets, pairwise_distances_chunked processes the distance matrix in chunks to bound memory usage.
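The chunked processing mentioned above can be sketched as follows. This is an illustration under assumptions, not a recommended configuration: the data is random, the helper name nearest_neighbor is hypothetical, and the working_memory value is chosen only so that several chunks are produced.

```python
import numpy as np
from sklearn.metrics.pairwise import pairwise_distances, pairwise_distances_chunked

rng = np.random.RandomState(0)
X = rng.rand(1000, 5)  # illustrative random data


def nearest_neighbor(dist_chunk, start):
    """Hypothetical reduce function: index of the nearest other sample.

    dist_chunk holds distances from rows X[start:start + len(dist_chunk)]
    to every row of X.
    """
    dist_chunk = dist_chunk.copy()
    rows = np.arange(dist_chunk.shape[0])
    # Mask each sample's zero distance to itself.
    dist_chunk[rows, start + rows] = np.inf
    return dist_chunk.argmin(axis=1)


# Each yielded item is the reduced result for one chunk of rows.
chunks = pairwise_distances_chunked(
    X, reduce_func=nearest_neighbor, working_memory=1
)
nn_chunked = np.concatenate(list(chunks))

# Reference computation that materializes the full 1000 x 1000 matrix.
full = pairwise_distances(X)
np.fill_diagonal(full, np.inf)

print(nn_chunked[:5])
```

Because reduce_func collapses each chunk to one value per row, the full distance matrix never needs to be held in memory at once; the reference computation is included here only for comparison.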

Code Reference

Source Location

Signature

# Distance functions
def euclidean_distances(X, Y=None, *, Y_norm_squared=None, squared=False, X_norm_squared=None)
def nan_euclidean_distances(X, Y=None, *, squared=False, missing_values=np.nan, copy=True)
def cosine_distances(X, Y=None)
def manhattan_distances(X, Y=None)
def haversine_distances(X, Y=None)
def paired_euclidean_distances(X, Y)
def paired_manhattan_distances(X, Y)
def paired_cosine_distances(X, Y)
def paired_distances(X, Y, *, metric="euclidean", **kwds)

# Kernel functions
def linear_kernel(X, Y=None, dense_output=True)
def polynomial_kernel(X, Y=None, degree=3, gamma=None, coef0=1)
def sigmoid_kernel(X, Y=None, gamma=None, coef0=1)
def rbf_kernel(X, Y=None, gamma=None)
def laplacian_kernel(X, Y=None, gamma=None)
def cosine_similarity(X, Y=None, dense_output=True)
def additive_chi2_kernel(X, Y=None)
def chi2_kernel(X, Y=None, gamma=1.0)

# General-purpose functions
def pairwise_distances(X, Y=None, metric="euclidean", *, n_jobs=None, force_all_finite=True, **kwds)
def pairwise_distances_chunked(X, Y=None, *, reduce_func=None, metric="euclidean", n_jobs=None, working_memory=None, **kwds)
def pairwise_distances_argmin(X, Y, *, axis=1, metric="euclidean", metric_kwargs=None)
def pairwise_distances_argmin_min(X, Y, *, axis=1, metric="euclidean", metric_kwargs=None)
def pairwise_kernels(X, Y=None, metric="linear", *, filter_params=False, n_jobs=None, **kwds)
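To illustrate how the general-purpose functions relate to the specific ones listed above, the sketch below dispatches by metric name and compares the result with the direct call. The input arrays are illustrative; extra keyword arguments such as degree and coef0 are forwarded to the underlying kernel.

```python
import numpy as np
from sklearn.metrics.pairwise import (
    manhattan_distances,
    paired_distances,
    pairwise_distances,
    pairwise_kernels,
    polynomial_kernel,
)

X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]])
Y = np.array([[1.0, 1.0], [0.0, 0.0]])

# Dispatch by name versus calling the specific distance function.
D_named = pairwise_distances(X, Y, metric="manhattan")
D_direct = manhattan_distances(X, Y)

# Same idea for kernels; keyword arguments are passed through.
K_named = pairwise_kernels(X, Y, metric="polynomial", degree=2, coef0=1)
K_direct = polynomial_kernel(X, Y, degree=2, coef0=1)

# paired_* functions compare row i of the first array with row i of the
# second, so both arrays need the same number of rows.
row_wise = paired_distances(X[:2], Y)

print(D_named)
print(K_named)
print(row_wise)
```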

Import

from sklearn.metrics.pairwise import euclidean_distances, cosine_similarity
from sklearn.metrics.pairwise import rbf_kernel, pairwise_distances
from sklearn.metrics.pairwise import pairwise_kernels, pairwise_distances_chunked

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| X | array-like or sparse matrix of shape (n_samples_X, n_features) | Yes | First input array of samples |
| Y | array-like or sparse matrix of shape (n_samples_Y, n_features) | No | Second input array (defaults to X if None) |
| metric | str or callable | No | Distance metric or kernel name (default "euclidean" for the distance functions, "linear" for pairwise_kernels) |
| n_jobs | int | No | Number of parallel jobs for computation |
| gamma | float | No | Kernel coefficient for the RBF, Laplacian, polynomial, sigmoid, and chi-squared kernels |
| degree | int | No | Degree of the polynomial kernel (default 3) |
| coef0 | float | No | Independent term in the polynomial and sigmoid kernels (default 1) |
| working_memory | float | No | Maximum memory (in MiB) for temporary distance-matrix chunks in chunked computation |
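As a small sketch of how gamma behaves for the RBF kernel: when gamma is left as None, scikit-learn falls back to 1 / n_features, and the kernel value is exp(-gamma * squared Euclidean distance). The input array below is illustrative.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances, rbf_kernel

X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]])
n_features = X.shape[1]

# gamma=None uses the default of 1 / n_features.
K_default = rbf_kernel(X)
K_explicit = rbf_kernel(X, gamma=1.0 / n_features)

# Manual computation of the same kernel from squared distances.
sq_dist = euclidean_distances(X, squared=True)
K_manual = np.exp(-(1.0 / n_features) * sq_dist)

print(K_default)
```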

Outputs

| Name | Type | Description |
|---|---|---|
| distances | ndarray of shape (n_samples_X, n_samples_Y) | Pairwise distance matrix |
| kernel_matrix | ndarray of shape (n_samples_X, n_samples_Y) | Pairwise kernel matrix |
| argmin | ndarray of shape (n_samples_X,) | Indices of nearest samples in Y for each sample in X (with the default axis=1) |
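The output shapes above can be checked with a short sketch. The data is illustrative and chosen so that no two candidates are equally close; with axis=0 the roles are swapped and the result has one entry per row of Y.

```python
import numpy as np
from sklearn.metrics.pairwise import (
    pairwise_distances,
    pairwise_distances_argmin_min,
)

X = np.array([[0.0, 0.0], [3.0, 3.0], [10.0, 0.0]])
Y = np.array([[0.0, 1.0], [9.0, 0.0]])

# Full distance matrix: one row per sample in X, one column per sample in Y.
D = pairwise_distances(X, Y)

# For each row of X: index of the nearest row of Y and the distance to it.
argmin, min_dist = pairwise_distances_argmin_min(X, Y)

# With axis=0: for each row of Y, index of the nearest row of X.
argmin_axis0, _ = pairwise_distances_argmin_min(X, Y, axis=0)

print(D.shape, argmin, min_dist, argmin_axis0)
```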

Usage Examples

Basic Usage

import numpy as np
from sklearn.metrics.pairwise import euclidean_distances, rbf_kernel, cosine_similarity

X = np.array([[0, 1], [1, 0], [2, 2]])
Y = np.array([[1, 1], [0, 0]])

# Compute Euclidean distance matrix
dist = euclidean_distances(X, Y)
print("Euclidean distances:\n", dist)

# Compute RBF kernel matrix
K = rbf_kernel(X, Y, gamma=0.5)
print("RBF kernel:\n", K)

# Compute cosine similarity
sim = cosine_similarity(X, Y)
print("Cosine similarity:\n", sim)
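As a further sketch alongside the basic usage, the following covers two functions from the signature list that are easy to misuse: cosine_distances, which is one minus the cosine similarity, and haversine_distances, which expects [latitude, longitude] pairs in radians and returns angular distances. The coordinates and the Earth radius constant are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import (
    cosine_distances,
    cosine_similarity,
    haversine_distances,
)

X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]])

# Cosine distance is defined as 1 - cosine similarity.
cos_sim = cosine_similarity(X)
cos_dist = cosine_distances(X)

# Haversine: rows are [latitude, longitude]; convert degrees to radians.
# Two illustrative points on the equator, 90 degrees of longitude apart.
points_deg = np.array([[0.0, 0.0], [0.0, 90.0]])
angular = haversine_distances(np.radians(points_deg))

# Assumed mean Earth radius, used only to convert radians to kilometers.
EARTH_RADIUS_KM = 6371.0
km = angular * EARTH_RADIUS_KM

print(cos_dist)
print(km)
```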
