Implementation:Scikit learn Scikit learn SpectralClustering
| Knowledge Sources | |
|---|---|
| Domains | Clustering, Graph-Based Clustering |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool, provided by scikit-learn, for performing spectral clustering: clustering applied to a projection of the normalized graph Laplacian.
Description
SpectralClustering applies clustering to a projection of the normalized graph Laplacian. It constructs an affinity matrix (from a kernel function such as RBF, or from a k-nearest-neighbors connectivity graph), computes a spectral embedding from the graph Laplacian of that matrix, and then assigns labels in the embedding space using K-Means, discretization, or QR-based clustering. It is particularly effective when the cluster structure is highly non-convex, such as nested circles or complex manifold shapes, where centroid-based methods fail.
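The three steps described above can be sketched by hand with public scikit-learn building blocks. This is an illustrative approximation of the pipeline, not the estimator's internal code; the dataset (`make_circles`), the value `gamma=20.0`, and the variable names are assumptions chosen for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_circles
from sklearn.manifold import spectral_embedding
from sklearn.metrics.pairwise import rbf_kernel

# Toy data (assumption): two nested circles, a classic non-convex case.
X, y_true = make_circles(n_samples=200, factor=0.4, noise=0.03, random_state=0)

# Step 1: build a dense affinity matrix with an RBF kernel.
# gamma=20.0 is an illustrative choice, not a library default.
affinity = rbf_kernel(X, gamma=20.0)

# Step 2: embed the samples using eigenvectors of the normalized graph Laplacian.
embedding = spectral_embedding(
    affinity, n_components=2, drop_first=False, random_state=0
)

# Step 3: assign labels by running K-Means in the embedding space.
manual_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)

print(embedding.shape, np.unique(manual_labels))
```

In practice the SpectralClustering estimator wraps these steps behind a single `fit` call, so a manual version like this is mainly useful for inspecting the intermediate embedding.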
Usage
Use SpectralClustering when clusters have non-convex shapes, when data lies on complex manifolds, or when working with graph-structured data where you want to find normalized graph cuts. It is also useful when you already have a precomputed affinity or adjacency matrix. Note that it does not scale well to very large datasets: it builds an affinity matrix over all pairs of samples (dense for kernel affinities such as "rbf") and then solves an eigenvalue problem on it.
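As a hedged illustration of the precomputed case, the sketch below clusters a small hand-built graph. The adjacency matrix (two 4-node cliques joined by one weak edge) and the edge weight 0.1 are made up for the example.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Hypothetical graph: nodes 0-3 and nodes 4-7 form two fully connected groups.
A = np.zeros((8, 8))
A[:4, :4] = 1.0
A[4:, 4:] = 1.0
np.fill_diagonal(A, 0.0)

# A single weak edge links the two groups.
A[3, 4] = A[4, 3] = 0.1

# With affinity="precomputed", the input to fit is the (n_samples, n_samples)
# affinity matrix itself rather than a feature matrix.
model = SpectralClustering(
    n_clusters=2, affinity="precomputed", assign_labels="kmeans", random_state=0
)
graph_labels = model.fit_predict(A)

print(graph_labels)
```

The matrix passed this way should be symmetric with non-negative entries, where larger values mean stronger similarity.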
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/cluster/_spectral.py
Signature
class SpectralClustering(ClusterMixin, BaseEstimator):
    def __init__(
        self,
        n_clusters=8,
        *,
        eigen_solver=None,
        n_components=None,
        random_state=None,
        n_init=10,
        gamma=1.0,
        affinity="rbf",
        n_neighbors=10,
        eigen_tol="auto",
        assign_labels="kmeans",
        degree=3,
        coef0=1,
        kernel_params=None,
        n_jobs=None,
        verbose=False,
    ):
Import
from sklearn.cluster import SpectralClustering
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| n_clusters | int | No | Number of clusters to form; also the default dimension of the projection subspace. Default is 8. |
| eigen_solver | str or None | No | Eigenvalue solver: "arpack", "lobpcg", or "amg" ("amg" requires pyamg). Default is None, which selects "arpack". |
| n_components | int or None | No | Number of eigenvectors for spectral embedding. Default is None (uses n_clusters). |
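| random_state | int, RandomState instance, or None | No | Seed or generator for K-Means initialization and for the lobpcg eigenvector initialization used when eigen_solver="amg". Default is None. |
| n_init | int | No | Number of K-Means initializations for label assignment. Used only when assign_labels="kmeans". Default is 10. |
| gamma | float | No | Kernel coefficient for rbf, poly, sigmoid, laplacian, and chi2 kernels. Ignored for "nearest_neighbors" and precomputed affinities. Default is 1.0. |
| affinity | str or callable | No | Affinity construction method: "rbf", "nearest_neighbors", "precomputed", "precomputed_nearest_neighbors", or a kernel name supported by sklearn.metrics.pairwise.pairwise_kernels. Default is "rbf". |
| n_neighbors | int | No | Number of neighbors used to build the affinity matrix when affinity="nearest_neighbors". Ignored for affinity="rbf". Default is 10. |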
| eigen_tol | float or "auto" | No | Stopping criterion for eigenvalue decomposition. Default is "auto". |
| assign_labels | str | No | Strategy for assigning labels: "kmeans", "discretize", or "cluster_qr". Default is "kmeans". |
| degree | float | No | Degree of the polynomial kernel. Default is 3. |
| coef0 | float | No | Zero coefficient for polynomial and sigmoid kernels. Default is 1. |
| kernel_params | dict or None | No | Parameters for custom kernel functions. Default is None. |
| n_jobs | int or None | No | Number of parallel jobs for the neighbors search when affinity is "nearest_neighbors" or "precomputed_nearest_neighbors". Default is None. |
| verbose | bool | No | Verbosity mode. Default is False. |
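To show how a few of these parameters combine, the following sketch uses affinity="nearest_neighbors" on nested circles and compares the result with plain K-Means using the adjusted Rand index. The dataset settings and variable names are assumptions for illustration only.

```python
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.datasets import make_circles
from sklearn.metrics import adjusted_rand_score

# Toy data (assumption): two concentric circles with a little noise.
X, y_true = make_circles(n_samples=300, factor=0.4, noise=0.03, random_state=0)

# Build the affinity from a 10-nearest-neighbors graph instead of an RBF kernel.
# scikit-learn may warn that the graph is not fully connected, since neighbors
# here tend to lie on the same circle.
spectral = SpectralClustering(
    n_clusters=2,
    affinity="nearest_neighbors",
    n_neighbors=10,
    assign_labels="kmeans",
    random_state=0,
)
spectral_labels = spectral.fit_predict(X)

# Centroid-based baseline on the raw coordinates.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Adjusted Rand index: 1.0 means perfect agreement with the true grouping.
spectral_ari = adjusted_rand_score(y_true, spectral_labels)
kmeans_ari = adjusted_rand_score(y_true, kmeans_labels)
print(f"spectral ARI: {spectral_ari:.2f}, k-means ARI: {kmeans_ari:.2f}")
```

A neighbor-graph affinity avoids tuning gamma, at the cost of choosing n_neighbors; which one works better depends on the data.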
Outputs
| Name | Type | Description |
|---|---|---|
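| affinity_matrix_ | array-like of shape (n_samples, n_samples) | Affinity matrix used for clustering (a dense ndarray for kernel affinities, a sparse matrix for "nearest_neighbors"). Available only after calling fit. |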
| labels_ | ndarray of shape (n_samples,) | Cluster labels for each sample. |
| n_features_in_ | int | Number of features seen during fit. |
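The fitted attributes listed above can be inspected as in the short sketch below. The data matches the basic example on this page; the variable names are illustrative.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]], dtype=float)

model = SpectralClustering(n_clusters=2, random_state=0).fit(X)

# One label per sample.
print(model.labels_.shape)

# Pairwise affinities between all samples (dense for the default "rbf" kernel).
print(model.affinity_matrix_.shape)

# Number of columns in X at fit time.
print(model.n_features_in_)

# SpectralClustering has no predict method for unseen samples;
# fit_predict refits the model and returns the labels of the data it was given.
labels_again = model.fit_predict(X)
```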
Usage Examples
Basic Usage
from sklearn.cluster import SpectralClustering
import numpy as np

X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]])
clustering = SpectralClustering(
    n_clusters=2, assign_labels="discretize", random_state=0
).fit(X)
print(clustering.labels_)