Implementation:Scikit learn Scikit learn SpectralBiclustering
| Knowledge Sources | |
|---|---|
| Domains | Biclustering, Unsupervised Learning |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
Concrete tool, provided by scikit-learn, for performing spectral biclustering on data matrices.
Description
SpectralBiclustering partitions the rows and columns of a data matrix simultaneously, under the assumption that the data has an underlying checkerboard structure. It normalizes the matrix, computes a singular value decomposition (SVD) of the result, ranks the first n_components singular vectors by how well each is approximated by a piecewise-constant vector, and applies k-means to the data projected onto the n_best of them. Three normalization methods are supported: bistochastic, scale, and log (Kluger et al., 2003).
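The pipeline described above (normalize, decompose, cluster) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not scikit-learn's implementation: it assumes scale normalization on strictly positive data, it skips the piecewise-constant ranking of singular vectors and simply takes the vectors that follow the first one, and the helper names `scale_normalize` and `sketch_spectral_bicluster` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_checkerboard


def scale_normalize(X):
    """Rescale rows and columns by the inverse square root of their sums."""
    row_scale = 1.0 / np.sqrt(X.sum(axis=1))
    col_scale = 1.0 / np.sqrt(X.sum(axis=0))
    return row_scale[:, np.newaxis] * X * col_scale[np.newaxis, :]


def sketch_spectral_bicluster(X, n_row_clusters, n_col_clusters, n_vectors=2, seed=0):
    """Simplified sketch: normalize, take an SVD, run k-means on singular vectors."""
    normalized = scale_normalize(X)
    U, s, Vt = np.linalg.svd(normalized, full_matrices=False)

    # After scale normalization the first singular pair carries no partition
    # information, so it is discarded (assumption: keep the next n_vectors).
    keep = slice(1, 1 + n_vectors)
    row_features = U[:, keep] * s[keep]
    col_features = Vt[keep].T * s[keep]

    row_labels = KMeans(
        n_clusters=n_row_clusters, n_init=10, random_state=seed
    ).fit_predict(row_features)
    col_labels = KMeans(
        n_clusters=n_col_clusters, n_init=10, random_state=seed
    ).fit_predict(col_features)
    return row_labels, col_labels


# Small positive checkerboard so that the row and column sums are well defined.
data, _, _ = make_checkerboard(shape=(60, 40), n_clusters=(3, 2), noise=1, random_state=0)
row_labels, col_labels = sketch_spectral_bicluster(data, 3, 2)
print(row_labels.shape, col_labels.shape)
```

The sketch clusters rows and columns separately on their own singular vectors; the estimator itself additionally selects which vectors to use, which matters on noisy data.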
Usage
Use SpectralBiclustering when your data matrix has an inherent checkerboard structure, such as gene expression data where groups of genes respond similarly across groups of conditions. It is suitable when you want to simultaneously cluster both rows and columns rather than just one dimension.
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/cluster/_bicluster.py
Signature
class SpectralBiclustering(BaseSpectral):
    def __init__(
        self,
        n_clusters=3,
        *,
        method="bistochastic",
        n_components=6,
        n_best=3,
        svd_method="randomized",
        n_svd_vecs=None,
        mini_batch=False,
        init="k-means++",
        n_init=10,
        random_state=None,
    ):
Import
from sklearn.cluster import SpectralBiclustering
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| n_clusters | int or tuple (n_row_clusters, n_column_clusters) | No | Number of row and column clusters in the checkerboard structure. A single int is used for both rows and columns. Default is 3. |
| method | str | No | Normalization method: "bistochastic", "scale", or "log". Default is "bistochastic". |
| n_components | int | No | Number of singular vectors to check. Default is 6. |
| n_best | int | No | Number of best singular vectors to use for clustering. Default is 3. |
| svd_method | str | No | SVD method: "randomized" or "arpack". Default is "randomized". |
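| n_svd_vecs | int or None | No | Number of vectors to use when computing the SVD. Corresponds to ncv when svd_method is "arpack" and to n_oversamples when svd_method is "randomized". Default is None. |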
| mini_batch | bool | No | Whether to use MiniBatchKMeans instead of KMeans. Default is False. |
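| init | str or ndarray of shape (n_clusters, n_features) | No | KMeans initialization method: "k-means++", "random", or an array of initial centers. Default is "k-means++". |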
| n_init | int | No | Number of KMeans initializations. Default is 10. |
| random_state | int or RandomState | No | Random state for reproducibility. Default is None. |
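
The parameters in the table can be combined freely. The following is a small sketch, assuming a synthetic checkerboard from `make_checkerboard` with strictly positive values, that sets a different cluster count for rows and columns, switches to scale normalization, uses the ARPACK solver, and clusters with MiniBatchKMeans. The specific values are illustrative choices, not recommendations.

```python
from sklearn.cluster import SpectralBiclustering
from sklearn.datasets import make_checkerboard

# Synthetic data with 4 row clusters and 3 column clusters.
data, _, _ = make_checkerboard(
    shape=(120, 90), n_clusters=(4, 3), noise=1, random_state=0
)

model = SpectralBiclustering(
    n_clusters=(4, 3),    # (n_row_clusters, n_column_clusters)
    method="scale",       # alternative to the default "bistochastic"
    n_components=6,       # singular vectors to check
    n_best=3,             # singular vectors kept for clustering
    svd_method="arpack",  # alternative to the default "randomized"
    mini_batch=True,      # use MiniBatchKMeans instead of KMeans
    random_state=0,
)
model.fit(data)

print(model.row_labels_.shape, model.column_labels_.shape)
```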
Outputs
| Name | Type | Description |
|---|---|---|
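| rows_ | ndarray of shape (n_biclusters, n_rows) | Boolean array indicating row membership in each bicluster. There is one bicluster per (row cluster, column cluster) pair, so n_biclusters = n_row_clusters * n_column_clusters. |
| columns_ | ndarray of shape (n_biclusters, n_columns) | Boolean array indicating column membership in each bicluster, in the same bicluster order as rows_. |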
| row_labels_ | ndarray of shape (n_rows,) | Row cluster labels. |
| column_labels_ | ndarray of shape (n_columns,) | Column cluster labels. |
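
The fitted attributes can be inspected directly, and the estimator also inherits the bicluster helpers `get_indices` and `get_submatrix`. The sketch below, on an assumed small synthetic checkerboard, prints the shapes of the indicator arrays and extracts the first bicluster as a submatrix. In the scikit-learn implementation as we understand it, the indicator arrays hold one row per (row cluster, column cluster) pair.

```python
from sklearn.cluster import SpectralBiclustering
from sklearn.datasets import make_checkerboard

data, _, _ = make_checkerboard(
    shape=(60, 40), n_clusters=(4, 3), noise=1, random_state=0
)
model = SpectralBiclustering(n_clusters=(4, 3), random_state=0).fit(data)

# Boolean indicator arrays, one row per bicluster.
print(model.rows_.shape, model.columns_.shape)

# Per-sample and per-feature cluster labels.
print(model.row_labels_.shape, model.column_labels_.shape)

# Row and column indices of bicluster 0, and the matching block of the data.
row_idx, col_idx = model.get_indices(0)
block = model.get_submatrix(0, data)
print(block.shape)
```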
Usage Examples
Basic Usage
from sklearn.cluster import SpectralBiclustering
from sklearn.datasets import make_checkerboard

# Generate a noisy checkerboard with 4 row clusters and 3 column clusters.
data, rows, columns = make_checkerboard(
    shape=(300, 300), n_clusters=(4, 3), noise=10, random_state=0
)

# Match the cluster counts used to generate the data.
model = SpectralBiclustering(n_clusters=(4, 3), random_state=0)
model.fit(data)

# Cluster label of each row and of each column.
print(model.row_labels_)
print(model.column_labels_)
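Two common follow-up steps are to score the result against the known biclusters and to reorder the matrix so that the checkerboard becomes visible. The sketch below assumes the same synthetic setup and uses `sklearn.metrics.consensus_score`; the variable names `score`, `row_order`, `col_order`, and `reordered` are illustrative.

```python
import numpy as np
from sklearn.cluster import SpectralBiclustering
from sklearn.datasets import make_checkerboard
from sklearn.metrics import consensus_score

data, rows, columns = make_checkerboard(
    shape=(300, 300), n_clusters=(4, 3), noise=10, random_state=0
)
model = SpectralBiclustering(n_clusters=(4, 3), random_state=0).fit(data)

# Similarity between found and true biclusters, between 0 and 1 (1 is a perfect match).
score = consensus_score(model.biclusters_, (rows, columns))
print(f"consensus score: {score:.2f}")

# Sort rows and columns by cluster label so that each block is contiguous.
row_order = np.argsort(model.row_labels_)
col_order = np.argsort(model.column_labels_)
reordered = data[row_order][:, col_order]
print(reordered.shape)
```

Plotting `reordered` as an image (for example with matplotlib's `matshow`) is one way to check visually whether the recovered blocks look uniform.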