
Implementation:Scikit learn Scikit learn RandomProjection

From Leeroopedia


Knowledge Sources
Domains: Dimensionality Reduction, Random Projection
Last Updated: 2026-02-08 15:00 GMT

Overview

A concrete tool for dimensionality reduction through random projection, provided by scikit-learn's sklearn.random_projection module.

Description

The random_projection module provides transformers that reduce the dimensionality of data by random projection, trading a controlled amount of accuracy for faster processing times and smaller model sizes. GaussianRandomProjection projects onto a dense random matrix whose entries are drawn from a Gaussian distribution, while SparseRandomProjection uses a sparse random matrix that offers similar embedding quality with lower memory use and faster computation. Both are grounded in the Johnson-Lindenstrauss lemma, which states that a set of points can be embedded in a much lower-dimensional space such that, with high probability, pairwise distances are distorted by no more than a factor controlled by eps. The module also provides johnson_lindenstrauss_min_dim, which computes the minimum number of components needed to meet a given eps for a given number of samples.
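To make the Johnson-Lindenstrauss bound concrete, the following is a minimal sketch of the formula that scikit-learn's documentation gives for johnson_lindenstrauss_min_dim: n_components >= 4 log(n_samples) / (eps^2 / 2 - eps^3 / 3). The helper name jl_min_dim is hypothetical and is not part of scikit-learn; it only mirrors the documented bound and truncates the result to an integer.

```python
import math


def jl_min_dim(n_samples: int, eps: float = 0.1) -> int:
    """Hypothetical helper: smallest dimension suggested by the JL bound.

    Mirrors the documented formula
        n_components >= 4 * log(n_samples) / (eps**2 / 2 - eps**3 / 3)
    and truncates the result to an integer.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must be in the open interval (0, 1)")
    denominator = eps**2 / 2 - eps**3 / 3
    return int(4 * math.log(n_samples) / denominator)


# The bound depends only on the number of samples and eps,
# not on the original number of features.
print(jl_min_dim(1_000, eps=0.1))      # 5920
print(jl_min_dim(1_000_000, eps=0.5))  # 663
```

Note how quickly the bound grows as eps shrinks: a tight distortion budget can require more components than the data has features, in which case random projection with that eps brings no reduction.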

Usage

Use random projection when you need fast dimensionality reduction that approximately preserves pairwise distances. It is particularly useful for high-dimensional data where PCA would be too expensive, and as a preprocessing step before algorithms that are sensitive to the curse of dimensionality. Prefer SparseRandomProjection for very large datasets because of its memory efficiency.
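As an illustration of the preprocessing use case, here is a hedged sketch that places SparseRandomProjection in front of a distance-based classifier. The synthetic dataset, the choice of KNeighborsClassifier, and the value n_components=100 are assumptions made for the example and are not taken from this page.

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.random_projection import SparseRandomProjection

# Synthetic high-dimensional data (illustrative only)
X, y = make_classification(
    n_samples=200, n_features=2000, n_informative=20, random_state=0
)

# Project 2000 features down to 100, then classify in the reduced space
model = make_pipeline(
    SparseRandomProjection(n_components=100, random_state=0),
    KNeighborsClassifier(n_neighbors=5),
)
model.fit(X, y)

# model[:-1] is the pipeline without its final estimator,
# so this returns the projected features
X_reduced = model[:-1].transform(X)
predictions = model.predict(X)
print(X_reduced.shape, predictions.shape)
```

Because the projection matrix does not depend on the data values, fitting the projection step is cheap compared with a data-dependent method such as PCA.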

Code Reference

Source Location

Signature

class GaussianRandomProjection(BaseRandomProjection):
    def __init__(
        self,
        n_components="auto",
        *,
        eps=0.1,
        compute_inverse_components=False,
        random_state=None,
    ):

class SparseRandomProjection(BaseRandomProjection):
    def __init__(
        self,
        n_components="auto",
        *,
        density="auto",
        eps=0.1,
        dense_output=False,
        compute_inverse_components=False,
        random_state=None,
    ):

def johnson_lindenstrauss_min_dim(n_samples, *, eps=0.1):

Import

from sklearn.random_projection import GaussianRandomProjection
from sklearn.random_projection import SparseRandomProjection
from sklearn.random_projection import johnson_lindenstrauss_min_dim

I/O Contract

Inputs

Name | Type | Required | Description
n_components | int or "auto" | No | Dimensionality of the target projection space. "auto" derives the value from eps and n_samples using the Johnson-Lindenstrauss lemma. Default is "auto".
eps | float | No | Maximum distortion rate as defined by the Johnson-Lindenstrauss lemma. Only used when n_components is "auto". Default is 0.1.
compute_inverse_components | bool | No | Whether to compute the pseudo-inverse of the components during fit so that inverse_transform can be used. Default is False.
random_state | int, RandomState instance or None | No | Controls the pseudo-random number generator used to build the projection matrix at fit time. Pass an int for reproducible output. Default is None.
density | float or "auto" | No | Ratio of non-zero components in the random projection matrix (SparseRandomProjection only). Default is "auto", which resolves to 1/sqrt(n_features).
dense_output | bool | No | If True, the projected output is a dense array even when the input is sparse (SparseRandomProjection only). Default is False.
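The "auto" settings described above can be inspected after fitting. The sketch below, using illustrative data shapes and eps=0.5 chosen so that the resulting dimension fits within the feature count, checks that n_components_ agrees with johnson_lindenstrauss_min_dim and that density_ resolves to 1/sqrt(n_features).

```python
import numpy as np
from sklearn.random_projection import (
    SparseRandomProjection,
    johnson_lindenstrauss_min_dim,
)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10_000))  # 100 samples, 10,000 features

# A loose eps keeps the required dimension below n_features;
# a target dimension larger than n_features raises a ValueError at fit time
srp = SparseRandomProjection(
    n_components="auto", density="auto", eps=0.5, random_state=0
)
srp.fit(X)

expected_dim = johnson_lindenstrauss_min_dim(n_samples=X.shape[0], eps=0.5)
print(srp.n_components_, expected_dim)        # resolved number of components
print(srp.density_, 1 / np.sqrt(X.shape[1]))  # resolved density
```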

Outputs

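Name | Type | Description
X_transformed | ndarray or sparse matrix of shape (n_samples, n_components) | The projected data in the lower-dimensional space, returned by transform and fit_transform.
components_ | ndarray or sparse matrix of shape (n_components, n_features) | Fitted attribute holding the random projection matrix (dense for GaussianRandomProjection, sparse for SparseRandomProjection).
n_components_ | int | Fitted attribute holding the concrete number of components computed when n_components is "auto".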

Usage Examples

Basic Usage

import numpy as np
from sklearn.random_projection import GaussianRandomProjection, SparseRandomProjection
from sklearn.random_projection import johnson_lindenstrauss_min_dim

# Seeded random data: 100 samples with 1000 features
rng = np.random.default_rng(42)
X = rng.random((100, 1000))

# Minimum number of components that satisfies the Johnson-Lindenstrauss
# bound for this many samples at eps=0.1
min_dim = johnson_lindenstrauss_min_dim(n_samples=X.shape[0], eps=0.1)
print(f"Minimum components for eps=0.1: {min_dim}")

# Gaussian random projection to a fixed 50 components
# (an explicit n_components below min_dim does not carry the eps guarantee)
grp = GaussianRandomProjection(n_components=50, random_state=42)
X_projected = grp.fit_transform(X)
print(f"Original shape: {X.shape}, Projected shape: {X_projected.shape}")

# Sparse random projection to the same target dimension
srp = SparseRandomProjection(n_components=50, random_state=42)
X_sparse_proj = srp.fit_transform(X)
print(f"Sparse projected shape: {X_sparse_proj.shape}")
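Distance Preservation Check

To see the approximate distance preservation empirically, the following sketch compares squared pairwise distances before and after projection. The data, the target dimension of 1000, and the use of sklearn.metrics.euclidean_distances are illustrative assumptions; the printed ratio range depends on the random draw and on the scikit-learn version.

```python
import numpy as np
from sklearn.metrics import euclidean_distances
from sklearn.random_projection import SparseRandomProjection

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5000))

srp = SparseRandomProjection(n_components=1000, random_state=0)
X_proj = srp.fit_transform(X)

# Compare squared distances over each unordered pair of distinct samples
pair_idx = np.triu_indices(X.shape[0], k=1)
d_orig = euclidean_distances(X, squared=True)[pair_idx]
d_proj = euclidean_distances(X_proj, squared=True)[pair_idx]

# Ratios close to 1 indicate that the projection distorts distances little
ratios = d_proj / d_orig
print(f"ratio range: {ratios.min():.3f} to {ratios.max():.3f}")
print(f"mean ratio:  {ratios.mean():.3f}")
```

Raising n_components narrows the spread of the ratios, while lowering it widens the spread, which is the accuracy versus size trade-off that eps controls.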
