Implementation:Scikit learn Scikit learn RandomProjection

Knowledge Sources	Scikit_learn Scikit-learn Docs
Domains	Dimensionality Reduction, Random Projection
Last Updated	2026-02-08 15:00 GMT

Overview

Concrete tool for dimensionality reduction through random projection provided by scikit-learn.

Description

The random_projection module provides transformers that reduce the dimensionality of data by random projection, trading a controlled amount of accuracy (additional variance in pairwise distances) for faster processing and smaller models. GaussianRandomProjection projects with a dense random matrix drawn from a Gaussian distribution, while SparseRandomProjection uses a sparse random matrix that is more memory-efficient and faster to apply. Both are grounded in the Johnson-Lindenstrauss lemma, which guarantees that, with high probability, pairwise distances are preserved to within a relative distortion eps when the target dimension is large enough. The module also provides johnson_lindenstrauss_min_dim, which computes the minimum number of components that meets this guarantee for a given number of samples and eps.
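To make the Johnson-Lindenstrauss bound concrete, the sketch below evaluates the commonly stated formula n_components >= 4 ln(n_samples) / (eps^2/2 - eps^3/3) by hand and prints it next to the value returned by johnson_lindenstrauss_min_dim. The helper jl_bound is hypothetical and exists only for this illustration; truncating to an integer is an assumption about how the bound is rounded.

```python
import math

from sklearn.random_projection import johnson_lindenstrauss_min_dim


def jl_bound(n_samples, eps):
    """Hand-rolled Johnson-Lindenstrauss bound (hypothetical helper)."""
    denominator = eps**2 / 2 - eps**3 / 3
    # Assumption: the bound is truncated to an integer.
    return int(4 * math.log(n_samples) / denominator)


n_samples = 1000
results = []
for eps in (0.1, 0.25, 0.5):
    by_hand = jl_bound(n_samples, eps)
    from_sklearn = int(johnson_lindenstrauss_min_dim(n_samples, eps=eps))
    results.append((eps, by_hand, from_sklearn))
    print(f"eps={eps}: by hand {by_hand}, scikit-learn {from_sklearn}")
```

Note that the bound depends only on the number of samples and eps, not on the original number of features.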

Usage

Use random projection when you need fast, computationally efficient dimensionality reduction that approximately preserves pairwise distances. It is particularly useful for high-dimensional data where PCA would be too expensive, and as a preprocessing step before applying algorithms sensitive to the curse of dimensionality. SparseRandomProjection is preferred for very large datasets due to its memory efficiency.
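One way to check the distance-preservation claim on your own data is to compare squared pairwise distances before and after projection. The sketch below does this on synthetic Gaussian data; the shapes, the seed, and the choice of 500 components are assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn.random_projection import SparseRandomProjection

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2000))  # synthetic data, assumed for illustration

srp = SparseRandomProjection(n_components=500, random_state=0)
X_proj = srp.fit_transform(X)

# Compare squared Euclidean distances for every pair of distinct samples.
pair_idx = np.triu_indices(X.shape[0], k=1)
d_orig = pairwise_distances(X)[pair_idx] ** 2
d_proj = pairwise_distances(X_proj)[pair_idx] ** 2
ratios = d_proj / d_orig

print(
    f"ratio min={ratios.min():.3f}, "
    f"mean={ratios.mean():.3f}, max={ratios.max():.3f}"
)
```

Ratios close to 1 indicate that distances survived the projection; the spread around 1 narrows as n_components grows.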

Code Reference

Source Location

Repository: scikit-learn
File: sklearn/random_projection.py

Signature

class GaussianRandomProjection(BaseRandomProjection):
    def __init__(
        self,
        n_components="auto",
        *,
        eps=0.1,
        compute_inverse_components=False,
        random_state=None,
    ):

class SparseRandomProjection(BaseRandomProjection):
    def __init__(
        self,
        n_components="auto",
        *,
        density="auto",
        eps=0.1,
        dense_output=False,
        compute_inverse_components=False,
        random_state=None,
    ):

def johnson_lindenstrauss_min_dim(n_samples, *, eps=0.1):

Import

from sklearn.random_projection import GaussianRandomProjection
from sklearn.random_projection import SparseRandomProjection
from sklearn.random_projection import johnson_lindenstrauss_min_dim

I/O Contract

Inputs

Name	Type	Required	Description
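n_components	int or "auto"	No	Dimensionality of the target projection space. "auto" derives it from eps and the number of samples using the Johnson-Lindenstrauss lemma; fitting raises an error if the derived value exceeds the number of features. Default is "auto".
eps	float	No	Distortion tolerance of the embedding as defined by the Johnson-Lindenstrauss lemma; must be strictly positive. Only used when n_components is "auto". Smaller values preserve distances better but require more components. Default is 0.1.
compute_inverse_components	bool	No	Whether to compute and store the pseudo-inverse of the components during fit, for use by inverse_transform. Default is False.
random_state	int, RandomState instance or None	No	Controls the pseudo-random number generator used to generate the projection matrix at fit time. Pass an int for reproducible output. Default is None.
density	float or "auto"	No	Ratio of non-zero components in the random projection matrix, in the range (0, 1] (SparseRandomProjection only). Default is "auto" (1/sqrt(n_features)).
dense_output	bool	No	If True, the output is always a dense array, even when the input is sparse; if False, the output is sparse when the input is sparse (SparseRandomProjection only). Default is False.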
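The interaction between sparse input, dense_output, and the "auto" density can be seen in a small sketch. The input matrix here is synthetic and its shape and sparsity are assumptions chosen for illustration.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.random_projection import SparseRandomProjection

# Synthetic sparse input (assumed for illustration): roughly 5% non-zeros.
rng = np.random.default_rng(0)
dense = rng.random((30, 400))
dense[dense < 0.95] = 0.0
X_sparse = sp.csr_matrix(dense)

# dense_output=False (default): sparse input yields sparse output.
srp_sparse = SparseRandomProjection(n_components=20, random_state=0)
out_sparse = srp_sparse.fit_transform(X_sparse)

# dense_output=True: the same projection, returned as a dense array.
srp_dense = SparseRandomProjection(
    n_components=20, dense_output=True, random_state=0
)
out_dense = srp_dense.fit_transform(X_sparse)

print("default output is sparse:", sp.issparse(out_sparse))
print("dense_output=True is sparse:", sp.issparse(out_dense))
print("density_ chosen by 'auto':", srp_sparse.density_)
print("1/sqrt(n_features):", 1 / np.sqrt(X_sparse.shape[1]))
```

With a small number of components the projected data usually contains few zeros, so a dense output can be the cheaper representation.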

Outputs

Name	Type	Description
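X_transformed	ndarray or sparse matrix of shape (n_samples, n_components)	The projected data in the lower-dimensional space, returned by transform and fit_transform. Sparse only for SparseRandomProjection with sparse input and dense_output=False.
components_	ndarray or sparse matrix of shape (n_components, n_features)	Fitted attribute holding the random projection matrix: dense for GaussianRandomProjection, sparse for SparseRandomProjection.
n_components_	int	Fitted attribute holding the concrete number of components, computed from eps and n_samples when n_components is "auto".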

Usage Examples

Basic Usage

from sklearn.random_projection import GaussianRandomProjection, SparseRandomProjection
from sklearn.random_projection import johnson_lindenstrauss_min_dim
import numpy as np

# Compute minimum dimensions needed
min_dim = johnson_lindenstrauss_min_dim(n_samples=1000, eps=0.1)
print(f"Minimum components for eps=0.1: {min_dim}")

# Gaussian random projection on reproducible synthetic data
rng = np.random.default_rng(42)
X = rng.random((100, 1000))
grp = GaussianRandomProjection(n_components=50, random_state=42)
X_projected = grp.fit_transform(X)
print(f"Original shape: {X.shape}, Projected shape: {X_projected.shape}")

# Sparse random projection
srp = SparseRandomProjection(n_components=50, random_state=42)
X_sparse_proj = srp.fit_transform(X)
print(f"Sparse projected shape: {X_sparse_proj.shape}")
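The example above sets n_components explicitly. With n_components="auto", the Johnson-Lindenstrauss bound for 100 samples at eps=0.1 is larger than the 1000 available features, so fitting is rejected. The sketch below shows that failure and one possible workaround, loosening eps; the eps=0.5 value is an assumption chosen for illustration, not a recommendation.

```python
import numpy as np
from sklearn.random_projection import (
    GaussianRandomProjection,
    johnson_lindenstrauss_min_dim,
)

rng = np.random.default_rng(42)
X = rng.random((100, 1000))

required = johnson_lindenstrauss_min_dim(n_samples=X.shape[0], eps=0.1)
print(f"JL bound at eps=0.1: {required} (n_features={X.shape[1]})")

# "auto" refuses to project into more dimensions than the input has.
try:
    GaussianRandomProjection(
        n_components="auto", eps=0.1, random_state=42
    ).fit(X)
    auto_failed = False
except ValueError as exc:
    auto_failed = True
    print(f"auto sizing rejected: {exc}")

# A looser eps lowers the bound below n_features, so "auto" works.
grp_auto = GaussianRandomProjection(
    n_components="auto", eps=0.5, random_state=42
)
X_auto = grp_auto.fit_transform(X)
print(f"eps=0.5 gives n_components_={grp_auto.n_components_}")
print(f"Projected shape: {X_auto.shape}")
```

Setting n_components explicitly, as in the basic example, is the other option; it skips the bound entirely, so the distortion guarantee no longer applies automatically.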

Related Pages

Principle:Scikit_learn_Scikit_learn_Dimensionality_Reduction
