Implementation:Scikit learn Scikit learn FactorAnalysis
| Knowledge Sources | |
|---|---|
| Domains | Dimensionality Reduction, Latent Variable Models |
| Last Updated | 2026-02-08 15:00 GMT |
Overview
FactorAnalysis is scikit-learn's estimator for factor analysis, a latent variable model that fits a separate (heteroscedastic) noise variance for each feature.
Description
FactorAnalysis is a simple linear generative model with Gaussian latent variables. The observations are assumed to be caused by a linear transformation of lower-dimensional latent factors with added Gaussian noise, where each feature has its own noise variance. This distinguishes it from probabilistic PCA, which assumes isotropic noise. FactorAnalysis performs maximum likelihood estimation of the loading matrix using an SVD-based approach.
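In the notation commonly used for this model, the generative process described above can be written as follows. The symbols are assumptions chosen for illustration and are not names from the scikit-learn API.

```latex
\begin{aligned}
x &= W h + \mu + \epsilon, \\
h &\sim \mathcal{N}(0, I), \\
\epsilon &\sim \mathcal{N}(0, \Psi), \quad \Psi = \operatorname{diag}(\psi_1, \dots, \psi_p), \\
x &\sim \mathcal{N}\left(\mu,\; W W^{\top} + \Psi\right).
\end{aligned}
```

Here x is an observation with p features, h is the latent factor vector, W is the loading matrix, and Ψ is the diagonal noise covariance. Probabilistic PCA corresponds to the special case Ψ = σ² I, where every feature shares one noise variance.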
Usage
Use FactorAnalysis when you need a latent variable model that accounts for different noise variances across features. It is well suited for exploratory factor analysis in social sciences, psychology, and any domain where you want to uncover latent structure while modeling feature-specific noise.
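One way to check whether per-feature noise modeling helps on a given dataset is to compare the held-out log-likelihood of FactorAnalysis and PCA. The sketch below does this on synthetic data; the data-generating setup (sizes, loadings, noise levels) is an assumption made up for illustration, not a benchmark.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
n_samples, n_features, n_factors = 500, 10, 2

# Synthetic data (assumed for illustration): a low-rank signal plus
# Gaussian noise whose standard deviation differs per feature.
W = rng.randn(n_factors, n_features)
H = rng.randn(n_samples, n_factors)
noise_std = np.linspace(0.2, 2.0, n_features)
X = H @ W + rng.randn(n_samples, n_features) * noise_std

# Both estimators implement score(X), the average log-likelihood of the
# samples, so cross_val_score can use it without a custom scorer.
fa_ll = cross_val_score(FactorAnalysis(n_components=n_factors), X).mean()
pca_ll = cross_val_score(PCA(n_components=n_factors), X).mean()

print(f"FactorAnalysis mean held-out log-likelihood: {fa_ll:.2f}")
print(f"PCA mean held-out log-likelihood:            {pca_ll:.2f}")
```

A higher held-out log-likelihood indicates the better-fitting model for that data; which estimator wins depends on how strongly the noise variance actually varies across features.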
Code Reference
Source Location
- Repository: scikit-learn
- File: sklearn/decomposition/_factor_analysis.py
Signature
class FactorAnalysis(ClassNamePrefixFeaturesOutMixin, TransformerMixin, BaseEstimator):
    def __init__(
        self,
        n_components=None,
        *,
        tol=1e-2,
        copy=True,
        max_iter=1000,
        noise_variance_init=None,
        svd_method="randomized",
        iterated_power=3,
        rotation=None,
        random_state=0,
    ):
Import
from sklearn.decomposition import FactorAnalysis
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| n_components | int | No | Dimensionality of latent space. Defaults to n_features if None. |
| tol | float | No | Stopping tolerance for log-likelihood increase (default=1e-2). |
| copy | bool | No | Whether to make a copy of X (default=True). |
| max_iter | int | No | Maximum number of iterations (default=1000). |
| noise_variance_init | array-like of shape (n_features,) | No | Initial guess of the noise variance for each feature. If None, an array of ones is used (default=None). |
| svd_method | str | No | SVD method: 'lapack' or 'randomized' (default='randomized'). |
| iterated_power | int | No | Number of iterations for the power method (default=3). Only used when svd_method='randomized'. |
| rotation | str or None | No | Factor rotation method: None, 'varimax', or 'quartimax' (default=None). |
| random_state | int or RandomState | No | Random state for reproducibility (default=0). Only used when svd_method='randomized'. |
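The following sketch shows how the svd_method and rotation parameters might be combined. The synthetic data and the chosen parameter values are assumptions for illustration only.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.RandomState(0)

# Placeholder data (assumed): 2 latent factors, 6 observed features.
X = rng.randn(300, 2) @ rng.randn(2, 6) + 0.5 * rng.randn(300, 6)

# 'lapack' uses an exact SVD, so iterated_power and random_state are not used.
fa_plain = FactorAnalysis(n_components=2, svd_method="lapack")
fa_varimax = FactorAnalysis(n_components=2, svd_method="lapack", rotation="varimax")

fa_plain.fit(X)
fa_varimax.fit(X)

print(fa_plain.components_.shape)    # (2, 6)
print(fa_varimax.components_.shape)  # (2, 6)

# An orthogonal rotation changes the individual loadings but not the
# covariance implied by the model.
same_cov = np.allclose(fa_plain.get_covariance(), fa_varimax.get_covariance())
print(same_cov)
```

Rotation is typically applied to make the loadings easier to interpret; it does not change how well the model fits.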
Outputs
| Name | Type | Description |
|---|---|---|
| components_ | ndarray of shape (n_components, n_features) | The loading matrix representing the latent factors. |
| loglike_ | list of float | Log-likelihood at each iteration. |
| noise_variance_ | ndarray of shape (n_features,) | Estimated noise variance for each feature. |
| n_iter_ | int | Number of iterations run. |
| mean_ | ndarray of shape (n_features,) | Per-feature empirical mean. |
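As a minimal sketch of how these fitted attributes relate to one another, the example below fits a model on synthetic data (an assumption for illustration) and rebuilds the model covariance from components_ and noise_variance_, comparing it with the estimator's get_covariance() method.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.RandomState(0)

# Placeholder data (assumed): 2 latent factors, 5 features, per-feature noise.
noise_std = np.array([0.1, 0.5, 1.0, 1.5, 2.0])
X = rng.randn(300, 2) @ rng.randn(2, 5) + rng.randn(300, 5) * noise_std

fa = FactorAnalysis(n_components=2, random_state=0).fit(X)

print(fa.components_.shape)      # (2, 5)
print(fa.noise_variance_.shape)  # (5,)
print(fa.mean_.shape)            # (5,)
print(fa.n_iter_, len(fa.loglike_))

# Model covariance: loadings outer product plus diagonal noise variances.
model_cov = fa.components_.T @ fa.components_ + np.diag(fa.noise_variance_)
print(np.allclose(model_cov, fa.get_covariance()))
```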
Usage Examples
Basic Usage
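import numpy as np
from sklearn.decomposition import FactorAnalysis

# Toy data: 4 samples, 3 perfectly correlated features (shape demo only)
X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]], dtype=float)

# Fit a 2-factor model and project the samples into the latent space
fa = FactorAnalysis(n_components=2, random_state=0)
X_transformed = fa.fit_transform(X)

print(X_transformed.shape)   # (4, 2)
print(fa.components_.shape)  # (2, 3)
print(fa.noise_variance_)    # estimated noise variance, one value per feature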