Principle:Scikit learn Scikit learn Cross Decomposition

Knowledge Sources	Scikit_learn Scikit-learn Docs
Domains	Supervised Learning, Dimensionality Reduction
Last Updated	2026-02-08 15:00 GMT

Overview

Cross decomposition methods find the fundamental relations between two multivariate datasets by projecting them into shared latent spaces that maximize their covariance or correlation.

Description

Cross decomposition techniques simultaneously decompose two data matrices $X$ and $Y$ to find latent components that capture the maximum covariance (PLS) or correlation (CCA) between them. Unlike standard regression, which maps $X$ directly to $Y$, cross decomposition first finds joint latent structures shared by both datasets. These methods model relationships between two high-dimensional sets of variables, and are particularly useful when the variables within each set are highly correlated (multicollinear). Cross decomposition sits at the intersection of dimensionality reduction and regression and is commonly applied in chemometrics, neuroimaging, and genomics.

Usage

Use Partial Least Squares (PLS) Regression when predicting a multivariate response $Y$ from a high-dimensional predictor $X$ and ordinary least squares fails due to multicollinearity or high dimensionality. PLS is the standard method in chemometrics for relating spectral measurements to chemical properties. Use Canonical Correlation Analysis (CCA) when the goal is to find maximally correlated linear combinations of two sets of variables, without a predictor-response distinction. CCA is common in neuroscience for relating brain activity patterns to behavioral measures.

Theoretical Basis

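Partial Least Squares (PLS) Regression: With $X$ and $Y$ column-centered, PLS finds unit-norm weight vectors $w_{k}$ and $c_{k}$ that maximize the covariance between the latent components $X w$ and $Y c$:

$(w_{k}, c_{k}) = \arg \max_{w, c} \operatorname{cov}(X w, Y c) = \arg \max_{w, c} w^{T} X^{T} Y c$

subject to $\| w \| = 1$, $\| c \| = 1$. For $k > 1$, the maximization is carried out on the deflated matrices described below.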
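For the first component, this constrained maximization has a closed-form solution: $w$ and $c$ are the leading left and right singular vectors of $X^{T} Y$, and the maximum equals the largest singular value. The following sketch checks this numerically on random data; the data and the variable names are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
coef = np.array([[1.0, 0.5, 0.0],
                 [0.0, 1.0, -1.0]])
Y = X[:, :2] @ coef + 0.1 * rng.normal(size=(100, 3))

# Center both blocks so that w^T X^T Y c is proportional to cov(Xw, Yc).
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)

M = X.T @ Y
U, s, Vt = np.linalg.svd(M, full_matrices=False)
w, c = U[:, 0], Vt[0]          # unit-norm leading singular vectors
objective = w @ M @ c          # equals the largest singular value s[0]

# Random unit-norm (w, c) pairs should never exceed that value.
W_rand = rng.normal(size=(1000, 5))
W_rand /= np.linalg.norm(W_rand, axis=1, keepdims=True)
C_rand = rng.normal(size=(1000, 3))
C_rand /= np.linalg.norm(C_rand, axis=1, keepdims=True)
best_random = np.einsum("ij,jk,ik->i", W_rand, M, C_rand).max()
```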

The PLS algorithm (NIPALS):

1. Compute the weight vectors by alternating $w = X^{T} Y c / \| X^{T} Y c \|$ and $c = Y^{T} X w / \| Y^{T} X w \|$ until convergence.
2. Compute scores: $t = X w$ (X-scores), $u = Y c$ (Y-scores).
3. Compute loadings: $p = X^{T} t / (t^{T} t)$, $q = Y^{T} t / (t^{T} t)$.
4. Deflate: $X \leftarrow X - t p^{T}$, $Y \leftarrow Y - t q^{T}$.
5. Repeat for each component.

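The regression coefficients are then $\hat{B} = W (P^{T} W)^{-1} Q^{T}$, where the columns of $W$, $P$, and $Q$ are the vectors $w$, $p$, and $q$ from each component, so that $\hat{Y} = X \hat{B}$ for centered $X$ and $Y$.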
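The NIPALS steps and the coefficient formula can be sketched in NumPy as follows. This is a minimal illustration written from the steps listed here, not scikit-learn's implementation; the function name `nipals_pls`, the initial choice of $c$, and the stopping rule are assumptions.

```python
import numpy as np

def nipals_pls(X, Y, n_components, max_iter=500, tol=1e-10):
    """Sketch of NIPALS PLS on centered copies of X and Y."""
    Xk = X - X.mean(axis=0)
    Yk = Y - Y.mean(axis=0)
    W, P, Q, T = [], [], [], []
    for _ in range(n_components):
        M = Xk.T @ Yk
        # Assumed starting point: put all weight on the first Y column.
        c = np.zeros(Yk.shape[1])
        c[0] = 1.0
        for _ in range(max_iter):
            w = M @ c
            w /= np.linalg.norm(w)
            c_new = M.T @ w
            c_new /= np.linalg.norm(c_new)
            converged = np.linalg.norm(c_new - c) < tol
            c = c_new
            if converged:
                break
        t = Xk @ w                       # X-scores
        p = Xk.T @ t / (t @ t)           # X-loadings
        q = Yk.T @ t / (t @ t)           # Y-loadings
        Xk = Xk - np.outer(t, p)         # deflate X
        Yk = Yk - np.outer(t, q)         # deflate Y
        W.append(w); P.append(p); Q.append(q); T.append(t)
    return (np.column_stack(W), np.column_stack(P),
            np.column_stack(Q), np.column_stack(T))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
coef = np.array([[1.0, 0.5, 0.0],
                 [0.0, 1.0, -1.0]])
Y = X[:, :2] @ coef + 0.1 * rng.normal(size=(100, 3))

W, P, Q, T = nipals_pls(X, Y, n_components=2)
B = W @ np.linalg.inv(P.T @ W) @ Q.T     # regression coefficients

Xc = X - X.mean(axis=0)
Y_hat = Xc @ B + Y.mean(axis=0)          # predictions on the original scale
```

Because each score satisfies $t = X w$ on the deflated $X$, the product `Xc @ B` reproduces `T @ Q.T`, which is the fitted part of the centered $Y$.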

Canonical Correlation Analysis (CCA): CCA finds pairs of linear combinations with maximum correlation:

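$(w_{k}, c_{k}) = \arg \max_{w, c} \operatorname{corr}(X w, Y c) = \arg \max_{w, c} \frac{w^{T} \Sigma_{X Y} c}{\sqrt{w^{T} \Sigma_{X X} w \cdot c^{T} \Sigma_{Y Y} c}}$

where $\Sigma_{X X}$ and $\Sigma_{Y Y}$ are the covariance matrices of $X$ and $Y$, and $\Sigma_{X Y} = \Sigma_{Y X}^{T}$ is their cross-covariance. The solution is given by the eigenvalue problem:

$\Sigma_{X X}^{-1} \Sigma_{X Y} \Sigma_{Y Y}^{-1} \Sigma_{Y X} w = \rho^{2} w$

which is equivalent to the generalized eigenvalue problem $\Sigma_{X Y} \Sigma_{Y Y}^{-1} \Sigma_{Y X} w = \rho^{2} \Sigma_{X X} w$. Here $\rho$ is the canonical correlation. CCA differs from PLS in that it maximizes correlation rather than covariance, making it invariant to scaling of the variables.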
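The eigenvalue formulation can be checked directly with NumPy. The sketch below computes the first canonical pair from sample covariance matrices and confirms that rescaling the columns of $X$ leaves $\rho$ unchanged. The helper name `first_canonical_pair` and the synthetic data are assumptions for the example; this direct approach also assumes that both covariance matrices are invertible.

```python
import numpy as np

def first_canonical_pair(X, Y):
    """First canonical correlation and weight vectors via the eigenproblem."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)
    # M = Sxx^{-1} Sxy Syy^{-1} Syx, built with solves instead of inverses.
    M = np.linalg.solve(Sxx, Sxy) @ np.linalg.solve(Syy, Sxy.T)
    eigvals, eigvecs = np.linalg.eig(M)
    k = np.argmax(eigvals.real)
    rho = np.sqrt(eigvals.real[k])
    w = eigvecs[:, k].real
    c = np.linalg.solve(Syy, Sxy.T @ w)   # matching Y-side weights
    return rho, w, c

# Synthetic data (assumed): one shared latent factor plus independent noise.
rng = np.random.default_rng(0)
z = rng.normal(size=(300, 1))
X = z @ np.array([[1.0, -0.5, 0.3, 0.0]]) + rng.normal(size=(300, 4))
Y = z @ np.array([[0.8, 0.2, -0.6]]) + rng.normal(size=(300, 3))

rho, w, c = first_canonical_pair(X, Y)
empirical_corr = np.corrcoef(X @ w, Y @ c)[0, 1]   # should match rho

# Rescaling the columns of X changes the weights but not rho.
scales = np.array([10.0, 0.1, 3.0, 50.0])
rho_scaled, _, _ = first_canonical_pair(X * scales, Y)
```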

Comparison:

Method	Objective	Best for
PLS	Max covariance	Prediction, high-dimensional $X$
CCA	Max correlation	Finding shared structure, symmetric treatment of $X$ and $Y$

Related Pages

Implementation:Scikit_learn_Scikit_learn_PLSRegression
