Principle: Smile (haifengl) Decomposition Result Extraction
Overview
Decomposition Result Extraction is the final stage of the Matrix_Decomposition_Pipeline, where the factored components produced by decompositions are accessed, manipulated, and used for downstream analysis. This includes extracting singular values and vectors from SVD, eigenvalues and eigenvectors from EVD, sorting eigenvalues, constructing diagonal matrices, and computing derived quantities such as matrix rank, condition number, null space, and range.
For large sparse matrices where full decomposition is infeasible, Smile provides ARPACK (Implicitly Restarted Arnoldi/Lanczos Method) for computing a small number of extremal eigenpairs or singular triples efficiently.
Theoretical Basis
SVD Result Interpretation
Given the SVD $A = U \Sigma V^T$ of an $m \times n$ matrix $A$:
| Component | Symbol | Interpretation |
|---|---|---|
| Singular values | $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_{\min(m,n)} \ge 0$ | Importance of each component; square roots of the eigenvalues of $A^T A$ |
| Left singular vectors | $U$ (columns $u_i$) | Orthonormal basis for the column space of $A$ |
| Right singular vectors | $V$ (columns $v_i$) | Orthonormal basis for the row space of $A$ |
| Diagonal matrix | $\Sigma$ | The "stretching factors" of the transformation |
Derived Quantities from SVD
Matrix Rank: The number of singular values above a threshold: $\operatorname{rank}(A) = \#\{\, i : \sigma_i > \epsilon \,\}$,
where a common choice of threshold is $\epsilon = \max(m, n)\,\sigma_1\,\epsilon_{\text{machine}}$.
Condition Number: The ratio of the largest to smallest singular value, $\kappa(A) = \sigma_{\max} / \sigma_{\min}$.
A condition number close to 1 indicates a well-conditioned matrix. When $\kappa(A)$ is large or infinite, the matrix is ill-conditioned or singular.
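The rank, nullity, and condition-number rules above can be sketched directly from a list of singular values. This is a minimal plain-Python illustration of the arithmetic, not Smile's API; the matrix dimensions and singular values are made up for the example:

```python
import math

def rank_and_condition(s, m, n, eps=2.220446049250313e-16):
    """Derive rank, nullity, and condition number from singular values s
    (sorted descending) of an m x n matrix."""
    tol = max(m, n) * s[0] * eps          # a common threshold choice
    rank = sum(1 for sigma in s if sigma > tol)
    nullity = n - rank
    cond = s[0] / s[-1] if s[-1] > tol else math.inf
    return rank, nullity, cond

# A 3x3 matrix with singular values 5, 2, and ~0 is numerically rank 2
rank, nullity, cond = rank_and_condition([5.0, 2.0, 1e-16], 3, 3)
```

Because the smallest singular value falls below the threshold, the matrix is treated as singular and the condition number is reported as infinite.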
Nullity: $\operatorname{nullity}(A) = n - \operatorname{rank}(A)$, the dimension of the null space.
Range Space: The first $r = \operatorname{rank}(A)$ columns of $U$ form an orthonormal basis for $\operatorname{range}(A)$.
Null Space: The last $n - r$ rows of $V^T$ (equivalently, the last $n - r$ columns of $V$) form an orthonormal basis for $\operatorname{null}(A)$.
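For a matrix whose SVD is trivial (a diagonal matrix, so $U = V = I$), extracting the range and null-space bases can be checked by hand. A plain-Python sketch with an illustrative rank-1 example:

```python
# A = diag(2, 0): singular values (2, 0), U = V = I, rank r = 1.
# The first r columns of U span range(A); the last n - r columns of V span null(A).
A = [[2.0, 0.0],
     [0.0, 0.0]]
U = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
r = 1

range_basis = [U[i][0] for i in range(2)]          # first column of U
v = [V[i][1] for i in range(2)]                    # last column of V

# A maps the null-space basis vector to zero
Av = [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
```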
The Eckart-Young Theorem
The truncated SVD provides the best low-rank matrix approximation. Keeping only the top $k$ singular values, $A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^T$,
minimizes both $\|A - B\|_2$ and $\|A - B\|_F$ over all rank-$k$ matrices $B$, with $\|A - A_k\|_2 = \sigma_{k+1}$. This is the theoretical basis for PCA, latent semantic indexing, and data compression.
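The error identity in the Eckart-Young theorem can be verified on a hand-checkable case where the SVD is trivial ($U = V = I$, so $A$ is diagonal). The singular values below are chosen for illustration:

```python
import math

# A = U Sigma V^T with U = V = I, so A = diag(3, 2, 1); singular values 3, 2, 1.
sigma = [3.0, 2.0, 1.0]
n = len(sigma)
A  = [[sigma[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Rank-2 truncation keeps the top-2 singular values.
k = 2
Ak = [[sigma[i] if i == j and i < k else 0.0 for j in range(n)] for i in range(n)]

# Eckart-Young: the Frobenius error of the best rank-k approximation equals
# sqrt(sum of discarded sigma_i^2), here just sigma_3 = 1.
frob_error = math.sqrt(sum((A[i][j] - Ak[i][j]) ** 2
                           for i in range(n) for j in range(n)))
```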
EVD Result Interpretation
Given the eigenvalue decomposition $A = V \Lambda V^{-1}$ of a square matrix $A$:
Symmetric Case
$A = Q \Lambda Q^T$, where $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$ with all real eigenvalues and $Q$ is orthogonal.
Spectral properties:
- All eigenvalues are real
- Eigenvectors corresponding to distinct eigenvalues are orthogonal
- Positive definite iff all $\lambda_i > 0$
General (Non-Symmetric) Case
Eigenvalues may be complex, occurring in conjugate pairs $\alpha \pm \beta i$ since $A$ is real:
The diagonal matrix uses a real block structure:
- Real eigenvalue $\lambda$: a 1x1 block $[\lambda]$ on the diagonal
- Complex conjugate pair $\alpha \pm \beta i$: a 2x2 block $\begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}$
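That the 2x2 block encodes the conjugate pair can be checked via its characteristic polynomial. A plain-Python sketch with illustrative values $\alpha = 1$, $\beta = 2$:

```python
import cmath

# 2x2 real block representing the complex conjugate pair alpha ± beta*i
alpha, beta = 1.0, 2.0
B = [[alpha,  beta],
     [-beta, alpha]]

# Characteristic polynomial: lambda^2 - trace*lambda + det = 0
trace = B[0][0] + B[1][1]                       # 2*alpha
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]     # alpha^2 + beta^2
disc = cmath.sqrt(trace * trace - 4 * det)
eig1 = (trace + disc) / 2
eig2 = (trace - disc) / 2
```

The two roots come out as $\alpha + \beta i$ and $\alpha - \beta i$, recovering the conjugate pair from the real block.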
Sorting Eigenvalues
Eigenvalues are sorted by descending magnitude: $|\lambda_1| \ge |\lambda_2| \ge \dots \ge |\lambda_n|$,
where $|\lambda| = \sqrt{\alpha^2 + \beta^2}$ for $\lambda = \alpha + \beta i$. The corresponding eigenvectors are reordered accordingly.
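The sort-and-reorder step can be sketched in a few lines of plain Python; the eigenvalues are illustrative, and strings stand in for eigenvector columns:

```python
# Sort eigenvalues by descending magnitude |lambda| = sqrt(alpha^2 + beta^2),
# carrying the eigenvectors along so each pair stays matched.
eigvals = [complex(1, 2), complex(-4, 0), complex(0.5, 0)]
eigvecs = ["v1", "v2", "v3"]   # placeholders standing in for column vectors

order = sorted(range(len(eigvals)), key=lambda i: -abs(eigvals[i]))
sorted_vals = [eigvals[i] for i in order]
sorted_vecs = [eigvecs[i] for i in order]
```

Sorting by index rather than zipping values and vectors makes the reordering explicit: the same permutation is applied to both arrays.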
ARPACK: Large Sparse Eigenproblems
For matrices too large for full EVD, ARPACK computes a small number $k \ll n$ of eigenvalues using the Implicitly Restarted Arnoldi Method (IRAM):
For symmetric matrices (Lanczos variant):
- Builds a Krylov subspace $\mathcal{K}_m(A, v) = \operatorname{span}\{v, Av, A^2v, \dots, A^{m-1}v\}$
- Extracts eigenvalue approximations (Ritz values) from the tridiagonal projection
- Applies implicit QR shifts to restart with a refined subspace
Which eigenvalues to compute (SymmOption enum):
| Option | Description | Use Case |
|---|---|---|
| LA | Largest algebraic | PCA (largest positive eigenvalues) |
| SA | Smallest algebraic | Graph Laplacian (near-zero eigenvalues) |
| LM | Largest magnitude | Dominant modes, spectral radius |
| SM | Smallest magnitude | Near-singular directions |
| BE | Both ends | Spectral gap analysis |
Computational complexity: roughly $O(\text{ncv} \cdot \text{nnz}(A))$ work per restart cycle plus orthogonalization costs, where ncv is the number of Arnoldi vectors (typically a small multiple of $k$), compared to $O(n^3)$ for full EVD.
Key property: ARPACK only requires the action of $A$ on a vector (matrix-vector product), not the matrix entries themselves. This enables use with sparse matrices and implicit matrix representations.
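Power iteration illustrates this access pattern in its simplest form: like ARPACK, the routine below only ever calls a matrix-vector product, never touching matrix entries directly. This is a plain-Python sketch of the idea, not ARPACK itself, and the 2x2 matrix is illustrative:

```python
import math

def power_iteration(matvec, n, iters=200):
    """Estimate the dominant eigenpair using only matrix-vector products,
    the same access pattern ARPACK requires."""
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = matvec(v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        lam = sum(vi * wi for vi, wi in zip(v, matvec(v)))  # Rayleigh quotient
    return lam, v

# The solver never sees A's entries, only the closure applying it:
A = [[2.0, 1.0], [1.0, 2.0]]   # symmetric, eigenvalues 3 and 1
matvec = lambda x: [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
lam, v = power_iteration(matvec, 2)
```

The same closure could wrap a sparse matrix, an out-of-core matrix, or a product of operators, which is exactly why the matvec-only interface matters.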
ARPACK SVD
ARPACK can also compute the top-$k$ singular triples by solving the equivalent symmetric eigenproblem on $A^T A$ (via the AtA implicit matrix wrapper) and recovering $\sigma_i = \sqrt{\lambda_i}$, the right singular vectors $v_i$ as eigenvectors, and $u_i = A v_i / \sigma_i$.
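The reduction can be sketched with the same matvec-only pattern: apply $x \mapsto A^T (A x)$ without ever forming $A^T A$, find its dominant eigenvalue by power iteration, and recover the singular triple. A plain-Python illustration (the matrix is made up; Smile's AtA wrapper does the implicit product for ARPACK):

```python
import math

# A has singular values 3 and 1; only x -> Ax and y -> A^T y are applied.
A = [[3.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0]]

def matvec(x):
    """x -> A^T (A x), without materializing A^T A."""
    Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(3)]
    return [sum(A[i][j] * Ax[i] for i in range(3)) for j in range(2)]

v = [1.0, 1.0]
for _ in range(100):                      # power iteration on A^T A
    w = matvec(v)
    norm = math.sqrt(sum(x * x for x in w))
    v = [x / norm for x in w]

lam = sum(vi * wi for vi, wi in zip(v, matvec(v)))
sigma = math.sqrt(lam)                    # sigma_1 = sqrt(lambda_1)
u = [sum(A[i][j] * v[j] for j in range(2)) / sigma for i in range(3)]  # u = A v / sigma
```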
Relationship to the Pipeline
Construction --> Arithmetic --> Decomposition --> Solving --> Result Extraction
                                                                    ^
                                                                    |
                                                            (this principle)
Result extraction is the final stage that produces the quantities used by ML algorithms: eigenvalues for PCA, singular values for matrix rank, eigenvectors for spectral clustering, and so on.
Related
Knowledge Sources
Domains
Linear_Algebra, Numerical_Computing, Dimensionality_Reduction, Spectral_Analysis