Workflow:Haifengl Smile Matrix Decomposition Pipeline
| Knowledge Sources | |
|---|---|
| Domains | Linear_Algebra, Numerical_Computing |
| Last Updated | 2026-02-08 21:00 GMT |
Overview
End-to-end process for creating dense or sparse matrices, performing matrix decompositions (LU, QR, SVD, EVD, Cholesky), and solving linear systems using Smile's native BLAS/LAPACK-backed tensor module.
Description
This workflow covers numerical linear algebra in Smile using the smile.tensor package. It starts by constructing matrices from data (arrays, DataFrames, or programmatic generation) and then applies standard matrix decompositions. These decompositions are building blocks for many machine learning algorithms, including PCA, least squares regression, and eigenvalue analysis. The tensor module uses the Java Foreign Function & Memory API to call native BLAS and LAPACK routines for high-performance computation.
Usage
Execute this workflow when you need to perform linear algebra operations such as solving linear systems (Ax = b), computing eigenvalues and eigenvectors, performing dimensionality reduction via SVD, or any operation requiring matrix factorization. This is common in regression, PCA, signal processing, and numerical optimization tasks.
Execution Steps
Step 1: Construct the Matrix
Create a DenseMatrix from raw data. Matrices can be constructed from 2D double arrays, from DataFrame columns, or using factory methods for special matrices (identity, diagonal, random). For sparse data, use SparseMatrix in CSC (Compressed Sparse Column) format. Both float32 and float64 precision are supported.
Key considerations:
- DenseMatrix64 for double precision, DenseMatrix32 for single precision
- Matrix storage uses column-major order (Fortran convention) for BLAS compatibility
- Memory is allocated through the Java Foreign Function & Memory API for direct native access
- SymmMatrix provides optimized storage for symmetric matrices
- BandMatrix provides optimized storage for banded matrices
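To make the column-major convention concrete, here is a minimal plain-Java sketch. It does not use Smile's API: the class `ColumnMajorSketch` and its methods are hypothetical names chosen for illustration. It only shows how element (i, j) of an m-row matrix maps to offset `j * m + i` in a flat buffer, which is the layout BLAS and LAPACK expect.

```java
// Plain-Java illustration of column-major storage; this is NOT Smile's API.
final class ColumnMajorSketch {
    /** Flattens a row-major 2D array into a column-major 1D array. */
    static double[] toColumnMajor(double[][] a) {
        int m = a.length, n = a[0].length;
        double[] data = new double[m * n];
        for (int j = 0; j < n; j++) {
            for (int i = 0; i < m; i++) {
                data[j * m + i] = a[i][j]; // the leading dimension is m
            }
        }
        return data;
    }

    /** Reads element (i, j) from a column-major buffer with m rows. */
    static double get(double[] data, int m, int i, int j) {
        return data[j * m + i];
    }
}
```

For the 2-by-3 matrix {{1, 2, 3}, {4, 5, 6}}, `toColumnMajor` returns the buffer [1, 4, 2, 5, 3, 6]: each column is stored contiguously, one after another.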
Step 2: Perform Matrix Operations
Execute basic matrix arithmetic, including addition, subtraction, multiplication, transposition, and element-wise operations. Matrix-matrix and matrix-vector products use native BLAS routines (dgemm and dgemv, respectively) for performance.
Key considerations:
- Matrix multiplication uses native BLAS dgemm/sgemm for optimal performance
- Transpose operations support both lazy (view) and eager (copy) modes
- Element-wise operations are available for add, sub, mul, div
- The mm() method performs matrix-matrix multiplication
- The mv() method performs matrix-vector multiplication
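As a reference for what the two products compute, the following plain-Java sketch implements C = A * B and y = A * x with naive loops. It is not BLAS and not Smile's implementation. The class `ProductSketch` is hypothetical, and its `mm` and `mv` helpers merely borrow the method names mentioned above. A native dgemm or dgemv call returns the same values (up to rounding) far faster on large inputs.

```java
// Naive reference products in plain Java; NOT Smile's API and NOT BLAS.
final class ProductSketch {
    /** Returns C = A * B for an m-by-k matrix A and a k-by-n matrix B. */
    static double[][] mm(double[][] a, double[][] b) {
        int m = a.length, k = b.length, n = b[0].length;
        if (a[0].length != k) throw new IllegalArgumentException("inner dimensions differ");
        double[][] c = new double[m][n];
        for (int i = 0; i < m; i++) {
            for (int p = 0; p < k; p++) {
                double aip = a[i][p];
                for (int j = 0; j < n; j++) c[i][j] += aip * b[p][j];
            }
        }
        return c;
    }

    /** Returns y = A * x for an m-by-n matrix A and a length-n vector x. */
    static double[] mv(double[][] a, double[] x) {
        int m = a.length, n = x.length;
        if (a[0].length != n) throw new IllegalArgumentException("dimension mismatch");
        double[] y = new double[m];
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++) sum += a[i][j] * x[j];
            y[i] = sum;
        }
        return y;
    }
}
```

A sketch like this is mainly useful as a correctness check: comparing its output with the library result on small matrices catches dimension and transposition mistakes early.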
Step 3: Apply Matrix Decomposition
Choose and apply the matrix decomposition that matches the problem structure. Available decompositions are LU (general square matrices), QR (rectangular matrices), SVD (any matrix), EVD (eigenvalue decomposition of square matrices), and Cholesky (symmetric positive-definite matrices). Each decomposition is computed via native LAPACK routines.
Key considerations:
- LU decomposition: for solving general linear systems and computing determinants
- QR decomposition: for least squares problems and orthogonalization
- SVD: for computing singular values, pseudoinverse, and low-rank approximations
- EVD: for eigenvalue analysis of square matrices
- Cholesky: for symmetric positive-definite systems (the fastest option, at roughly half the floating-point cost of LU)
- ARPACK can be used for computing a few eigenvalues of large sparse matrices
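To illustrate what a decomposition produces, here is a minimal plain-Java sketch of the Cholesky factorization A = L * L^T. It is a textbook column-by-column algorithm, not Smile's LAPACK-backed routine, and the class `CholeskySketch` is a hypothetical name. It assumes the input is symmetric and reads only its lower triangle.

```java
// Textbook Cholesky factorization in plain Java; NOT Smile's API.
final class CholeskySketch {
    /**
     * Returns the lower-triangular L with A = L * L^T.
     * Assumes A is symmetric (only the lower triangle is read) and
     * throws if A turns out not to be positive definite.
     */
    static double[][] cholesky(double[][] a) {
        int n = a.length;
        double[][] l = new double[n][n];
        for (int j = 0; j < n; j++) {
            // Diagonal entry: a[j][j] minus the squares already placed in row j.
            double d = a[j][j];
            for (int k = 0; k < j; k++) d -= l[j][k] * l[j][k];
            if (d <= 0.0) throw new ArithmeticException("matrix is not positive definite");
            l[j][j] = Math.sqrt(d);
            // Entries below the diagonal in column j.
            for (int i = j + 1; i < n; i++) {
                double s = a[i][j];
                for (int k = 0; k < j; k++) s -= l[i][k] * l[j][k];
                l[i][j] = s / l[j][j];
            }
        }
        return l;
    }
}
```

The failure path doubles as a cheap positive-definiteness test: if the factorization throws, fall back to LU for a square system or to QR or SVD for a least squares problem.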
Step 4: Solve the Linear System
Use the decomposition results to solve linear systems, compute inverses, or extract structural information. Each decomposition object provides a solve() method that efficiently solves Ax = b using the pre-computed factors.
Key considerations:
- LU.solve() solves general linear systems via forward/back substitution
- QR.solve() solves least squares problems
- Cholesky.solve() is the most efficient option when A is symmetric positive definite
- SVD provides rank, condition number, and pseudoinverse
- EVD provides eigenvalues and eigenvectors for spectral analysis
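The idea of factoring once and then solving by forward and back substitution can be sketched in plain Java as follows. This is a textbook LU factorization with partial pivoting, not Smile's LU class: the name `LuSketch`, its constructor, and its `solve` method are hypothetical and only mirror the pattern described above.

```java
// Textbook LU with partial pivoting in plain Java; NOT Smile's API.
final class LuSketch {
    private final double[][] lu; // unit-lower L below the diagonal, U on and above it
    private final int[] piv;     // piv[i] = index of the original row now stored in row i

    /** Factors a copy of the square matrix A so that P * A = L * U. */
    LuSketch(double[][] a) {
        int n = a.length;
        lu = new double[n][];
        piv = new int[n];
        for (int i = 0; i < n; i++) { lu[i] = a[i].clone(); piv[i] = i; }
        for (int k = 0; k < n; k++) {
            // Pick the largest pivot in column k for numerical stability.
            int p = k;
            for (int i = k + 1; i < n; i++)
                if (Math.abs(lu[i][k]) > Math.abs(lu[p][k])) p = i;
            if (lu[p][k] == 0.0) throw new ArithmeticException("matrix is singular");
            double[] row = lu[k]; lu[k] = lu[p]; lu[p] = row;
            int t = piv[k]; piv[k] = piv[p]; piv[p] = t;
            // Eliminate below the pivot, storing the multipliers in place.
            for (int i = k + 1; i < n; i++) {
                lu[i][k] /= lu[k][k];
                for (int j = k + 1; j < n; j++) lu[i][j] -= lu[i][k] * lu[k][j];
            }
        }
    }

    /** Solves A x = b using the stored factors; b is left unmodified. */
    double[] solve(double[] b) {
        int n = lu.length;
        double[] x = new double[n];
        for (int i = 0; i < n; i++) x[i] = b[piv[i]];           // apply the row permutation
        for (int i = 1; i < n; i++)                             // forward substitution: L y = P b
            for (int j = 0; j < i; j++) x[i] -= lu[i][j] * x[j];
        for (int i = n - 1; i >= 0; i--) {                      // back substitution: U x = y
            for (int j = i + 1; j < n; j++) x[i] -= lu[i][j] * x[j];
            x[i] /= lu[i][i];
        }
        return x;
    }
}
```

Separating the factorization from the solve is the point of the design: the cubic-cost factorization runs once, and each additional right-hand side costs only the two quadratic-cost substitution passes.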
Step 5: Extract and Use Results
Extract the computed results (eigenvalues, singular values, factors) for downstream analysis. Results can be converted back to arrays or used directly in machine learning algorithms like PCA, spectral clustering, or regression.
Key considerations:
- SVD.U(), SVD.V(), and SVD.s() provide the factor matrices and singular values
- EVD.eigenvalues() and EVD.eigenvectors() provide spectral decomposition results
- Decomposition results are used internally by PCA, MDS, spectral methods
- For large sparse matrices, use ARPACK for iterative eigenvalue computation
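Once the singular values are available as a plain array, several downstream quantities follow directly. The sketch below is plain Java and does not call Smile: the class `SingularValueSketch` and its methods are hypothetical. The rank tolerance max(m, n) * eps * s_max is a common convention and is assumed here, so it may differ from the threshold Smile applies. The explained-variance helper assumes the singular values come from a mean-centered data matrix, as in PCA.

```java
// Post-processing of singular values in plain Java; NOT Smile's API.
final class SingularValueSketch {
    /** Numerical rank of an m-by-n matrix, using the tolerance max(m, n) * eps * s_max. */
    static int rank(double[] s, int m, int n) {
        double smax = 0.0;
        for (double v : s) smax = Math.max(smax, v);
        double tol = Math.max(m, n) * Math.ulp(1.0) * smax; // Math.ulp(1.0) is machine epsilon
        int r = 0;
        for (double v : s) if (v > tol) r++;
        return r;
    }

    /** 2-norm condition number s_max / s_min; infinite if a nonzero matrix has a zero singular value. */
    static double conditionNumber(double[] s) {
        double max = 0.0, min = Double.POSITIVE_INFINITY;
        for (double v : s) { max = Math.max(max, v); min = Math.min(min, v); }
        return max / min;
    }

    /** Fraction of total variance per component: s_i^2 divided by the sum of all s_j^2. */
    static double[] explainedVariance(double[] s) {
        double total = 0.0;
        for (double v : s) total += v * v;
        double[] ratio = new double[s.length];
        for (int i = 0; i < s.length; i++) ratio[i] = s[i] * s[i] / total;
        return ratio;
    }
}
```

For example, singular values {3, 4} give explained-variance fractions 0.36 and 0.64, which is the kind of summary typically used to choose how many principal components to keep.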