Principle:Haifengl Smile Matrix Arithmetic
Overview
Matrix Arithmetic encompasses the fundamental algebraic operations on matrices: multiplication (GEMM), matrix-vector products (GEMV), addition, subtraction, scaling, transposition, and rank-1 updates. These operations form the computational backbone of all higher-level linear algebra algorithms, including matrix decompositions, linear system solving, and iterative eigensolvers.
In Smile, matrix arithmetic is accelerated by delegating to native BLAS (Basic Linear Algebra Subprograms) routines via the Java Foreign Function & Memory API. BLAS organizes operations into three levels based on computational complexity and data access patterns, each offering progressively better cache utilization.
Theoretical Basis
BLAS Levels
| Level | Operations | Complexity | Data Movement |
|---|---|---|---|
| Level 1 | Vector-vector: dot, axpy, scale | $O(n)$ flops on $O(n)$ data | Memory-bound |
| Level 2 | Matrix-vector: GEMV, symmetric MV | $O(n^2)$ flops on $O(n^2)$ data | Memory-bound |
| Level 3 | Matrix-matrix: GEMM, symmetric MM | $O(n^3)$ flops on $O(n^2)$ data | Compute-bound |
General Matrix Multiply (GEMM)
GEMM is the single most important kernel in scientific computing and machine learning. It computes:

$$C \leftarrow \alpha \, \mathrm{op}(A) \, \mathrm{op}(B) + \beta C$$

where $\mathrm{op}(X)$ is either $X$ (no transpose) or $X^T$ (transpose), and $\alpha, \beta$ are scalar multipliers.
Key properties:
- $\mathrm{op}(A)$ is $m \times k$, $\mathrm{op}(B)$ is $k \times n$, and $C$ is $m \times n$ (after applying transpositions)
- When $\alpha = 1$ and $\beta = 0$, this reduces to $C = AB$
- When $\beta = 1$, this accumulates into $C$: $C \leftarrow C + \alpha \, \mathrm{op}(A) \, \mathrm{op}(B)$
- Computational complexity: $O(mnk)$ for the general case, which is $O(n^3)$ for square matrices
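As a reference for the semantics above, here is a minimal triple-loop sketch of $C \leftarrow \alpha AB + \beta C$ (without the transpose options, and nothing like the blocked native BLAS kernel Smile actually calls):

```java
public final class NaiveGemm {
    /** C <- alpha * A * B + beta * C, with A (m x k), B (k x n), C (m x n). */
    public static void gemm(double alpha, double[][] a, double[][] b,
                            double beta, double[][] c) {
        int m = a.length, k = b.length, n = b[0].length;
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int p = 0; p < k; p++) {
                    sum += a[i][p] * b[p][j];
                }
                // beta = 0 overwrites C; beta = 1 accumulates into C
                c[i][j] = alpha * sum + beta * c[i][j];
            }
        }
    }
}
```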
General Matrix-Vector Multiply (GEMV)
GEMV computes the product of a matrix and a vector:

$$y \leftarrow \alpha \, \mathrm{op}(A) \, x + \beta y$$
This is the fundamental operation in iterative methods (conjugate gradient, Lanczos, Arnoldi), where only the action of a matrix on a vector is needed.
Specialized variants:
- SYMV: $y \leftarrow \alpha A x + \beta y$ where $A$ is symmetric -- only half the matrix is accessed
- TRMV: $x \leftarrow A x$ where $A$ is triangular -- fewer operations needed
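The dense GEMV kernel reduces to one dot product per row; a minimal sketch (again a reference loop, not the native routine):

```java
public final class NaiveGemv {
    /** y <- alpha * A * x + beta * y, with A (m x n), x length n, y length m. */
    public static void gemv(double alpha, double[][] a, double[] x,
                            double beta, double[] y) {
        for (int i = 0; i < a.length; i++) {
            double sum = 0.0;               // dot product of row i with x
            for (int j = 0; j < x.length; j++) {
                sum += a[i][j] * x[j];
            }
            y[i] = alpha * sum + beta * y[i];
        }
    }
}
```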
Matrix Addition and Subtraction
Element-wise operations that combine matrices of identical dimensions:

$$C = A \pm B, \qquad C_{ij} = A_{ij} \pm B_{ij}$$

Smile implements this via the BLAS Level 1 AXPY operation ($y \leftarrow \alpha x + y$) applied to the linearized matrix arrays. For the common case $\alpha = 1$, this is equivalent to $y \leftarrow x + y$; subtraction uses $\alpha = -1$.
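A minimal sketch of AXPY over flat arrays, which is all that matrix addition and subtraction need once the matrices are linearized:

```java
public final class Axpy {
    /** y <- alpha * x + y over flat arrays: alpha = 1 gives addition,
        alpha = -1 gives subtraction of the linearized matrices. */
    public static void axpy(double alpha, double[] x, double[] y) {
        for (int i = 0; i < x.length; i++) {
            y[i] += alpha * x[i];
        }
    }
}
```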
Transposition
The transpose of an $m \times n$ matrix $A$ is the $n \times m$ matrix $A^T$ where $(A^T)_{ij} = A_{ji}$.

Key identities:

- $(A^T)^T = A$
- $(A + B)^T = A^T + B^T$
- $(AB)^T = B^T A^T$
- $(\alpha A)^T = \alpha A^T$
In Smile, transpose() creates a new matrix with copied data (not a transposed view), ensuring column-major layout for the result.
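An explicit-copy transpose, mirroring the copy (rather than view) behavior described above, can be sketched as:

```java
public final class Transpose {
    /** Returns a new n x m matrix T with T[j][i] = A[i][j]. */
    public static double[][] transpose(double[][] a) {
        int m = a.length, n = a[0].length;
        double[][] t = new double[n][m];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                t[j][i] = a[i][j];
            }
        }
        return t;
    }
}
```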
Rank-1 Update (GER)
The rank-1 update operation:

$$A \leftarrow \alpha \, x y^T + A$$

where $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$. This is the outer product $x y^T$ scaled by $\alpha$, added to $A$. It is fundamental to algorithms like the Sherman-Morrison formula and online covariance matrix updates.
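The scaled outer-product accumulation is two nested loops; a minimal reference sketch:

```java
public final class Ger {
    /** A <- alpha * x * y^T + A, with x length m, y length n, A (m x n). */
    public static void ger(double alpha, double[] x, double[] y, double[][] a) {
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < y.length; j++) {
                a[i][j] += alpha * x[i] * y[j];   // outer product entry (i, j)
            }
        }
    }
}
```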
Symmetric Matrix Optimization
When a matrix is flagged as symmetric (via UPLO), Smile automatically dispatches to optimized BLAS routines:
| General Routine | Symmetric Variant | Speedup Factor |
|---|---|---|
| `cblas_dgemm` | `cblas_dsymm` | ~2x (reads half the data) |
| `cblas_dgemv` | `cblas_dsymv` | ~2x |
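The ~2x figure comes from touching only one triangle of the matrix. A minimal sketch (not Smile's native dispatch) of a symmetric matrix-vector product that reads only the upper triangle:

```java
public final class Symv {
    /** y <- A * x for symmetric A, reading only entries with i <= j. */
    public static void symv(double[][] a, double[] x, double[] y) {
        int n = x.length;
        for (int i = 0; i < n; i++) {
            y[i] = a[i][i] * x[i];            // diagonal contribution
        }
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                y[i] += a[i][j] * x[j];       // upper element used for row i
                y[j] += a[i][j] * x[i];       // and, by symmetry, for row j
            }
        }
    }
}
```

The lower triangle `a[j][i]` is never accessed, which is exactly why packed or UPLO-flagged storage suffices.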
Mathematical Formulations
Common Operations in Smile
| Method | Mathematical Operation | BLAS Routine |
|---|---|---|
| `A.mm(B)` | $C = AB$ | GEMM (no transpose) |
| `A.tm(B)` | $C = A^T B$ | GEMM with transA |
| `A.mt(B)` | $C = A B^T$ | GEMM with transB |
| `A.ata()` | $A^T A$ | GEMM (result is symmetric) |
| `A.aat()` | $A A^T$ | GEMM (result is symmetric) |
| `A.mv(x)` | $y = Ax$ | GEMV |
| `A.tv(x)` | $y = A^T x$ | GEMV with transpose |
| `A.add(B)` | $A + B$ | AXPY with $\alpha = 1$ |
| `A.sub(B)` | $A - B$ | AXPY with $\alpha = -1$ |
| `A.scale(alpha)` | $\alpha A$ | SCAL |
| `A.ger(alpha, x, y)` | $A \leftarrow \alpha x y^T + A$ | GER |
Performance Characteristics
GEMM Performance Model
For an $m \times k$ by $k \times n$ multiplication:
- Floating point operations: $2mnk$
- Memory accesses: at least $mk + kn + mn$ elements
- Arithmetic intensity: $\dfrac{2mnk}{s(mk + kn + mn)}$ flops per byte, where $s$ is the element size in bytes
For large square matrices, GEMM achieves near-peak floating point throughput: the arithmetic intensity grows with matrix size, so blocking strategies can keep the working set cache-resident while arithmetic dominates the runtime.
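To make the scaling concrete, here is the model's arithmetic intensity evaluated for square double-precision matrices ($m = k = n$, element size $s = 8$ bytes), under the minimum-traffic assumption above:

```java
public final class ArithmeticIntensity {
    /** Flops per byte for square n x n GEMM in doubles:
        2n^3 / (8 * 3n^2) = n / 12, i.e. linear in n. */
    public static double intensity(int n) {
        double flops = 2.0 * n * n * n;          // 2mnk with m = k = n
        double bytes = 8.0 * 3.0 * n * n;        // A, B, C touched once each
        return flops / bytes;
    }
}
```

At n = 12 the kernel does only 1 flop per byte moved; at n = 1200 it does 100, which is why large GEMM saturates the FPUs instead of the memory bus.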
Relationship to the Pipeline
Matrix arithmetic is the second stage of the Matrix_Decomposition_Pipeline, sitting between construction and decomposition:
Construction --> Arithmetic --> Decomposition --> Solving --> Result Extraction
Decomposition algorithms (LU, QR, SVD, EVD) internally use GEMM, GEMV, and other arithmetic primitives. ARPACK's Implicitly Restarted Arnoldi Method relies exclusively on matrix-vector products (mv/tv).
Related
- Implementation:Haifengl_Smile_DenseMatrix_Operations
- Principle:Haifengl_Smile_Matrix_Construction
- Principle:Haifengl_Smile_Matrix_Decomposition
Knowledge Sources
Domains
Linear_Algebra, Numerical_Computing, High_Performance_Computing