Principle:Haifengl Smile Matrix Arithmetic

From Leeroopedia


Overview

Matrix Arithmetic encompasses the fundamental algebraic operations on matrices: multiplication (GEMM), matrix-vector products (GEMV), addition, subtraction, scaling, transposition, and rank-1 updates. These operations form the computational backbone of all higher-level linear algebra algorithms, including matrix decompositions, linear system solving, and iterative eigensolvers.

In Smile, matrix arithmetic is accelerated by delegating to native BLAS (Basic Linear Algebra Subprograms) routines via the Java Foreign Function & Memory API. BLAS organizes operations into three levels based on computational complexity and data access patterns, each offering progressively better cache utilization.

Theoretical Basis

BLAS Levels

Level | Operations | Complexity | Data Movement
Level 1 | Vector-vector: DOT, AXPY, SCAL | O(n) | Memory-bound
Level 2 | Matrix-vector: GEMV, symmetric MV | O(n²) | Memory-bound
Level 3 | Matrix-matrix: GEMM, symmetric MM | O(n³) | Compute-bound

General Matrix Multiply (GEMM)

GEMM is the single most important kernel in scientific computing and machine learning. It computes:

C = α·op(A)·op(B) + β·C

where op(X) is either X (no transpose) or Xᵀ (transpose), and α, β are scalar multipliers.

Key properties:

  • op(A) is m×k, op(B) is k×n, and C is m×n (dimensions after applying the transpositions)
  • When α = 1 and β = 0, this reduces to C = AB
  • When β = 1, the product accumulates into C: C += α·op(A)·op(B)
  • Computational complexity: O(mnk) in the general case, which is O(n³) for square matrices
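To make the update concrete, here is a minimal reference GEMM over row-major `double[][]` arrays. This is an illustrative sketch only (class and method names are hypothetical, not Smile's API); the native BLAS routines Smile delegates to add blocking, vectorization, and threading on top of this same arithmetic:

```java
// Reference GEMM: C = alpha * op(A) * op(B) + beta * C.
// transA/transB select op() for each operand; C must already be m x n.
public class Gemm {
    static double[][] gemm(boolean transA, boolean transB,
                           double alpha, double[][] A, double[][] B,
                           double beta, double[][] C) {
        int m = C.length, n = C[0].length;
        int k = transA ? A.length : A[0].length;   // inner dimension
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int p = 0; p < k; p++) {
                    double a = transA ? A[p][i] : A[i][p];
                    double b = transB ? B[j][p] : B[p][j];
                    sum += a * b;
                }
                C[i][j] = alpha * sum + beta * C[i][j];
            }
        }
        return C;
    }

    public static void main(String[] args) {
        double[][] A = {{1, 2}, {3, 4}};
        double[][] B = {{5, 6}, {7, 8}};
        double[][] C = {{0, 0}, {0, 0}};
        gemm(false, false, 1.0, A, B, 0.0, C);     // C = AB
        System.out.println(C[0][0] + " " + C[0][1]); // 19.0 22.0
        System.out.println(C[1][0] + " " + C[1][1]); // 43.0 50.0
    }
}
```

Note how β = 0 overwrites C while β = 1 accumulates into it, matching the two cases listed above.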

General Matrix-Vector Multiply (GEMV)

GEMV computes the product of a matrix and a vector:

y = α·op(A)·x + β·y

This is the fundamental operation in iterative methods (conjugate gradient, Lanczos, Arnoldi), where only the action of a matrix on a vector is needed.

Specialized variants:

  • SYMV: y = α·A·x + β·y where A is symmetric -- only half the matrix is accessed
  • TRMV: y = A·x where A is triangular -- fewer operations are needed
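The matrix-vector kernel can likewise be sketched as a double loop (again a hypothetical reference implementation, not Smile's native-backed one):

```java
// Reference GEMV: y = alpha * op(A) * x + beta * y, with op() chosen by trans.
public class Gemv {
    static double[] gemv(boolean trans, double alpha, double[][] A,
                         double[] x, double beta, double[] y) {
        int m = A.length, n = A[0].length;
        int rows = trans ? n : m;       // length of the result
        int cols = trans ? m : n;       // length of x
        for (int i = 0; i < rows; i++) {
            double sum = 0.0;
            for (int j = 0; j < cols; j++) {
                sum += (trans ? A[j][i] : A[i][j]) * x[j];
            }
            y[i] = alpha * sum + beta * y[i];
        }
        return y;
    }

    public static void main(String[] args) {
        double[][] A = {{1, 2}, {3, 4}};
        double[] y = gemv(false, 1.0, A, new double[]{1, 1}, 0.0, new double[2]);
        System.out.println(y[0] + " " + y[1]);   // 3.0 7.0  (y = A x)
    }
}
```

Iterative methods such as Lanczos or Arnoldi only ever call this kernel (and its transposed form), which is why they never need the matrix in factored form.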

Matrix Addition and Subtraction

Element-wise operations that combine matrices of identical dimensions:

C = α·A + β·B

Smile implements this via the BLAS Level 1 AXPY operation (y = α·x + y) applied to the linearized matrix arrays. For the common case A += B, this is equivalent to axpy(1.0, B, A).
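The trick of treating an m×n matrix as one length-m·n vector can be shown directly (a sketch under the assumption of contiguous column-major storage; the class name is illustrative):

```java
// Matrix addition as Level-1 AXPY over the flattened storage:
// A += alpha * B, treating both m*n matrices as length-(m*n) vectors.
public class MatrixAxpy {
    static void axpy(double alpha, double[] x, double[] y) {
        for (int i = 0; i < x.length; i++) y[i] += alpha * x[i];
    }

    public static void main(String[] args) {
        // 2x2 matrices stored column-major as flat arrays.
        double[] A = {1, 3, 2, 4};   // [[1,2],[3,4]]
        double[] B = {5, 7, 6, 8};   // [[5,6],[7,8]]
        axpy(1.0, B, A);             // A += B
        System.out.println(java.util.Arrays.toString(A)); // [6.0, 10.0, 8.0, 12.0]
    }
}
```

Because element-wise addition ignores the 2-D shape entirely, a single vector kernel suffices regardless of the matrix dimensions.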

Transposition

The transpose Aᵀ of an m×n matrix A is the n×m matrix B where bᵢⱼ = aⱼᵢ.

Key identities:

  • (AB)ᵀ = BᵀAᵀ
  • (Aᵀ)ᵀ = A
  • (A + B)ᵀ = Aᵀ + Bᵀ

In Smile, transpose() creates a new matrix with copied data (not a transposed view), ensuring column-major layout for the result.
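A copy-based transpose that preserves column-major layout can be sketched as follows (an illustrative stand-alone version, not Smile's transpose() itself):

```java
// Out-of-place transpose of an m x n column-major matrix into an
// n x m column-major result (a copy, not a transposed view).
public class Transpose {
    static double[] transpose(double[] a, int m, int n) {
        double[] b = new double[n * m];
        for (int j = 0; j < n; j++)
            for (int i = 0; i < m; i++)
                b[j + i * n] = a[i + j * m];   // b[j][i] = a[i][j]
        return b;
    }

    public static void main(String[] args) {
        // 2x3 matrix [[1,2,3],[4,5,6]] in column-major order.
        double[] a = {1, 4, 2, 5, 3, 6};
        double[] b = transpose(a, 2, 3);       // 3x2: [[1,4],[2,5],[3,6]]
        System.out.println(java.util.Arrays.toString(b)); // [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    }
}
```

The copy costs O(mn) time and memory, but the result can then be fed to BLAS routines that expect contiguous column-major storage without any stride tricks.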

Rank-1 Update (GER)

The rank-1 update operation:

A = A + α·x·yᵀ

where x ∈ ℝᵐ and y ∈ ℝⁿ. This is the outer product of x and y, scaled by α and added to A. It is fundamental to algorithms such as the Sherman-Morrison formula and online covariance matrix updates.
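The outer-product update is a simple double loop (a reference sketch with an illustrative class name; the BLAS GER routine performs the same arithmetic over column-major storage):

```java
// Rank-1 update (GER): A += alpha * x * y^T for a dense row-major matrix,
// where x has length m and y has length n.
public class Ger {
    static void ger(double alpha, double[] x, double[] y, double[][] A) {
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < y.length; j++)
                A[i][j] += alpha * x[i] * y[j];
    }

    public static void main(String[] args) {
        double[][] A = {{0, 0}, {0, 0}};
        double[] x = {1, 2};
        double[] y = {3, 4};
        ger(1.0, x, y, A);           // A = x y^T = [[3,4],[6,8]]
        System.out.println(A[0][0] + " " + A[1][1]); // 3.0 8.0
    }
}
```

Each update touches all mn entries once, so a rank-1 refresh of a covariance matrix costs O(mn) rather than the O(mnk) of recomputing a full product.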

Symmetric Matrix Optimization

When a matrix is flagged as symmetric (via UPLO), Smile automatically dispatches to optimized BLAS routines:

General Routine | Symmetric Variant | Speedup Factor
cblas_dgemm | cblas_dsymm | ~2x (reads half the data)
cblas_dgemv | cblas_dsymv | ~2x

Mathematical Formulations

Common Operations in Smile

Method | Mathematical Operation | BLAS Routine
A.mm(B) | C = AB | GEMM with α = 1, β = 0
A.tm(B) | C = AᵀB | GEMM with transA
A.mt(B) | C = ABᵀ | GEMM with transB
A.ata() | C = AᵀA | GEMM (result is symmetric)
A.aat() | C = AAᵀ | GEMM (result is symmetric)
A.mv(x) | y = Ax | GEMV
A.tv(x) | y = Aᵀx | GEMV with transpose
A.add(B) | A += B | AXPY with α = 1
A.sub(B) | A -= B | AXPY with α = -1
A.scale(alpha) | A = αA | SCAL
A.ger(alpha, x, y) | A += αxyᵀ | GER

Performance Characteristics

GEMM Performance Model

For an m×k by k×n multiplication:

  • Floating-point operations: 2mnk
  • Memory accesses: O(mn + mk + kn) elements
  • Arithmetic intensity: 2mnk / ((mn + mk + kn)·s) FLOPs per byte, where s is the element size in bytes

For large square matrices, GEMM achieves near-peak floating-point throughput: the arithmetic intensity grows linearly with matrix size, and blocking strategies keep each working set cache-resident while the computation proceeds.
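Plugging square dimensions into the intensity formula above makes the growth visible (a small calculator sketch; the class name is illustrative, and s = 8 bytes assumes double precision):

```java
// Arithmetic intensity of GEMM: 2mnk FLOPs over (mn + mk + kn) elements moved.
// For square n x n matrices this is 2n^3 / (3n^2 * s) = n / (1.5 * s) FLOPs/byte,
// i.e., it grows linearly with n.
public class Intensity {
    static double intensity(long m, long n, long k, int bytesPerElem) {
        double flops = 2.0 * m * n * k;
        double bytes = (double) (m * n + m * k + k * n) * bytesPerElem;
        return flops / bytes;
    }

    public static void main(String[] args) {
        for (int n : new int[]{128, 1024, 8192}) {
            System.out.printf("n=%d: %.1f FLOPs/byte%n",
                              n, intensity(n, n, n, 8));
        }
    }
}
```

Once the intensity exceeds the machine balance (peak FLOPs divided by memory bandwidth), GEMM stops being memory-bound, which is exactly why Level 3 kernels dominate high-performance code.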

Relationship to the Pipeline

Matrix arithmetic is the second stage of the Matrix_Decomposition_Pipeline, sitting between construction and decomposition:

Construction --> Arithmetic --> Decomposition --> Solving --> Result Extraction

Decomposition algorithms (LU, QR, SVD, EVD) internally use GEMM, GEMV, and other arithmetic primitives. ARPACK's Implicitly Restarted Arnoldi Method relies exclusively on matrix-vector products (mv/tv).

Related

Knowledge Sources

Domains

Linear_Algebra, Numerical_Computing, High_Performance_Computing
