Principle:Rapidsai Cuml Incremental Learning
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Dimensionality_Reduction, Online_Learning |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
Incremental learning processes data in sequential mini-batches to fit models on datasets that exceed available memory, maintaining constant memory complexity while progressively updating model parameters with each new batch.
Description
Standard batch learning algorithms require the entire dataset to be loaded into memory simultaneously. For large-scale datasets, this can be prohibitive, especially on GPU where memory is more constrained than on CPU. Incremental (or online) learning addresses this limitation by processing data in manageable chunks, updating the model iteratively.
Incremental PCA (IPCA): Incremental Principal Component Analysis is the primary application of this principle. Standard PCA computes the singular value decomposition (SVD) of the entire data matrix, which requires memory and computation. Incremental PCA instead processes the data in batches of size , maintaining a running estimate of the principal components. For each batch:
- The batch is centered using the running mean estimate.
- The current components and the new batch are combined into a matrix of size , where is the number of components.
- An SVD is performed on this combined matrix to extract the updated components.
- The mean, variance, and singular values are updated incrementally.
This achieves memory complexity per batch, independent of the total dataset size. Sparse input matrices are handled by converting to dense format batch-by-batch, avoiding materialization of the full dense matrix.
The algorithm tracks cumulative statistics including the explained variance, explained variance ratio, noise variance (for feature dimensions beyond the retained components), and the number of samples seen. The batch size can be specified explicitly or defaults to (number of features), balancing approximation quality against memory usage.
Usage
Incremental learning is the right choice when:
- The dataset is too large to fit in GPU memory for standard PCA.
- Data arrives in a streaming fashion and the model needs to be updated as new data becomes available.
- Memory-mapped files or external storage are used and only portions of the data can be loaded at a time.
- An approximate decomposition is acceptable (IPCA converges to batch PCA given enough data).
- Sparse input data must be processed without converting the entire dataset to dense form.
Theoretical Basis
Incremental SVD Update:
Given current components V_k (k x d), singular values S_k, mean mu, count n_seen:
For each batch X_batch (b x d):
1. Update running mean: mu_new = weighted_mean(mu, mean(X_batch))
2. Center batch: X_c = X_batch - mu_new
3. Form combined matrix:
M = [ S_k * V_k ; X_c ] (size: (k + b) x d)
4. Compute SVD: M = U * S_new * V_new^T
5. Retain top k components: V_k = V_new[:k], S_k = S_new[:k]
6. Update n_seen += b
Memory Complexity:
per batch, where is the batch size, is the feature dimension, and is the number of retained components.
Explained Variance Ratio:
Failed to parse (syntax error): {\displaystyle \text{explained\_variance\_ratio}_i = \frac{s_i^2 / (n - 1)}{\sum_{j=1}^{d} \text{var}_j}}
where is the i-th singular value and is the variance of the j-th feature.