Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Principle:Rapidsai Cuml Incremental Learning

From Leeroopedia


Knowledge Sources
Domains Machine_Learning, Dimensionality_Reduction, Online_Learning
Last Updated 2026-02-08 12:00 GMT

Overview

Incremental learning processes data in sequential mini-batches to fit models on datasets that exceed available memory, maintaining constant memory complexity while progressively updating model parameters with each new batch.

Description

Standard batch learning algorithms require the entire dataset to be loaded into memory simultaneously. For large-scale datasets, this can be prohibitive, especially on GPU where memory is more constrained than on CPU. Incremental (or online) learning addresses this limitation by processing data in manageable chunks, updating the model iteratively.

Incremental PCA (IPCA): Incremental Principal Component Analysis is the primary application of this principle. Standard PCA computes the singular value decomposition (SVD) of the entire data matrix, which requires O(nd2) memory and computation. Incremental PCA instead processes the data in batches of size b, maintaining a running estimate of the principal components. For each batch:

  1. The batch is centered using the running mean estimate.
  2. The current components and the new batch are combined into a matrix of size (k+b)×d, where k is the number of components.
  3. An SVD is performed on this combined matrix to extract the updated components.
  4. The mean, variance, and singular values are updated incrementally.

This achieves O(bd) memory complexity per batch, independent of the total dataset size. Sparse input matrices are handled by converting to dense format batch-by-batch, avoiding materialization of the full dense matrix.

The algorithm tracks cumulative statistics including the explained variance, explained variance ratio, noise variance (for feature dimensions beyond the retained components), and the number of samples seen. The batch size can be specified explicitly or defaults to 5×d (number of features), balancing approximation quality against memory usage.

Usage

Incremental learning is the right choice when:

  • The dataset is too large to fit in GPU memory for standard PCA.
  • Data arrives in a streaming fashion and the model needs to be updated as new data becomes available.
  • Memory-mapped files or external storage are used and only portions of the data can be loaded at a time.
  • An approximate decomposition is acceptable (IPCA converges to batch PCA given enough data).
  • Sparse input data must be processed without converting the entire dataset to dense form.

Theoretical Basis

Incremental SVD Update:

Given current components V_k (k x d), singular values S_k, mean mu, count n_seen:

For each batch X_batch (b x d):
    1. Update running mean: mu_new = weighted_mean(mu, mean(X_batch))
    2. Center batch: X_c = X_batch - mu_new
    3. Form combined matrix:
       M = [ S_k * V_k ; X_c ]    (size: (k + b) x d)
    4. Compute SVD: M = U * S_new * V_new^T
    5. Retain top k components: V_k = V_new[:k], S_k = S_new[:k]
    6. Update n_seen += b

Memory Complexity:

O(bd+kd) per batch, where b is the batch size, d is the feature dimension, and k is the number of retained components.

Explained Variance Ratio:

Failed to parse (syntax error): {\displaystyle \text{explained\_variance\_ratio}_i = \frac{s_i^2 / (n - 1)}{\sum_{j=1}^{d} \text{var}_j}}

where si is the i-th singular value and varj is the variance of the j-th feature.

Related Pages

Implemented By

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment