
Principle: Recommenders ALS Matrix Factorization

From Leeroopedia


Knowledge Sources
Domains Matrix Factorization, Recommendation Systems, Distributed Computing
Last Updated 2026-02-10 00:00 GMT

Overview

Alternating Least Squares (ALS) decomposes a user-item rating matrix into two lower-rank factor matrices by alternating between fixing one and solving for the other, enabling scalable collaborative filtering on distributed clusters.

Description

ALS is a matrix factorization algorithm designed for large-scale collaborative filtering. Given a sparse user-item rating matrix R of shape (m x n), ALS finds two dense factor matrices U (m x k) and V (n x k) such that R is approximately equal to U x V^T. The key insight is that while jointly optimizing both U and V is non-convex, fixing one and solving for the other reduces to a set of independent least squares problems that can be solved in closed form.
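The shape relationships above can be checked concretely. The following minimal sketch (with arbitrary illustrative values of m, n, and k) shows that the product of the two factor matrices reconstructs a full m x n rating matrix, and that a single predicted rating is the dot product of one user row and one item row:

```python
import numpy as np

# Toy illustration of the factorization shapes: m users, n items, rank k.
# All concrete sizes here are arbitrary assumptions for the example.
m, n, k = 4, 5, 2
rng = np.random.default_rng(0)
U = rng.normal(size=(m, k))   # user factor matrix, (m x k)
V = rng.normal(size=(n, k))   # item factor matrix, (n x k)

R_hat = U @ V.T               # reconstructed rating matrix, (m x n)
assert R_hat.shape == (m, n)

# One predicted rating = dot product of a user vector and an item vector.
pred = U[1] @ V[3]
assert np.isclose(pred, R_hat[1, 3])
```

Note that U and V are dense even though the observed R is sparse; the product U V^T fills in predictions for every unobserved (user, item) pair.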

The algorithm alternates between two steps:

  1. Fix V, solve for U: For each user u, solve the regularized least squares problem to find the latent factor vector U_u that best explains user u's observed ratings given the current item factors V.
  2. Fix U, solve for V: For each item i, solve the analogous problem to find V_i given the current user factors U.

This alternating structure is embarrassingly parallel: in step 1, each user's optimization is independent of every other user; in step 2, each item's optimization is independent. This makes ALS naturally suited for distributed computing frameworks like Spark, where each partition can solve its subset of problems independently.

ALS also supports implicit feedback via the confidence-weighted formulation of Hu, Koren, and Volinsky (2008). Instead of treating missing entries as unknown, implicit ALS treats all entries as observed (with preference 0 for missing items) but assigns confidence proportional to the interaction count. This transforms the objective to minimize a weighted squared error where frequently interacted items receive higher confidence.

Usage

Use ALS when you have a user-item interaction matrix (explicit ratings or implicit feedback) and need a scalable collaborative filtering model. ALS is the recommended matrix factorization method for Spark-based recommendation pipelines because its embarrassingly parallel structure maps directly to Spark's distributed execution model. It is the core model training step, following data loading and splitting.

Theoretical Basis

Explicit Feedback Formulation

Given the partially observed rating matrix R, ALS minimizes the regularized squared error over observed entries:

min_{U, V}  sum_{(u,i) in observed} (R_ui - U_u . V_i)^2
            + lambda * (sum_u ||U_u||^2 + sum_i ||V_i||^2)

where:
  U_u = latent factor vector for user u (dimension k)
  V_i = latent factor vector for item i (dimension k)
  lambda = regularization parameter (regParam)
  k = rank (number of latent factors)
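To make the objective concrete, the sketch below evaluates it on a tiny hand-made example. The observed triples, factor values, and lambda are illustrative assumptions, not outputs of any training run:

```python
import numpy as np

# Evaluate the regularized squared-error objective on a toy example.
observed = [(0, 0, 4.0), (0, 1, 2.0), (1, 0, 5.0)]  # (user, item, rating)
U = np.array([[1.0, 0.5], [2.0, 1.0]])   # user factors, k = 2
V = np.array([[2.0, 1.0], [1.0, 0.0]])   # item factors
lam = 0.1                                 # regularization (regParam)

# Squared error over observed entries only -- missing entries are ignored.
squared_error = sum((r - U[u] @ V[i]) ** 2 for u, i, r in observed)
# L2 penalty on every user and item factor vector.
reg = lam * ((U ** 2).sum() + (V ** 2).sum())
loss = squared_error + reg   # → 4.475
```

Only observed (u, i) pairs contribute to the error term, which is what distinguishes this explicit-feedback objective from the implicit-feedback variant below.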

ALS Algorithm

Algorithm: Alternating Least Squares
Input: Observed ratings {(u, i, r_ui)}, rank k, lambda, maxIter
Output: U (m x k), V (n x k)

1. Initialize V randomly
2. For iter = 1 to maxIter:
   a. Fix V, solve for each U_u:
      U_u = (V_Iu^T * V_Iu + lambda * I)^{-1} * V_Iu^T * R_u
      where:
        I_u = set of items rated by user u
        V_Iu = rows of V corresponding to items in I_u
        R_u = vector of user u's ratings for items in I_u

   b. Fix U, solve for each V_i:
      V_i = (U_Ii^T * U_Ii + lambda * I)^{-1} * U_Ii^T * R_i
      where:
        I_i = set of users who rated item i
        U_Ii = rows of U corresponding to users in I_i
        R_i = vector of ratings for item i from users in I_i

3. Return U, V
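The pseudocode above can be sketched directly in NumPy. This is a minimal single-machine illustration of the closed-form updates (each inner solve is the ridge-regression formula from steps a and b), not a distributed implementation; the hyperparameter defaults are illustrative:

```python
import numpy as np

def als_explicit(ratings, m, n, k=2, lam=0.1, max_iter=10, seed=0):
    """ALS for explicit feedback, following the pseudocode above.

    ratings: list of (u, i, r) triples for observed entries only.
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))

    # Index observed entries by user and by item (I_u and I_i).
    by_user = {u: [] for u in range(m)}
    by_item = {i: [] for i in range(n)}
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))

    reg = lam * np.eye(k)
    for _ in range(max_iter):
        # Step a: fix V, solve an independent ridge problem per user.
        for u, pairs in by_user.items():
            if not pairs:
                continue
            idx = [i for i, _ in pairs]
            Vu = V[idx]                          # V_{I_u}
            Ru = np.array([r for _, r in pairs])
            U[u] = np.linalg.solve(Vu.T @ Vu + reg, Vu.T @ Ru)
        # Step b: fix U, solve an independent ridge problem per item.
        for i, pairs in by_item.items():
            if not pairs:
                continue
            idx = [u for u, _ in pairs]
            Ui = U[idx]                          # U_{I_i}
            Ri = np.array([r for _, r in pairs])
            V[i] = np.linalg.solve(Ui.T @ Ui + reg, Ui.T @ Ri)
    return U, V
```

Because the per-user (and per-item) solves inside each step share no state, the two inner loops are exactly the embarrassingly parallel structure that a Spark implementation distributes across partitions.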

Implicit Feedback Extension

For implicit feedback (Hu, Koren, Volinsky 2008), the objective becomes:

min_{U, V}  sum_{u,i} c_ui * (p_ui - U_u . V_i)^2
            + lambda * (sum_u ||U_u||^2 + sum_i ||V_i||^2)

where:
  p_ui = 1 if user u interacted with item i, else 0
  c_ui = 1 + alpha * r_ui  (confidence increases with interaction count)
  alpha = confidence scaling parameter

The alternating optimization steps remain closed-form but now incorporate the confidence matrix C as weights.
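A single confidence-weighted user update can be sketched as follows. The item factors, interaction counts, alpha, and lambda are all illustrative assumptions; note that unlike the explicit case, every item participates in the solve, not just the interacted ones:

```python
import numpy as np

# One closed-form implicit-feedback user update with confidence weights,
# in the style of Hu, Koren & Volinsky (2008).
k, n = 2, 4
alpha, lam = 40.0, 0.1
V = np.array([[0.5, 0.1], [0.2, 0.4], [0.9, 0.3], [0.1, 0.8]])  # item factors

r_u = np.array([3.0, 0.0, 1.0, 0.0])   # interaction counts for one user
p_u = (r_u > 0).astype(float)          # binary preference p_ui
c_u = 1.0 + alpha * r_u                # confidence c_ui = 1 + alpha * r_ui

# Weighted ridge solve: U_u = (V^T C_u V + lambda I)^{-1} V^T C_u p_u,
# with C_u = diag(c_u). Missing items enter with preference 0, confidence 1.
C_u = np.diag(c_u)
U_u = np.linalg.solve(V.T @ C_u @ V + lam * np.eye(k), V.T @ C_u @ p_u)
```

With this update, predicted preferences V @ U_u come out higher for the high-confidence interacted items (0 and 2) than for the unobserved ones (1 and 3).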

Cold Start Handling

The coldStartStrategy="drop" parameter instructs the model to drop (assign NaN to) predictions for user-item pairs where either the user or item was not seen during training. This prevents NaN predictions from propagating into evaluation metrics.
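The drop semantics can be illustrated without Spark. The sketch below (plain NumPy; the factor values and test pairs are made up for the example) marks predictions for unseen users or items as NaN and filters them out before computing RMSE, which is what coldStartStrategy="drop" achieves:

```python
import numpy as np

# Minimal sketch of coldStartStrategy="drop" semantics (not Spark itself).
train_users, train_items = {0, 1}, {0, 1, 2}
test = [(0, 1, 3.0), (2, 0, 4.0)]   # user 2 was never seen in training

U = {0: np.array([1.0, 0.2]), 1: np.array([0.3, 0.9])}  # trained user factors
V = {0: np.array([2.0, 0.1]), 1: np.array([1.0, 1.0]),
     2: np.array([0.5, 0.4])}                            # trained item factors

preds = []
for u, i, r in test:
    if u in train_users and i in train_items:
        preds.append((U[u] @ V[i], r))
    else:
        preds.append((np.nan, r))    # cold-start pair -> NaN prediction

# "drop": remove NaN rows before scoring, so RMSE stays well-defined.
kept = [(p, r) for p, r in preds if not np.isnan(p)]
rmse = np.sqrt(np.mean([(p - r) ** 2 for p, r in kept]))
```

Without the drop step, the NaN for the unseen user would propagate through the mean and make the RMSE itself NaN.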

Related Pages

Implemented By
