
Principle:Recommenders team Recommenders SAR Model Training

From Leeroopedia


Knowledge Sources
Domains Recommender Systems, Collaborative Filtering, Model Training
Last Updated 2026-02-10 00:00 GMT

Overview

SAR model training is a direct, non-iterative computation that builds a user-affinity sparse matrix from interaction data, derives an item co-occurrence matrix, and transforms it into an item-item similarity matrix using the configured similarity metric.

Description

Unlike gradient-based approaches (matrix factorization, deep learning) that require iterative optimization over many epochs, SAR training is a deterministic, single-pass computation. The training process consists of four sequential stages:

  1. User-affinity matrix construction: Build a sparse matrix A of shape (|Users| x |Items|) where A[u, i] holds the (optionally time-decayed) rating that user u gave to item i. This matrix is stored in a compressed sparse format for memory efficiency.
  2. Item co-occurrence computation: Compute C = A^T x A, yielding a symmetric (|Items| x |Items|) matrix where C[i, j] counts how many users interacted with both items i and j. The diagonal C[i, i] gives the frequency of item i.
  3. Item-similarity derivation: Transform the co-occurrence matrix into a similarity matrix S using the configured metric (Jaccard, cosine, lift, co-occurrence, inclusion index, mutual information, or lexicographers mutual information).
  4. Optional normalization setup: If normalization is enabled, a separate unity-rating affinity matrix is computed (where all ratings are replaced by 1.0) to allow scaling prediction scores back to the original rating range.

The result of training is that the model stores the user-affinity matrix, the item-similarity matrix, and item frequency counts as internal state, ready for the scoring/recommendation step.
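The four stages above can be sketched end-to-end with scipy's sparse matrices. This is a minimal illustration on an invented 3-user, 3-item log, not the library's actual implementation; all variable names are ours, and Jaccard is used as the example similarity metric.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy interaction log: user/item indices and ratings, no duplicate pairs
users = np.array([0, 0, 1, 1, 2])
items = np.array([0, 1, 0, 2, 1])
ratings = np.array([5.0, 3.0, 4.0, 2.0, 1.0])

# Stage 1: sparse user-affinity matrix A (time decay omitted in this sketch)
A = csr_matrix((ratings, (users, items)), shape=(3, 3))

# Stage 2: co-occurrence counts from the binary occurrence pattern
occ = csr_matrix((np.ones_like(ratings), (users, items)), shape=(3, 3))
C = (occ.T @ occ).toarray()
f = np.diag(C)  # item frequencies f(i) on the diagonal

# Stage 3: Jaccard item-similarity
S = C / (f[:, None] + f[None, :] - C)
```

After these three steps, `A`, `S`, and `f` are exactly the internal state the scoring step needs.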

Usage

Use this training procedure when:

  • You have a user-item-rating DataFrame with no duplicate (user, item) pairs.
  • You want fast, non-iterative model fitting.
  • You need the model to store internal sparse matrices for subsequent scoring and recommendation.
  • You optionally want time-decayed ratings to emphasize recent interactions.

Theoretical Basis

Stage 1: User-Affinity Matrix

Construct sparse matrix A where:

A[u, i] = rating(u, i)          if timedecay_formula is False
A[u, i] = rating(u, i) * decay  if timedecay_formula is True

where decay = 2 ^ (-(T_now - t(u,i)) / half_life)
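The decayed affinity construction can be checked on a toy example (values and names invented for illustration; `half_life` is in the same units as the timestamps). An interaction exactly one half-life old contributes half of its original rating:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy log with timestamps
users = np.array([0, 0, 1])
items = np.array([0, 1, 0])
ratings = np.array([5.0, 3.0, 4.0])
timestamps = np.array([100.0, 50.0, 100.0])

half_life = 30.0
t_now = timestamps.max()  # T_now: reference time for the decay

# decay = 2 ^ (-(T_now - t(u, i)) / half_life)
decay = np.power(2.0, -(t_now - timestamps) / half_life)
A = csr_matrix((ratings * decay, (users, items)), shape=(2, 2))
```

Ratings at `t_now` keep their full value (decay = 1), while the older rating is shrunk by its age.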

Stage 2: Item Co-occurrence Matrix

C = A^T x A

C[i, j] = sum over all users u: A[u, i] * A[u, j]
C[i, i] = f(i) = item frequency (number of users who rated item i)
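A small sketch of this stage, using a binary occurrence matrix so the products are pure counts (toy data, invented for illustration):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Binary occurrence matrix: 1 where user u interacted with item i
users = np.array([0, 0, 1, 1, 2])
items = np.array([0, 1, 0, 2, 1])
occ = csr_matrix((np.ones(5), (users, items)), shape=(3, 3))

# C = A^T x A: C[i, j] counts users who touched both i and j;
# the diagonal C[i, i] is the frequency f(i) of item i
C = (occ.T @ occ).toarray()
```

Here user 0 touched items 0 and 1 and user 1 touched items 0 and 2, so C[0, 1] = C[0, 2] = 1 while items 1 and 2 never co-occur.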

Stage 3: Item-Similarity Matrix

The co-occurrence matrix C is transformed into a similarity matrix S. For example, using Jaccard:

S[i, j] = C[i, j] / (f(i) + f(j) - C[i, j])

Or using cosine similarity:

S[i, j] = C[i, j] / sqrt(f(i) * f(j))
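Both transformations are element-wise on C and vectorize directly with broadcasting (a sketch on an invented co-occurrence matrix):

```python
import numpy as np

# Co-occurrence matrix for three items; diagonal = item frequencies f(i)
C = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 0.0],
              [1.0, 0.0, 1.0]])
f = np.diag(C)

# Jaccard: S[i, j] = C[i, j] / (f(i) + f(j) - C[i, j])
S_jaccard = C / (f[:, None] + f[None, :] - C)

# Cosine: S[i, j] = C[i, j] / sqrt(f(i) * f(j))
S_cosine = C / np.sqrt(f[:, None] * f[None, :])
```

With either metric the diagonal comes out as 1 (every item is maximally similar to itself), which is a quick sanity check on the transformation.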

Stage 4: Normalization (Optional)

When normalize=True, a unity user-affinity matrix A_unity is computed where all non-zero entries are 1.0 (optionally time-decayed). This is used during scoring to scale predictions:

raw_scores = A x S
unity_scores = A_unity x S
normalized_scores = rating_min + (raw_scores / unity_scores) * (rating_max - rating_min)
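The effect of dividing by the unity scores can be checked numerically on a single user (similarity matrix and ratings invented for illustration):

```python
import numpy as np

# Item-similarity matrix S and one user's affinity row (rated items 0 and 2)
S = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.0],
              [0.2, 0.0, 1.0]])
affinity_row = np.array([5.0, 0.0, 2.0])
unity_row = np.array([1.0, 0.0, 1.0])  # same pattern with ratings -> 1.0

raw_scores = affinity_row @ S
unity_scores = unity_row @ S

# raw / unity is a similarity-weighted average of the user's own ratings,
# so each entry lands between the user's lowest and highest given rating
ratio = raw_scores / unity_scores
```

In this example the user rated items 2.0 and 5.0, and every ratio falls in [2.0, 5.0]; the linear rescaling above then maps these onto the configured rating range.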

Computational complexity:

Stage            Complexity                          Notes
Affinity matrix  O(|ratings|)                        Sparse matrix construction from COO format
Co-occurrence    O(|Users| * avg_items_per_user^2)   Sparse matrix multiplication A^T x A
Similarity       O(|non-zero entries in C|)          Element-wise transformation of sparse matrix

The entire process completes in a single pass with no convergence criteria or learning rate tuning.

Related Pages

Implemented By
