

Implementation:Recommenders team Recommenders SSEPT Model

From Leeroopedia


Knowledge Sources
Domains Sequential Recommendation, Personalized Transformer, User Modeling
Last Updated 2026-02-10 00:00 GMT

Overview

The SSEPT class implements the SSE-PT model (a Personalized Transformer regularized with Stochastic Shared Embeddings), extending SASRec with user-specific embeddings for personalized sequential recommendation.

Description

The SSEPT class (Wu et al., RecSys 2020) extends the SASREC base class by introducing user embeddings that are concatenated with item embeddings at each sequence position. This creates user-personalized item representations that allow the Transformer encoder to differentiate between users beyond their interaction history alone.

Key Architectural Differences from SASRec:

  • User Embedding Layer: A new nn.Embedding layer maps user IDs to dense vectors of dimension user_embedding_dim. The user embedding is scaled by the square root of its dimension, consistent with the item embedding scaling.
  • Hidden Dimension: The combined hidden dimension is item_embedding_dim + user_embedding_dim, which is used throughout the Transformer encoder, attention layers, and feedforward networks. This differs from SASRec where the hidden dimension equals the item embedding dimension.
  • Positional Embeddings: Adapted to match the combined hidden dimension rather than just the item embedding dimension.
  • Concatenated Representations: In the forward pass, user embeddings are replicated across all sequence positions and concatenated with item embeddings before being passed through the encoder.
  • Prediction with User Context: Both forward and predict methods concatenate user embeddings with item embeddings for candidate items, ensuring the model scores items in a user-specific context.
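
The concatenation described in the list above can be illustrated with a minimal, framework-agnostic NumPy sketch. It is not the class's actual code: the table names, the sizes, the item-first concatenation order, and the additive positional embedding are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration.
batch, seq_len = 2, 5
item_num, user_num = 20, 4
item_dim, user_dim = 8, 6
hidden_dim = item_dim + user_dim  # combined hidden dimension

# Stand-in embedding tables (index 0 is reserved for padding in this sketch).
item_table = rng.normal(size=(item_num + 1, item_dim))
user_table = rng.normal(size=(user_num + 1, user_dim))
pos_table = rng.normal(size=(seq_len, hidden_dim))  # sized to the combined dimension

item_ids = rng.integers(1, item_num + 1, size=(batch, seq_len))
user_ids = rng.integers(1, user_num + 1, size=(batch,))

# Look up each embedding and scale it by the square root of its own dimension.
item_emb = item_table[item_ids] * np.sqrt(item_dim)  # (batch, seq_len, item_dim)
user_emb = user_table[user_ids] * np.sqrt(user_dim)  # (batch, user_dim)

# Replicate the user vector across every sequence position.
user_seq = np.repeat(user_emb[:, None, :], seq_len, axis=1)  # (batch, seq_len, user_dim)

# Concatenate along the feature axis (item first is an assumption), then add
# positional embeddings, which broadcast over the batch axis.
encoder_input = np.concatenate([item_emb, user_seq], axis=-1) + pos_table
```

The resulting tensor has shape (batch, seq_len, item_embedding_dim + user_embedding_dim), which is why every downstream layer has to be sized to the combined dimension.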

Overridden Methods:

  • __init__: Adds user embedding layer, adjusts encoder and normalization for the expanded hidden dimension.
  • _init_weights: Extends parent initialization to also initialize the user embedding layer.
  • forward: Concatenates user and item embeddings, applies the Transformer encoder, and computes logits using the combined representation.
  • predict: Handles user embedding lookup and concatenation with candidate item embeddings during inference.
  • create_combined_dataset: Includes user IDs in the training batch inputs.
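
As a rough illustration of the predict step listed above, the sketch below scores candidates by a dot product between a sequence representation and user-augmented candidate embeddings. The dot-product scoring, the use of a single vector per sequence, and all variable names are assumptions made for this example, not a description of the class internals.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes for the illustration.
batch, num_candidates = 3, 7
item_dim, user_dim = 8, 6
hidden_dim = item_dim + user_dim

# Stand-ins: encoder output for each sequence, candidate item embeddings,
# and one user embedding per row of the batch.
seq_repr = rng.normal(size=(batch, hidden_dim))
cand_item_emb = rng.normal(size=(batch, num_candidates, item_dim))
user_emb = rng.normal(size=(batch, user_dim))

# Attach the same user vector to every candidate of that user.
user_tiled = np.broadcast_to(user_emb[:, None, :], (batch, num_candidates, user_dim))
cand_emb = np.concatenate([cand_item_emb, user_tiled], axis=-1)  # (batch, num_candidates, hidden_dim)

# One score per (user, candidate) pair.
logits = np.einsum("bh,bch->bc", seq_repr, cand_emb)  # (batch, num_candidates)
```

Because the user vector is part of every candidate representation, the same item can receive a different score for two users with identical histories.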

The train_model and evaluate methods are inherited from SASREC, which dynamically handles the "users" key in inputs.

Usage

Use this model when user identity carries signal beyond the interaction history itself. It is most beneficial when users have distinct behavioral patterns that persist across sessions and when the dataset contains enough interactions per user to learn meaningful user embeddings. SASRec has no user-specific parameters and distinguishes users only through their interaction sequences; SSEPT addresses this limitation by learning an explicit embedding for each user.

Code Reference

Source Location

Signature

class SSEPT(SASREC):
    def __init__(self, **kwargs)
    def _init_weights(self)
    def forward(self, x, training=True)
    def predict(self, inputs)
    def create_combined_dataset(self, u, seq, pos, neg)

Import

from recommenders.models.sasrec.ssept import SSEPT

I/O Contract

Inputs

Name | Type | Required | Description
item_num | int | Yes | Number of items in the dataset
user_num | int | Yes | Number of users in the dataset
seq_max_len | int | No | Maximum sequence length for user history; default 100
num_blocks | int | No | Number of Transformer encoder blocks; default 2
embedding_dim | int | No | Base embedding dimension; default 100
attention_dim | int | No | Transformer attention dimension; default 100
attention_num_heads | int | No | Number of attention heads; default 1
conv_dims | list | No | Dimensions of the feedforward layers; default [200, 200]
dropout_rate | float | No | Dropout probability; default 0.5
l2_reg | float | No | L2 regularization coefficient; default 0.0
num_neg_test | int | No | Number of negative examples used during evaluation; default 100
user_embedding_dim | int | No | User embedding dimension; defaults to embedding_dim
item_embedding_dim | int | No | Item embedding dimension; defaults to embedding_dim

Outputs

Name | Type | Description
forward() | (torch.Tensor, torch.Tensor, torch.Tensor) | Positive logits, negative logits, and target mask for loss computation
predict() | torch.Tensor | Logits of shape (batch, num_candidates) for candidate items scored in user-specific context
train_model() | dict | Training history with 'loss', 'val_ndcg', and 'val_hr' lists (inherited from SASREC)
evaluate() | tuple of float | (NDCG@10, HR@10) metrics on the test set (inherited from SASREC)
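
To show how the three forward() outputs could feed a loss, here is a minimal NumPy sketch of a masked binary cross-entropy over positive and negative logits, the objective commonly used with SASRec-style models. Whether train_model computes exactly this loss is an assumption; the function name masked_bce_loss and the eps constant are hypothetical.

```python
import numpy as np

def masked_bce_loss(pos_logits, neg_logits, target_mask, eps=1e-24):
    """Average binary cross-entropy over non-padded positions only."""
    pos_prob = 1.0 / (1.0 + np.exp(-pos_logits))
    neg_prob = 1.0 / (1.0 + np.exp(-neg_logits))
    # Push positives toward probability 1 and sampled negatives toward 0.
    per_position = -(np.log(pos_prob + eps) + np.log(1.0 - neg_prob + eps))
    # target_mask is 1 for real positions and 0 for padding.
    return float(np.sum(per_position * target_mask) / np.sum(target_mask))

# Toy batch: one sequence of three positions, the last one being padding.
pos_logits = np.array([[2.0, 0.5, 0.0]])
neg_logits = np.array([[-1.0, 0.3, 0.0]])
target_mask = np.array([[1.0, 1.0, 0.0]])

loss = masked_bce_loss(pos_logits, neg_logits, target_mask)
```

The mask keeps padded positions from contributing to the average, so sequences shorter than seq_max_len do not dilute the loss.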

Usage Examples

Basic Usage

from recommenders.models.sasrec.ssept import SSEPT

# Initialize the SSEPT model with user embeddings
model = SSEPT(
    item_num=10000,
    user_num=5000,
    seq_max_len=50,
    num_blocks=2,
    embedding_dim=64,
    attention_dim=128,
    attention_num_heads=2,
    conv_dims=[128, 128],
    dropout_rate=0.2,
    l2_reg=1e-6,
    user_embedding_dim=64,
    item_embedding_dim=64,
    num_neg_test=100,
)

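# Train the model (train_model is inherited from SASREC).
# `dataset` and `warp_sampler` are assumed to have been prepared beforehand;
# they are not defined in this snippet.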
history = model.train_model(
    dataset=dataset,
    sampler=warp_sampler,
    num_epochs=20,
    batch_size=128,
    learning_rate=0.001,
    val_epoch=5,
)

# Evaluate on the test set
ndcg, hr = model.evaluate(dataset, seed=42)
print(f"Test NDCG@10: {ndcg:.4f}, HR@10: {hr:.4f}")

# Predict for specific users and candidate items
predictions = model.predict({
    "user": user_ids,         # (batch, 1) tensor of user indices
    "input_seq": sequences,   # (batch, seq_max_len) tensor of item indices
    "candidate": candidates,  # (batch, num_candidates) tensor of candidate items
})
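
For intuition about the NDCG@10 and HR@10 values printed above, the following pure-Python sketch derives both metrics from one row of candidate scores. It assumes the ground-truth item sits at index 0 among the candidates and that each test case has a single relevant item; both are conventions adopted for this example, and the helper names are hypothetical.

```python
import math

def rank_of_target(scores, target_index=0):
    """Zero-based rank of the target: the number of candidates scored strictly higher."""
    target = scores[target_index]
    return sum(1 for s in scores if s > target)

def hit_and_ndcg_at_k(rank, k=10):
    """Return (hit, ndcg) for a single relevant item at the given zero-based rank."""
    if rank < k:
        return 1.0, 1.0 / math.log2(rank + 2)
    return 0.0, 0.0

# Toy scores for one user; index 0 is the held-out item, the rest are negatives.
scores = [2.1, 0.3, 2.5, -1.0, 0.9]
rank = rank_of_target(scores)       # exactly one negative outscores the target
hr, ndcg = hit_and_ndcg_at_k(rank)
```

Averaging these per-user values over all evaluated users gives dataset-level metrics of the kind returned by evaluate().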
