Implementation:Recommenders team Recommenders SSEPT Model
| Knowledge Sources | |
|---|---|
| Domains | Sequential Recommendation, Personalized Transformer, User Modeling |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
The SSEPT class implements the SSE-PT model (Stochastic Shared Embeddings, Personalized Transformer), which extends SASRec with user-specific embeddings for personalized sequential recommendation.
Description
The SSEPT class (Wu et al., RecSys 2020) extends the SASREC base class by introducing user embeddings that are concatenated with item embeddings at each sequence position. This creates user-personalized item representations that allow the Transformer encoder to differentiate between users beyond their interaction history alone.
Key Architectural Differences from SASRec:
- User Embedding Layer: A new nn.Embedding layer maps user IDs to dense vectors of dimension user_embedding_dim. The user embedding is scaled by the square root of its dimension, consistent with the item embedding scaling.
- Hidden Dimension: The combined hidden dimension is item_embedding_dim + user_embedding_dim, which is used throughout the Transformer encoder, attention layers, and feedforward networks. This differs from SASRec where the hidden dimension equals the item embedding dimension.
- Positional Embeddings: Adapted to match the combined hidden dimension rather than just the item embedding dimension.
- Concatenated Representations: In the forward pass, user embeddings are replicated across all sequence positions and concatenated with item embeddings before being passed through the encoder.
- Prediction with User Context: Both forward and predict methods concatenate user embeddings with item embeddings for candidate items, ensuring the model scores items in a user-specific context.
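The concatenation described in the list above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the code in ssept.py: the function name `combine_embeddings`, the table sizes, the `padding_idx=0` convention, and the item-then-user concatenation order are all assumptions made for the example.

```python
import torch
import torch.nn as nn

# Hypothetical sizes chosen only for illustration.
item_num, user_num = 1000, 200
item_embedding_dim, user_embedding_dim = 64, 64
hidden_dim = item_embedding_dim + user_embedding_dim

# Index 0 is assumed to be the padding item; +1 leaves room for it.
item_emb = nn.Embedding(item_num + 1, item_embedding_dim, padding_idx=0)
user_emb = nn.Embedding(user_num + 1, user_embedding_dim)


def combine_embeddings(users: torch.Tensor, seq: torch.Tensor) -> torch.Tensor:
    """Return (batch, seq_len, hidden_dim) user-personalized item representations.

    users: (batch,) user indices; seq: (batch, seq_len) item indices.
    """
    # Scale each embedding by the square root of its own dimension.
    items = item_emb(seq) * item_embedding_dim ** 0.5   # (batch, seq_len, item_dim)
    u = user_emb(users) * user_embedding_dim ** 0.5     # (batch, user_dim)
    # Replicate the user vector across every sequence position.
    u = u.unsqueeze(1).expand(-1, seq.size(1), -1)      # (batch, seq_len, user_dim)
    # Assumed order: item embedding first, then user embedding.
    return torch.cat([items, u], dim=-1)


users = torch.tensor([3, 7])
seq = torch.randint(1, item_num + 1, (2, 50))
combined = combine_embeddings(users, seq)
```

The last dimension of `combined` is `hidden_dim`, which is the width every downstream layer (positional embeddings, attention, feedforward) has to accept.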
Overridden Methods:
- __init__: Adds user embedding layer, adjusts encoder and normalization for the expanded hidden dimension.
- _init_weights: Extends parent initialization to also initialize the user embedding layer.
- forward: Concatenates user and item embeddings, applies the Transformer encoder, and computes logits using the combined representation.
- predict: Handles user embedding lookup and concatenation with candidate item embeddings during inference.
- create_combined_dataset: Includes user IDs in the training batch inputs.
The train_model and evaluate methods are inherited from SASREC, which dynamically handles the "users" key in inputs.
Usage
Use this model when user identity provides meaningful signal beyond interaction history for sequential recommendation. It is particularly beneficial when users have distinct behavioral patterns that persist across sessions, and when the dataset contains enough interactions per user to learn meaningful user embeddings. SSEPT addresses a limitation of SASRec, which has no per-user parameters and distinguishes users only by their interaction sequences.
Code Reference
Source Location
- Repository: Recommenders
- File: recommenders/models/sasrec/ssept.py
- Lines: 1-268
Signature
class SSEPT(SASREC):
    def __init__(self, **kwargs)
    def _init_weights(self)
    def forward(self, x, training=True)
    def predict(self, inputs)
    def create_combined_dataset(self, u, seq, pos, neg)
Import
from recommenders.models.sasrec.ssept import SSEPT
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| item_num | int | Yes | Number of items in the dataset |
| user_num | int | Yes | Number of users in the dataset |
| seq_max_len | int | No | Maximum sequence length for user history; default 100 |
| num_blocks | int | No | Number of Transformer encoder blocks; default 2 |
| embedding_dim | int | No | Base embedding dimension; default 100 |
| attention_dim | int | No | Transformer attention dimension; default 100 |
| attention_num_heads | int | No | Number of attention heads; default 1 |
| conv_dims | list | No | Dimensions of the feedforward layers; default [200, 200] |
| dropout_rate | float | No | Dropout probability; default 0.5 |
| l2_reg | float | No | L2 regularization coefficient; default 0.0 |
| num_neg_test | int | No | Number of negative examples used during evaluation; default 100 |
| user_embedding_dim | int | No | User embedding dimension; default equals embedding_dim |
| item_embedding_dim | int | No | Item embedding dimension; default equals embedding_dim |
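How the dimension defaults in the table interact can be shown with a small sketch. The helper `resolve_dims` is hypothetical and only mirrors the defaults listed above; it is not the constructor's actual code.

```python
def resolve_dims(**kwargs):
    """Resolve embedding widths from keyword arguments (illustrative only)."""
    embedding_dim = kwargs.get("embedding_dim", 100)
    # Both per-entity dimensions fall back to the base embedding_dim.
    item_dim = kwargs.get("item_embedding_dim", embedding_dim)
    user_dim = kwargs.get("user_embedding_dim", embedding_dim)
    return {
        "item_embedding_dim": item_dim,
        "user_embedding_dim": user_dim,
        # The encoder operates on the concatenated width.
        "hidden_dim": item_dim + user_dim,
    }


defaults = resolve_dims()
custom = resolve_dims(embedding_dim=64, user_embedding_dim=32)
```

With the defaults the hidden width is 200, which lines up with the default conv_dims of [200, 200]. The Basic Usage example follows the same pattern: 64 + 64 gives a hidden width of 128, and it sets attention_dim=128 and conv_dims=[128, 128] accordingly.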
Outputs
| Name | Type | Description |
|---|---|---|
| forward() | (torch.Tensor, torch.Tensor, torch.Tensor) | Positive logits, negative logits, and target mask for loss computation |
| predict() | torch.Tensor | Logits of shape (batch, num_candidates) for candidate items scored in user-specific context |
| train_model() | dict | Training history with 'loss', 'val_ndcg', and 'val_hr' lists (inherited from SASREC) |
| evaluate() | tuple of float | (NDCG@10, HR@10) metrics on the test set (inherited from SASREC) |
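The three tensors returned by forward() are the usual ingredients of a masked binary cross-entropy loss, as used in SASRec-style training. The sketch below assumes that formulation; the function name `masked_bce_loss`, the float mask, and the clamp on the denominator are illustrative choices rather than details confirmed by this page.

```python
import torch
import torch.nn.functional as F


def masked_bce_loss(pos_logits, neg_logits, is_target):
    """Average binary cross-entropy over non-padding positions.

    All three arguments are float tensors of the same shape; is_target is 1.0
    where a real (non-padding) target exists and 0.0 elsewhere.
    """
    # log(sigmoid(x)) for positives; log(1 - sigmoid(x)) == logsigmoid(-x) for negatives.
    per_position = -(F.logsigmoid(pos_logits) + F.logsigmoid(-neg_logits))
    # The clamp avoids division by zero if a batch contains only padding.
    return (per_position * is_target).sum() / is_target.sum().clamp(min=1.0)


pos = torch.tensor([[2.0, 0.5, 0.0]])
neg = torch.tensor([[-1.0, 0.3, 0.0]])
mask = torch.tensor([[1.0, 1.0, 0.0]])  # last position is padding
loss = masked_bce_loss(pos, neg, mask)
```

Because the mask zeroes out padded positions before the sum, whatever logits the model produces at those positions do not affect the loss.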
Usage Examples
Basic Usage
from recommenders.models.sasrec.ssept import SSEPT
# Initialize the SSEPT model with user embeddings
model = SSEPT(
    item_num=10000,
    user_num=5000,
    seq_max_len=50,
    num_blocks=2,
    embedding_dim=64,
    attention_dim=128,
    attention_num_heads=2,
    conv_dims=[128, 128],
    dropout_rate=0.2,
    l2_reg=1e-6,
    user_embedding_dim=64,
    item_embedding_dim=64,
    num_neg_test=100,
)
# Train the model (uses inherited train_model from SASREC).
# `dataset` and `warp_sampler` are assumed to have been prepared beforehand.
history = model.train_model(
    dataset=dataset,
    sampler=warp_sampler,
    num_epochs=20,
    batch_size=128,
    learning_rate=0.001,
    val_epoch=5,
)
# Evaluate on the test set
ndcg, hr = model.evaluate(dataset, seed=42)
print(f"Test NDCG@10: {ndcg:.4f}, HR@10: {hr:.4f}")
# Predict for specific users and candidate items
predictions = model.predict({
    "user": user_ids,          # (batch, 1) tensor of user indices
    "input_seq": sequences,    # (batch, seq_max_len) tensor of item indices
    "candidate": candidates,   # (batch, num_candidates) tensor of candidate items
})
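For reference, HR@10 and NDCG@10 for a single user can be derived from the rank of the true item among the scored candidates. The sketch below assumes the common SASRec-style protocol in which one true item is ranked against sampled negatives (num_neg_test). The helper name `hit_and_ndcg_at_k` and the convention that the true item sits at index 0 are assumptions, not details taken from evaluate().

```python
import math


def hit_and_ndcg_at_k(scores, true_index=0, k=10):
    """Return (hit, ndcg) for one user given candidate scores.

    scores: list of floats, higher means more relevant.
    true_index: position of the ground-truth item in scores (assumed 0 here).
    """
    # Zero-based rank: number of candidates scored strictly higher than the true item.
    rank = sum(1 for s in scores if s > scores[true_index])
    if rank < k:
        # A single relevant item makes the ideal DCG 1, so NDCG is just the discount.
        return 1.0, 1.0 / math.log2(rank + 2)
    return 0.0, 0.0


# One candidate outranks the true item, so its zero-based rank is 1.
hit, ndcg = hit_and_ndcg_at_k([0.9, 1.5, 0.2, -0.3])
```

Averaging these per-user values over all evaluated users gives dataset-level metrics of the kind that evaluate() reports.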