Implementation:Recommenders team Recommenders Standard VAE
| Knowledge Sources | |
|---|---|
| Domains | Collaborative Filtering, Deep Learning, Variational Autoencoders |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
The StandardVAE class implements a standard variational autoencoder (VAE) for collaborative filtering, using binary cross-entropy as the reconstruction term of the negative ELBO objective.
Description
The StandardVAE provides a VAE-based collaborative filtering model that treats each item independently as a Bernoulli variable, contrasting with the multinomial formulation in Mult_VAE. The architecture consists of:
- Encoder -- Input is passed through dropout and a dense tanh hidden layer, then through two parallel linear layers that produce the latent mean and log-variance vectors. Unlike Mult_VAE, there is no L2 normalization on the input.
- Reparameterization Trick -- A sample epsilon drawn from N(0, I) is scaled by the learned standard deviation and shifted by the learned mean, z = mean + exp(0.5 * log_var) * epsilon, so that gradients can flow through the sampling step.
- Decoder -- A dense tanh hidden layer with dropout, followed by a softmax output layer (contrasting with Mult_VAE's linear output).
- Loss Function -- The negative ELBO is the sum of the binary cross-entropy reconstruction loss (scaled by original_dim) and the KL divergence weighted by beta.
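The sampling step and the loss described above can be sketched in NumPy. This is a minimal illustration rather than the library's Keras code: the function names `sample_latent` and `negative_elbo` are hypothetical, the binary cross-entropy is assumed to be averaged over items before being scaled by original_dim, and the KL term is the standard closed form for a diagonal Gaussian measured against N(0, I).

```python
import numpy as np

def sample_latent(z_mean, z_log_var, rng):
    """Reparameterization trick: z = mean + std * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(z_mean.shape)
    return z_mean + np.exp(0.5 * z_log_var) * eps

def negative_elbo(x, x_hat, z_mean, z_log_var, beta=1.0, tiny=1e-7):
    """Per-user negative ELBO: scaled binary cross-entropy plus beta * KL."""
    original_dim = x.shape[-1]
    x_hat = np.clip(x_hat, tiny, 1.0 - tiny)  # avoid log(0)
    # Binary cross-entropy averaged over items (assumed), then scaled by original_dim
    bce = -np.mean(x * np.log(x_hat) + (1.0 - x) * np.log(1.0 - x_hat), axis=-1)
    reconstruction = original_dim * bce
    # Closed-form KL( N(mean, diag(var)) || N(0, I) )
    kl = 0.5 * np.sum(np.exp(z_log_var) + z_mean**2 - 1.0 - z_log_var, axis=-1)
    return reconstruction + beta * kl

rng = np.random.default_rng(0)
x = np.array([[1.0, 0.0, 1.0, 0.0]])        # one user, four items
x_hat = np.array([[0.9, 0.1, 0.8, 0.2]])    # hypothetical decoder output
z_mean = np.zeros((1, 2))
z_log_var = np.zeros((1, 2))                # unit variance, so the KL term is zero

z = sample_latent(z_mean, z_log_var, rng)
loss = negative_elbo(x, x_hat, z_mean, z_log_var, beta=1.0)
```

With a zero mean and unit variance the KL term vanishes, so the loss in this toy case is purely the reconstruction term. Moving the mean away from zero adds a KL penalty scaled by beta.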
The module shares the same callback infrastructure as the multinomial variant:
- LossHistory -- Tracks training and validation loss per epoch.
- Metrics -- Computes NDCG@k on the validation set and saves the best model weights.
- AnnealingCallback -- Implements KL annealing by gradually increasing beta from 0 up to anneal_cap as training progresses, with the pace set by total_anneal_steps.
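The KL annealing idea can be illustrated with a small pure-Python schedule. This is a sketch under assumptions: the ramp is taken to be linear in the number of batch updates and clipped at anneal_cap, and `annealed_beta` is a hypothetical helper name, not part of the module.

```python
def annealed_beta(update_count, total_anneal_steps, anneal_cap):
    """Ramp beta as update_count / total_anneal_steps, clipped at anneal_cap (assumed linear)."""
    if total_anneal_steps <= 0:
        return anneal_cap
    return min(update_count / total_anneal_steps, anneal_cap)

# Beta after selected batch updates, with a cap of 0.2 and total_anneal_steps of 1000
schedule = [annealed_beta(step, 1000, 0.2) for step in (0, 100, 200, 500)]
```

Under this assumed schedule the cap is reached after anneal_cap * total_anneal_steps updates, after which beta stays constant.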
Training uses a custom batch generator with shuffling, the Adam optimizer, and ReduceLROnPlateau learning-rate scheduling. KL annealing is optional and is enabled via the annealing parameter.
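A shuffled batch generator of the kind mentioned above might look like the following sketch. It is an assumption-laden stand-in, not the module's generator: `batch_generator` is a hypothetical name, and this version makes a single pass over the data, whereas a generator fed to a training loop may cycle indefinitely.

```python
import numpy as np

def batch_generator(x, batch_size, rng):
    """Yield (input, target) mini-batches of user rows in a fresh random order."""
    order = rng.permutation(x.shape[0])
    for start in range(0, x.shape[0], batch_size):
        rows = order[start:start + batch_size]
        # An autoencoder reconstructs its own input, so input and target match
        yield x[rows], x[rows]

rng = np.random.default_rng(42)
clicks = (rng.random((10, 6)) > 0.7).astype("float32")  # toy click matrix
batches = list(batch_generator(clicks, batch_size=4, rng=rng))
```

Ten users with a batch size of four yield two full batches and one partial batch, and every user row appears exactly once per pass.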
Usage
Use StandardVAE as an alternative to Mult_VAE for implicit-feedback collaborative filtering. The binary cross-entropy formulation may be preferred when items are better modeled as independent Bernoulli variables than as draws from a single multinomial distribution. Training both models on the same data gives a comparative baseline for deciding which VAE loss formulation works better for a specific dataset.
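The difference between the two likelihoods can be made concrete with a toy calculation. The sketch below is illustrative only and uses hypothetical helper names and made-up scores; it does not reproduce either model's code.

```python
import math

def bernoulli_log_likelihood(clicks, probs):
    """Each item is an independent Bernoulli: unclicked items contribute too."""
    return sum(
        c * math.log(p) + (1 - c) * math.log(1 - p)
        for c, p in zip(clicks, probs)
    )

def multinomial_log_likelihood(clicks, logits):
    """Items compete for probability mass through a softmax over all items."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return sum(c * (v - log_norm) for c, v in zip(clicks, logits))

clicks = [1, 0, 1, 0]
probs = [0.9, 0.1, 0.8, 0.2]       # hypothetical per-item click probabilities
logits = [2.0, -1.0, 1.5, -0.5]    # hypothetical unnormalized item scores

bernoulli_ll = bernoulli_log_likelihood(clicks, probs)
multinomial_ll = multinomial_log_likelihood(clicks, logits)
```

In the Bernoulli form every item, clicked or not, adds a term to the likelihood. In the multinomial form only clicked items add terms, and raising one item's score necessarily lowers the probability of the others.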
Code Reference
Source Location
- Repository: Recommenders
- File: recommenders/models/vae/standard_vae.py
- Lines: 1-491
Signature
class LossHistory(Callback):
    def on_train_begin(self, logs={})
    def on_epoch_end(self, epoch, logs={})
class Metrics(Callback):
    def __init__(self, model, val_tr, val_te, mapper, k, save_path=None)
    def recommend_k_items(self, x, k, remove_seen=True)
    def on_epoch_end(self, batch, logs={})
    def get_data(self)
class AnnealingCallback(Callback):
    def __init__(self, beta, anneal_cap, total_anneal_steps)
    def on_batch_end(self, epoch, logs={})
    def on_epoch_end(self, epoch, logs={})
    def get_data(self)
class StandardVAE:
    def __init__(self, n_users, original_dim, intermediate_dim=200, latent_dim=70,
                 n_epochs=400, batch_size=100, k=100, verbose=1,
                 drop_encoder=0.5, drop_decoder=0.5, beta=1.0,
                 annealing=False, anneal_cap=1.0, seed=None, save_path=None)
    def fit(self, x_train, x_valid, x_val_tr, x_val_te, mapper)
    def recommend_k_items(self, x, k, remove_seen=True)
    def get_optimal_beta(self)
    def display_metrics(self)
    def ndcg_per_epoch(self)
Import
from recommenders.models.vae.standard_vae import StandardVAE
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| n_users | int | Yes | Number of unique users in the training set |
| original_dim | int | Yes | Number of unique items in the training set |
| intermediate_dim | int | No | Dimension of the intermediate hidden layer (default: 200) |
| latent_dim | int | No | Dimension of the latent space (default: 70) |
| n_epochs | int | No | Number of training epochs (default: 400) |
| batch_size | int | No | Batch size for training (default: 100) |
| k | int | No | Number of top-ranked items per user used for NDCG@k evaluation (default: 100) |
| verbose | int | No | Verbosity of the training output (default: 1) |
| drop_encoder | float | No | Dropout rate for the encoder (default: 0.5) |
| drop_decoder | float | No | Dropout rate for the decoder (default: 0.5) |
| beta | float | No | Constant KL divergence weight when not using annealing (default: 1.0) |
| annealing | bool | No | Whether to use KL annealing during training (default: False) |
| anneal_cap | float | No | Maximum beta value during annealing (default: 1.0) |
| seed | int | No | Random seed for reproducibility (default: None) |
| save_path | str | No | Path to save the best model weights (default: None) |
| x_train | numpy.ndarray | Yes | Click matrix for the training set (for fit) |
| x_valid | numpy.ndarray | Yes | Click matrix for the validation set (for fit) |
| x_val_tr | numpy.ndarray | Yes | Click matrix for the validation training partition (for fit) |
| x_val_te | numpy.ndarray | Yes | Click matrix for the validation testing partition (for fit) |
| mapper | object | Yes | AffinityMatrix mapper for converting sparse matrices to DataFrames (for fit) |
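The x_val_tr and x_val_te inputs are two disjoint partitions of each validation user's clicks. The following sketch shows one way such a pair could be produced from a dense click matrix. The helper `split_validation_clicks` and its `holdout_ratio` parameter are hypothetical and written only for illustration; the repository's own splitting utilities, if used, may behave differently.

```python
import numpy as np

def split_validation_clicks(x_valid, holdout_ratio=0.25, seed=0):
    """Split each user's clicks into an input part and a held-out part (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x_val_tr = x_valid.copy()
    x_val_te = np.zeros_like(x_valid)
    for user in range(x_valid.shape[0]):
        clicked = np.flatnonzero(x_valid[user])
        n_holdout = int(round(holdout_ratio * clicked.size))
        if n_holdout == 0:
            continue  # too few clicks to hold any out
        held_out = rng.choice(clicked, size=n_holdout, replace=False)
        # Move the held-out clicks from the input part to the test part
        x_val_te[user, held_out] = x_valid[user, held_out]
        x_val_tr[user, held_out] = 0
    return x_val_tr, x_val_te

x_valid = np.array([[1, 1, 1, 1, 0, 0],
                    [0, 1, 0, 1, 1, 1]], dtype="float32")
x_val_tr, x_val_te = split_validation_clicks(x_valid)
```

The two parts never overlap and add back up to the original matrix, so a model can be scored on how well it recovers the held-out clicks from the remaining ones.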
Outputs
| Name | Type | Description |
|---|---|---|
| recommend_k_items return | numpy.ndarray | Score matrix with the same shape as x in which only each user's top-k scores are kept and all other entries are zero |
| get_optimal_beta return | float | Optimal beta value at the epoch with the highest NDCG@k |
| ndcg_per_epoch return | list | List of NDCG@k values at each training epoch |
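The outputs can be post-processed with plain NumPy. The sketch below assumes the array returned by recommend_k_items has one row per user and one column per item, with zeros outside each user's top-k; the helper `ranked_items` and the sample values are hypothetical.

```python
import numpy as np

def ranked_items(top_k_scores, k):
    """Return, per user, the indices of the k highest-scoring items, best first."""
    order = np.argsort(-top_k_scores, axis=1, kind="stable")
    return order[:, :k]

# Hypothetical recommend_k_items output for two users and five items with k=2
top_k_scores = np.array([[0.0, 0.7, 0.0, 0.2, 0.0],
                         [0.5, 0.0, 0.0, 0.0, 0.9]])
items = ranked_items(top_k_scores, k=2)

# Hypothetical ndcg_per_epoch values: pick the epoch with the best validation NDCG@k
best_epoch = int(np.argmax([0.10, 0.18, 0.16]))
```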
Usage Examples
Basic Usage
from recommenders.models.vae.standard_vae import StandardVAE

# n_users, n_items, the click matrices (train_data, valid_data, val_train,
# val_test, test_data) and the AffinityMatrix mapper `am` are assumed to
# have been prepared beforehand.

# Initialize the model
model = StandardVAE(
    n_users=n_users,
    original_dim=n_items,
    intermediate_dim=200,
    latent_dim=70,
    n_epochs=100,
    batch_size=100,
    k=100,
    annealing=True,
    anneal_cap=0.2,
    seed=42,
    save_path="best_standard_vae.h5",
)

# Train the model
model.fit(
    x_train=train_data,
    x_valid=valid_data,
    x_val_tr=val_train,
    x_val_te=val_test,
    mapper=am,
)

# Generate top-k recommendations
top_k = model.recommend_k_items(x=test_data, k=10, remove_seen=True)

# Display training metrics
model.display_metrics()

# Check NDCG@k progression
ndcg_values = model.ndcg_per_epoch()