Implementation:Recommenders team Recommenders Standard VAE

Knowledge Sources	Recommenders
Domains	Collaborative Filtering, Deep Learning, Variational Autoencoders
Last Updated	2026-02-10 00:00 GMT

Overview

The StandardVAE class implements a Standard Variational Autoencoder for collaborative filtering, using binary cross-entropy as the reconstruction loss in the negative ELBO objective.

Description

The StandardVAE provides a VAE-based collaborative filtering model that treats each item independently as a Bernoulli variable, contrasting with the multinomial formulation in Mult_VAE. The architecture consists of:

Encoder -- Input is passed through dropout, a dense tanh hidden layer, and two linear layers producing the latent mean and log-variance vectors. Unlike Mult_VAE, there is no L2 normalization on the input.
Reparameterization Trick -- Samples from N(0, I) are transformed via the learned mean and variance to produce latent codes.
Decoder -- A dense tanh hidden layer with dropout, followed by a softmax output layer (contrasting with Mult_VAE's linear output).
Loss Function -- The negative ELBO is the sum of binary cross-entropy reconstruction loss (scaled by original_dim) and KL divergence weighted by beta.
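The reparameterization and loss steps listed above can be sketched in plain Python. This is a minimal illustration, not the library's Keras code: the helper names are hypothetical, and it assumes the binary cross-entropy is a mean over items (as in Keras), so that scaling by original_dim recovers a per-item sum.

```python
import math
import random


def sample_latent(z_mean, z_log_var, rng):
    """Reparameterization: z = mean + exp(0.5 * log_var) * eps, eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(z_mean, z_log_var)
    ]


def binary_cross_entropy(x, x_hat, eps=1e-7):
    """Mean per-item Bernoulli negative log-likelihood (assumed mean reduction)."""
    total = 0.0
    for target, prob in zip(x, x_hat):
        prob = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob)
    return total / len(x)


def kl_divergence(z_mean, z_log_var):
    """KL(N(mean, exp(log_var)) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(z_mean, z_log_var)
    )


def negative_elbo(x, x_hat, z_mean, z_log_var, beta):
    """Reconstruction loss scaled by the number of items, plus beta-weighted KL."""
    original_dim = len(x)
    reconstruction = original_dim * binary_cross_entropy(x, x_hat)
    return reconstruction + beta * kl_divergence(z_mean, z_log_var)


# Toy example: one user, four items, two latent dimensions.
rng = random.Random(0)
z = sample_latent([0.0, 0.0], [0.0, 0.0], rng)
loss = negative_elbo([1, 0, 0, 1], [0.8, 0.1, 0.2, 0.7], [0.0, 0.0], [0.0, 0.0], beta=1.0)
```

With a zero mean and zero log-variance the KL term vanishes, so the toy loss above is the reconstruction term alone.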

The module shares the same callback infrastructure as the multinomial variant:

LossHistory -- Tracks training and validation loss per epoch.
Metrics -- Computes NDCG@k on the validation set and saves the best model weights.
AnnealingCallback -- Implements KL annealing by gradually increasing beta from 0 to a cap value.
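As a rough sketch of the annealing behaviour described above, the snippet below ramps beta linearly with the number of processed batches and clips it at the cap. The linear shape and the helper names are assumptions made for illustration; the text only states that beta grows gradually from 0 to a cap value. The second helper mirrors the documented idea of returning the beta in effect at the epoch with the highest NDCG@k.

```python
def annealed_beta(step, total_anneal_steps, anneal_cap):
    """Assumed linear ramp: beta rises with the step count and is clipped at anneal_cap."""
    if total_anneal_steps <= 0:
        return anneal_cap
    return min(step / total_anneal_steps, anneal_cap)


def optimal_beta(betas_per_epoch, ndcg_per_epoch):
    """Return the beta recorded at the epoch with the highest NDCG@k."""
    best_epoch = max(range(len(ndcg_per_epoch)), key=lambda i: ndcg_per_epoch[i])
    return betas_per_epoch[best_epoch]


# Beta after 0, 1, 2, and 3 batches with a cap of 0.2 and a 10-step ramp.
schedule = [annealed_beta(s, total_anneal_steps=10, anneal_cap=0.2) for s in range(4)]
```

A low cap such as 0.2 keeps the KL term from dominating the reconstruction term late in training.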

Training uses a custom batch generator with shuffling, the Adam optimizer, and ReduceLROnPlateau learning-rate scheduling. KL annealing is optional and is enabled through the annealing parameter.
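A shuffling batch generator of the kind mentioned above might look like the following. This is a generic sketch rather than the module's generator: the function name is hypothetical, rows are plain lists instead of NumPy arrays, and the reshuffle on every pass is an assumption. Because the model is an autoencoder, each batch serves as both input and target.

```python
import random


def batch_generator(rows, batch_size, rng):
    """Yield (input, target) mini-batches forever, reshuffling the rows on each pass."""
    indices = list(range(len(rows)))
    while True:
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = [rows[i] for i in indices[start:start + batch_size]]
            # An autoencoder reconstructs its input, so the target equals the input.
            yield batch, batch


# Five users with one-item click vectors, drawn in batches of three.
rows = [[i] for i in range(5)]
gen = batch_generator(rows, batch_size=3, rng=random.Random(0))
first_inputs, first_targets = next(gen)
second_inputs, second_targets = next(gen)
```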

Usage

Use StandardVAE as an alternative to Mult_VAE for implicit feedback collaborative filtering. The binary cross-entropy formulation may be preferred when items are better modeled as independent Bernoulli variables rather than as a single multinomial distribution. Training both models on the same data gives a comparative baseline for deciding which VAE loss formulation works better for a specific dataset.
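To make the difference between the two formulations concrete, the sketch below scores the same click vector under both likelihoods. Both functions are illustrative and hypothetical, not taken from either module. The Bernoulli version penalizes every item independently, while the multinomial version normalizes over all items, so items compete for probability mass.

```python
import math


def bernoulli_nll(x, probs, eps=1e-7):
    """Negative log-likelihood with one independent Bernoulli per item."""
    total = 0.0
    for target, p in zip(x, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return total


def multinomial_nll(x, logits):
    """Negative log-likelihood under a softmax over all items (clicked items only)."""
    shift = max(logits)  # subtract the max for numerical stability
    log_norm = shift + math.log(sum(math.exp(v - shift) for v in logits))
    return -sum(target * (v - log_norm) for target, v in zip(x, logits))


clicks = [1, 1, 0, 0]
per_item_loss = bernoulli_nll(clicks, [0.9, 0.9, 0.1, 0.1])
shared_mass_loss = multinomial_nll(clicks, [2.0, 2.0, 0.0, 0.0])
```

The two values are on different scales, so ranking metrics such as NDCG@k are a fairer basis for comparison than the raw losses.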

Code Reference

Source Location

Repository: Recommenders
File: recommenders/models/vae/standard_vae.py
Lines: 1-491

Signature

class LossHistory(Callback):
    def on_train_begin(self, logs={})
    def on_epoch_end(self, epoch, logs={})

class Metrics(Callback):
    def __init__(self, model, val_tr, val_te, mapper, k, save_path=None)
    def recommend_k_items(self, x, k, remove_seen=True)
    def on_epoch_end(self, batch, logs={})
    def get_data(self)

class AnnealingCallback(Callback):
    def __init__(self, beta, anneal_cap, total_anneal_steps)
    def on_batch_end(self, epoch, logs={})
    def on_epoch_end(self, epoch, logs={})
    def get_data(self)

class StandardVAE:
    def __init__(self, n_users, original_dim, intermediate_dim=200, latent_dim=70,
                 n_epochs=400, batch_size=100, k=100, verbose=1,
                 drop_encoder=0.5, drop_decoder=0.5, beta=1.0,
                 annealing=False, anneal_cap=1.0, seed=None, save_path=None)
    def fit(self, x_train, x_valid, x_val_tr, x_val_te, mapper)
    def recommend_k_items(self, x, k, remove_seen=True)
    def get_optimal_beta(self)
    def display_metrics(self)
    def ndcg_per_epoch(self)

Import

from recommenders.models.vae.standard_vae import StandardVAE

I/O Contract

Inputs

Name	Type	Required	Description
n_users	int	Yes	Number of unique users in the training set
original_dim	int	Yes	Number of unique items in the training set
intermediate_dim	int	No	Dimension of the intermediate hidden layer (default: 200)
latent_dim	int	No	Dimension of the latent space (default: 70)
n_epochs	int	No	Number of training epochs (default: 400)
batch_size	int	No	Batch size for training (default: 100)
k	int	No	Number of top-k items per user for NDCG evaluation (default: 100)
verbose	int	No	Whether to show training output (default: 1)
drop_encoder	float	No	Dropout rate for the encoder (default: 0.5)
drop_decoder	float	No	Dropout rate for the decoder (default: 0.5)
beta	float	No	Constant KL divergence weight when not using annealing (default: 1.0)
annealing	bool	No	Whether to use KL annealing during training (default: False)
anneal_cap	float	No	Maximum beta value during annealing (default: 1.0)
seed	int	No	Random seed for reproducibility (default: None)
save_path	str	No	Path to save the best model weights (default: None)
x_train	numpy.ndarray	Yes	Click matrix for the training set (for fit)
x_valid	numpy.ndarray	Yes	Click matrix for the validation set (for fit)
x_val_tr	numpy.ndarray	Yes	Click matrix for the validation training partition (for fit)
x_val_te	numpy.ndarray	Yes	Click matrix for the validation testing partition (for fit)
mapper	object	Yes	AffinityMatrix mapper for converting sparse matrices to DataFrames (for fit)

Outputs

Name	Type	Description
recommend_k_items return	numpy.ndarray	Score matrix with the same shape as x, zero everywhere except each user's top-k scores
get_optimal_beta return	float	Optimal beta value at the epoch with the highest NDCG@k
ndcg_per_epoch return	list	List of NDCG@k values at each training epoch
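The top-k output can be illustrated with a small pure-Python sketch. It assumes the returned array keeps the shape of the input and is zero outside each user's k highest scores, and that remove_seen zeroes the scores of items the user has already interacted with. The function name and list-based types are hypothetical.

```python
def top_k_scores(scores, seen, k, remove_seen=True):
    """Keep each user's k highest scores and zero out everything else."""
    result = []
    for user_scores, user_seen in zip(scores, seen):
        row = list(user_scores)
        if remove_seen:
            # Items the user already interacted with are not recommended again.
            row = [0.0 if s != 0 else v for v, s in zip(row, user_seen)]
        top = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        keep = set(top)
        result.append([v if i in keep else 0.0 for i, v in enumerate(row)])
    return result


# One user, four items; the user has already clicked item 1.
predicted = [[0.1, 0.9, 0.5, 0.3]]
clicked = [[0, 1, 0, 0]]
unseen_top2 = top_k_scores(predicted, clicked, k=2)
all_top2 = top_k_scores(predicted, clicked, k=2, remove_seen=False)
```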

Usage Examples

Basic Usage

from recommenders.models.vae.standard_vae import StandardVAE

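# n_users, n_items, the click matrices (train_data, valid_data, val_train,
# val_test, test_data), and the AffinityMatrix mapper `am` are assumed to be
# prepared beforehand.

# Initialize the model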
model = StandardVAE(
    n_users=n_users,
    original_dim=n_items,
    intermediate_dim=200,
    latent_dim=70,
    n_epochs=100,
    batch_size=100,
    k=100,
    annealing=True,
    anneal_cap=0.2,
    seed=42,
    save_path="best_standard_vae.h5"
)

# Train the model
model.fit(
    x_train=train_data,
    x_valid=valid_data,
    x_val_tr=val_train,
    x_val_te=val_test,
    mapper=am
)

# Generate top-k recommendations
top_k = model.recommend_k_items(x=test_data, k=10, remove_seen=True)

# Display training metrics
model.display_metrics()

# Check NDCG@k progression
ndcg_values = model.ndcg_per_epoch()
