# Implementation: Recommenders Benchmark Predict and Recommend
| Knowledge Sources | |
|---|---|
| Domains | Recommender Systems, Benchmarking, Prediction |
| Last Updated | 2026-02-10 00:00 GMT |
## Overview

Concrete tool for generating rating predictions and top-K recommendations across all benchmarked algorithms, with timing instrumentation.
## Description

The `predict_*` and `recommend_k_*` function families in `benchmark_utils.py` provide standardized interfaces for generating predictions and recommendations from trained models. Each function wraps the algorithm-specific prediction logic in a `Timer` context manager and returns a `(results, Timer)` tuple, so the benchmark loop can record wall-clock time uniformly across algorithms.
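The wrapping pattern can be sketched as follows. This is an illustrative stand-in, not the source: the `Timer` class below is a minimal substitute for the real one used by `benchmark_utils`, and `predict_generic` / `model.predict` are hypothetical names showing only the call shape.

```python
from time import perf_counter

class Timer:
    """Stand-in for the Timer context manager used by benchmark_utils;
    it records elapsed wall-clock seconds in `interval` on exit."""
    def __enter__(self):
        self._start = perf_counter()
        return self

    def __exit__(self, *exc):
        self.interval = perf_counter() - self._start
        return False

def predict_generic(model, test):
    # The algorithm-specific prediction call goes inside the Timer block,
    # so only the prediction work is timed.
    with Timer() as t:
        predictions = model.predict(test)
    return predictions, t
```

Each real `predict_*` function follows this shape, differing only in the prediction call inside the `with` block.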
Rating Prediction Functions (`predict_*`):

- `predict_als`: Calls `model.transform(test)` on the Spark test DataFrame.
- `predict_svd`: Uses `surprise_utils.predict()` to generate predictions for test user-item pairs.
- `predict_embdotbias`: Uses the `score()` utility to predict ratings for test pairs.
Top-K Recommendation Functions (`recommend_k_*`):

- `recommend_k_sar`: Calls `model.recommend_k_items(test, top_k, remove_seen)`.
- `recommend_k_als`: Performs a cross-join of all users and items, scores with the model, then removes seen items via outer join.
- `recommend_k_svd`: Uses `compute_ranking_predictions()` from surprise_utils.
- `recommend_k_ncf`: Iterates over all users, scores all items per user, then removes seen items via pandas merge.
- `recommend_k_bpr`: Calls `model.recommend_k_items()` from the BPR wrapper.
- `recommend_k_bivae`: Uses `predict_ranking()` from cornac_utils.
- `recommend_k_embdotbias`: Builds a Cartesian product of test users and all items, optionally removes seen items, then scores with `score()`.
- `recommend_k_lightgcn`: Calls `model.recommend_k_items(test, top_k, remove_seen)`.
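The "remove seen items via pandas merge" step mentioned above can be sketched like this. The function name `remove_seen` is hypothetical; the column names follow the `userID`/`itemID` convention from the I/O contract below, and the real implementation may differ in detail.

```python
import pandas as pd

def remove_seen(scores: pd.DataFrame, train: pd.DataFrame) -> pd.DataFrame:
    """Drop (user, item) pairs from `scores` that already appear in `train`."""
    merged = scores.merge(
        train[["userID", "itemID"]],
        on=["userID", "itemID"],
        how="left",
        indicator=True,  # adds a `_merge` column marking the row's origin
    )
    # Rows present only in `scores` are unseen; keep those.
    return (
        merged[merged["_merge"] == "left_only"]
        .drop(columns="_merge")
        .reset_index(drop=True)
    )
```

The `indicator=True` flag is what makes the anti-join possible in a single merge, avoiding a slow per-row membership check.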
## Usage

Use these functions after training models in the benchmark loop. They are registered in dispatch dictionaries (one for rating prediction, one for ranking prediction), enabling the benchmark loop to call them generically based on algorithm capabilities.
## Code Reference

### Source Location

- Repository: recommenders
- File: `examples/06_benchmarks/benchmark_utils.py` (Lines 112-400)
### Signature

```python
# Rating prediction functions
def predict_als(model, test) -> tuple[pyspark.sql.DataFrame, Timer]
def predict_svd(model, test) -> tuple[pd.DataFrame, Timer]
def predict_embdotbias(model, test) -> tuple[pd.DataFrame, Timer]

# Top-K recommendation functions
def recommend_k_sar(model, test, train, top_k=DEFAULT_K, remove_seen=True) -> tuple[pd.DataFrame, Timer]
def recommend_k_als(model, test, train, top_k=DEFAULT_K, remove_seen=True) -> tuple[pyspark.sql.DataFrame, Timer]
def recommend_k_svd(model, test, train, top_k=DEFAULT_K, remove_seen=True) -> tuple[pd.DataFrame, Timer]
def recommend_k_ncf(model, test, train, top_k=DEFAULT_K, remove_seen=True) -> tuple[pd.DataFrame, Timer]
def recommend_k_bpr(model, test, train, top_k=DEFAULT_K, remove_seen=True) -> tuple[pd.DataFrame, Timer]
def recommend_k_bivae(model, test, train, top_k=DEFAULT_K, remove_seen=True) -> tuple[pd.DataFrame, Timer]
def recommend_k_embdotbias(model, test, train, top_k=DEFAULT_K, remove_seen=True) -> tuple[pd.DataFrame, Timer]
def recommend_k_lightgcn(model, test, train, top_k=DEFAULT_K, remove_seen=True) -> tuple[pd.DataFrame, Timer]
```
### Import

```python
import sys

sys.path.append("examples/06_benchmarks")
from benchmark_utils import (
    predict_als,
    predict_svd,
    predict_embdotbias,
    recommend_k_sar,
    recommend_k_als,
    recommend_k_svd,
    recommend_k_ncf,
    recommend_k_bpr,
    recommend_k_bivae,
    recommend_k_embdotbias,
    recommend_k_lightgcn,
)
```
## I/O Contract

### Rating Prediction Functions

| Function | Input: model | Input: test | Output: predictions | Output: Timer |
|---|---|---|---|---|
| `predict_als` | ALSModel | pyspark.sql.DataFrame | pyspark.sql.DataFrame (with prediction column) | Wall-clock time |
| `predict_svd` | surprise.SVD | pd.DataFrame | pd.DataFrame (userID, itemID, prediction) | Wall-clock time |
| `predict_embdotbias` | EmbeddingDotBias | pd.DataFrame (str-typed user/item) | pd.DataFrame (userID, itemID, prediction) | Wall-clock time |
### Top-K Recommendation Functions

| Function | Input: model | Input: test | Input: train | Input: top_k | Input: remove_seen | Output |
|---|---|---|---|---|---|---|
| `recommend_k_sar` | SAR | pd.DataFrame | pd.DataFrame (unused) | int (default 10) | bool (default True) | (pd.DataFrame, Timer) |
| `recommend_k_als` | ALSModel | pyspark.sql.DataFrame | pyspark.sql.DataFrame | int (default 10) | bool (default True) | (pyspark.sql.DataFrame, Timer) |
| `recommend_k_svd` | surprise.SVD | pd.DataFrame | pd.DataFrame | int (default 10) | bool (default True) | (pd.DataFrame, Timer) |
| `recommend_k_ncf` | NCF | pd.DataFrame | pd.DataFrame | int (default 10) | bool (default True) | (pd.DataFrame, Timer) |
| `recommend_k_bpr` | BPR | pd.DataFrame | pd.DataFrame | int (default 10) | bool (default True) | (pd.DataFrame, Timer) |
| `recommend_k_bivae` | BiVAECF | pd.DataFrame | pd.DataFrame | int (default 10) | bool (default True) | (pd.DataFrame, Timer) |
| `recommend_k_embdotbias` | EmbeddingDotBias | pd.DataFrame (str) | pd.DataFrame (str) | int (default 10) | bool (default True) | (pd.DataFrame, Timer) |
| `recommend_k_lightgcn` | LightGCN | pd.DataFrame | pd.DataFrame (unused) | int (default 10) | bool (default True) | (pd.DataFrame, Timer) |
## Usage Examples

```python
from benchmark_utils import *

# Build dispatch dictionaries
rating_predictor = {
    "als": lambda model, test: predict_als(model, test),
    "svd": lambda model, test: predict_svd(model, test),
    "embdotbias": lambda model, test: predict_embdotbias(model, test),
}
ranking_predictor = {
    "als": lambda model, test, train: recommend_k_als(model, test, train),
    "sar": lambda model, test, train: recommend_k_sar(model, test, train),
    "svd": lambda model, test, train: recommend_k_svd(model, test, train),
    "ncf": lambda model, test, train: recommend_k_ncf(model, test, train),
    "bpr": lambda model, test, train: recommend_k_bpr(model, test, train),
    "bivae": lambda model, test, train: recommend_k_bivae(model, test, train),
    "embdotbias": lambda model, test, train: recommend_k_embdotbias(model, test, train),
    "lightgcn": lambda model, test, train: recommend_k_lightgcn(model, test, train),
}

# In the benchmark loop:
if "rating" in metrics[algo]:
    preds, time_rating = rating_predictor[algo](model, test)
if "ranking" in metrics[algo]:
    top_k_scores, time_ranking = ranking_predictor[algo](model, test, train)
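The dispatch pattern can also be exercised end to end without any trained model. The sketch below is self-contained and entirely hypothetical: `predict_stub` and the lambda "model" stand in for a real predictor, returning `(results, elapsed_seconds)` in place of `(results, Timer)`.

```python
import time

def predict_stub(model, test):
    # Stand-in predictor: times the scoring call and returns (results, seconds),
    # mirroring the (results, Timer) shape of the real predict_* functions.
    start = time.perf_counter()
    preds = [model(x) for x in test]
    return preds, time.perf_counter() - start

rating_predictor = {"stub": predict_stub}
metrics = {"stub": ["rating"]}          # which metric families each algo supports
model, test = (lambda x: x * 2), [1, 2, 3]  # toy "model" and "test set"

for algo in metrics:
    if "rating" in metrics[algo]:
        preds, time_rating = rating_predictor[algo](model, test)
```

Because every entry in the dispatch dictionary shares one call signature, the loop body never needs algorithm-specific branches.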