# Implementation: Recommenders_team_Recommenders_BaseModel_Run_Eval
| Knowledge Sources | |
|---|---|
| Domains | News Recommendation, Evaluation Metrics |
| Last Updated | 2026-02-10 00:00 GMT |
## Overview
Concrete tool for evaluating a trained news recommendation model on validation or test data, computing impression-level AUC, MRR, NDCG@5, and NDCG@10 metrics.
## Description
BaseModel.run_eval is the primary evaluation entry point for all neural news recommendation models that inherit from BaseModel (including NRMSModel, NAMLModel, LSTURModel, and NPAModel). It performs the following:
- Mode selection — checks `self.support_quick_scoring` to decide between fast and slow evaluation:
  - If `True`, delegates to `run_fast_eval`, which pre-computes news and user embeddings.
  - If `False`, delegates to `run_slow_eval`, which runs the full scorer per impression.
- Metric computation — passes the grouped labels and predictions to `cal_metric` from `deeprec_utils`, which computes the metrics specified in `self.hparams.metrics`.
- Result return — returns a dictionary mapping metric names to their computed values.
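To make the metric step concrete, here is a simplified sketch of the kind of per-impression computation that `cal_metric` performs. The helper functions below (`mrr`, `dcg_at_k`, `ndcg_at_k`) and the toy groups are illustrative, not the library's actual API:

```python
import numpy as np

def mrr(labels, scores):
    # Reciprocal ranks of clicked items, averaged over the clicks in one impression.
    order = np.argsort(scores)[::-1]
    rel = np.asarray(labels)[order]
    ranks = np.arange(1, len(rel) + 1)
    return float(np.sum(rel / ranks) / np.sum(rel))

def dcg_at_k(labels, scores, k):
    # Rank labels by descending score, then apply the log2 position discount.
    order = np.argsort(scores)[::-1]
    gains = np.asarray(labels, dtype=float)[order][:k]
    discounts = np.log2(np.arange(2, gains.size + 2))
    return float(np.sum(gains / discounts))

def ndcg_at_k(labels, scores, k):
    ideal = dcg_at_k(labels, labels, k)  # DCG of a perfect ranking
    return dcg_at_k(labels, scores, k) / ideal if ideal > 0 else 0.0

# Two toy impression groups: (labels, predicted scores)
groups = [([1, 0, 0, 0], [0.9, 0.4, 0.2, 0.1]),
          ([0, 1, 0],    [0.5, 0.3, 0.2])]

# Each metric is computed per group, then averaged across impressions.
mean_mrr = float(np.mean([mrr(l, s) for l, s in groups]))
ndcg5 = float(np.mean([ndcg_at_k(l, s, 5) for l, s in groups]))
```

The key point is that metrics are computed within each impression group and then averaged, which is why `run_eval` must first regroup the flat per-sample predictions by impression.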
The slow evaluation path iterates over all test data batches, collects per-sample predictions, labels, and impression indices, then groups them. The fast evaluation path encodes all news articles and users once, then computes scores via numpy dot products.
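A minimal sketch of these two mechanics, using made-up toy values (in the real code the embeddings come from the model's news and user encoders, and the impression indices from the data iterator):

```python
import numpy as np

# Fast path: embeddings are computed once, then scored with dot products.
news_vecs = np.array([[0.1, 0.9],   # toy 2-d article embeddings
                      [0.8, 0.2],
                      [0.4, 0.4]])
user_vec = np.array([0.7, 0.3])      # toy 2-d user embedding
scores = news_vecs @ user_vec        # one score per candidate article

# Slow-path bookkeeping: per-sample outputs are regrouped by impression index.
imp_indices = np.array([0, 0, 1, 1, 1])
preds = np.array([0.9, 0.2, 0.6, 0.4, 0.1])
labels = np.array([1, 0, 0, 1, 0])
group_preds = [preds[imp_indices == i] for i in np.unique(imp_indices)]
group_labels = [labels[imp_indices == i] for i in np.unique(imp_indices)]
```

The grouped lists are what get handed to the metric computation; the dot-product scoring is why the fast path avoids re-running the full network per impression.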
## Usage
Call `run_eval` after training to assess model quality. It is also called automatically at the end of each epoch by `fit()` to report validation metrics.
## Code Reference
### Source Location
- Repository: recommenders-team/recommenders
- File: `recommenders/models/newsrec/models/base_model.py` (lines 321-340)
### Signature
```python
def run_eval(self, news_filename: str, behaviors_file: str) -> dict:
    """Evaluate the given files and return evaluation metrics.

    Args:
        news_filename (str): Path to the news metadata file (news.tsv).
        behaviors_file (str): Path to the user behaviors file (behaviors.tsv).

    Returns:
        dict: A dictionary containing evaluation metrics
            (e.g., {"group_auc": 0.67, "mean_mrr": 0.33, "ndcg@5": 0.36, "ndcg@10": 0.42}).
    """
```
### Import
```python
# Accessed via the NRMSModel class (inherits from BaseModel)
from recommenders.models.newsrec.models.nrms import NRMSModel
from recommenders.models.newsrec.io.mind_iterator import MINDIterator

model = NRMSModel(hparams, MINDIterator, seed=42)
# model.run_eval(...) is inherited from BaseModel
```
## I/O Contract

| Parameter | Type | Description |
|---|---|---|
| `news_filename` | `str` | Path to the `news.tsv` file containing news article metadata |
| `behaviors_file` | `str` | Path to the `behaviors.tsv` file containing user impression logs |

| Return | Type | Description |
|---|---|---|
| `res` | `dict` | Dictionary mapping metric names to values, e.g., `{"group_auc": 0.67, "mean_mrr": 0.33, "ndcg@5": 0.36, "ndcg@10": 0.42}` |
## Usage Examples
```python
import os

# After training the model
valid_news_file = os.path.join(data_path, "valid", "news.tsv")
valid_behaviors_file = os.path.join(data_path, "valid", "behaviors.tsv")

# Run evaluation
eval_results = model.run_eval(valid_news_file, valid_behaviors_file)

# Print results
for metric, value in sorted(eval_results.items()):
    print(f"{metric}: {value:.4f}")

# Example output:
# group_auc: 0.6713
# mean_mrr: 0.3298
# ndcg@10: 0.4231
# ndcg@5: 0.3612
```
## Dependencies
- `tensorflow` — for running model inference during slow evaluation
- `numpy` — for array operations and grouping predictions
- `recommenders.models.deeprec.deeprec_utils.cal_metric` — computes AUC, MRR, NDCG, and other ranking metrics from grouped predictions
## Related Pages
### Implements Principle
### Requires Environment
- Environment:Recommenders_team_Recommenders_Python_Core_Dependencies
- Environment:Recommenders_team_Recommenders_GPU_CUDA_Environment