Principle: Recommenders News Model Training
| Knowledge Sources | |
|---|---|
| Domains | News Recommendation, Model Training, Deep Learning |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Training neural news recommendation models follows an epoch-based loop with per-epoch validation, where each training step computes impression-level loss over batches constructed with negative sampling (controlled by the npratio hyperparameter).
Description
The training principle for neural news recommenders encompasses the full training loop lifecycle:
Epoch-Based Training:
- For each epoch (1 to `hparams.epochs`):
  - Load training data from the news and behaviors files via the train iterator.
  - The train iterator applies negative sampling with ratio `npratio`: for each positive (clicked) candidate, `npratio` negative (non-clicked) candidates are sampled from the same impression.
  - For each mini-batch:
    - Extract input features (clicked title batch and candidate title batch) and labels.
    - Run one gradient update step using `model.train_on_batch`.
    - Accumulate the batch loss into the epoch loss.
    - Periodically log the running loss (every `show_step` steps).
  - After all batches, compute the epoch training time.
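The shape of this per-batch update can be sketched with a toy stand-in for the model. Everything below (`ToyModel`, the linear scorer, the synthetic batches) is an illustrative assumption, not the actual NRMS code; only the loop structure — `train_on_batch`, running-loss accumulation, logging every `show_step` steps — mirrors the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class ToyModel:
    """Hypothetical stand-in for NRMS: scores candidates with a linear map."""
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.lr = lr

    def train_on_batch(self, cand_vecs, label):
        # cand_vecs: (npratio + 1, dim); label: one-hot, positive at index 0
        scores = cand_vecs @ self.w
        probs = softmax(scores)
        loss = -np.log(probs[label.argmax()])
        grad = cand_vecs.T @ (probs - label)  # gradient of cross-entropy w.r.t. w
        self.w -= self.lr * grad
        return loss

dim, npratio, show_step = 8, 4, 50
model = ToyModel(dim)
label = np.eye(npratio + 1)[0]           # positive candidate at index 0
epoch_loss = 0.0
for step in range(1, 201):
    cand = rng.normal(size=(npratio + 1, dim))
    cand[0] += 1.0                       # give the "clicked" article a learnable signal
    epoch_loss += model.train_on_batch(cand, label)
    if step % show_step == 0:
        print(f"step {step}: running loss {epoch_loss / step:.4f}")
```

The running loss starts at chance level (log of `npratio + 1`) and falls as the scorer picks up the signal, which is the same behavior the periodic log lines surface during real training.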
Per-Epoch Validation:
- At the end of each epoch, run `run_eval` on the validation set.
- Log the validation metrics (AUC, MRR, NDCG@5, NDCG@10).
- Optionally, if test files are provided, also evaluate on the test set.
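The validation metrics have standard per-impression definitions that can be sketched directly; the helper names below are illustrative, and this is not the library's exact implementation.

```python
import numpy as np

def auc(labels, scores):
    # Probability that a random positive outranks a random negative (ties count half).
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def mrr(labels, scores):
    # Mean reciprocal rank of the clicked articles.
    order = np.argsort(scores)[::-1]
    ranks = np.where(labels[order] == 1)[0] + 1
    return (1.0 / ranks).mean()

def ndcg_at_k(labels, scores, k):
    # Normalized discounted cumulative gain at cutoff k.
    order = np.argsort(scores)[::-1]
    gains = labels[order][:k].astype(float)
    ideal = np.sort(labels)[::-1][:k].astype(float)
    discounts = 1.0 / np.log2(np.arange(2, len(gains) + 2))
    return (gains * discounts).sum() / (ideal * discounts).sum()

# One impression: 5 candidates, articles 0 and 4 were clicked.
labels = np.array([1, 0, 0, 0, 1])
scores = np.array([0.9, 0.8, 0.3, 0.2, 0.5])
print(f"AUC={auc(labels, scores):.4f}, MRR={mrr(labels, scores):.4f}, "
      f"NDCG@5={ndcg_at_k(labels, scores, 5):.4f}")
```

In the real evaluation these per-impression values are averaged over all impressions in the validation set.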
Loss Function:
- For NRMS, the loss is categorical cross-entropy over the softmax of `npratio + 1` candidate scores, where the positive candidate is the target class.
- This is equivalent to a listwise ranking loss that encourages the model to assign the highest probability to the clicked article.
Training Control:
- An optional `step_limit` parameter allows early termination within an epoch (useful for debugging or quick experiments).
- Progress is displayed via a tqdm progress bar showing the step count, total loss, and data loss.
Usage
Use model training after dataset preparation, hyperparameter configuration, and model initialization. The fit() method is the standard entry point that handles the full training and validation loop. Training produces a model ready for final evaluation or prediction export.
Theoretical Basis
Negative Sampling
For each impression in the training data:

    positive_candidate = clicked article
    negative_candidates = sample(npratio articles from non-clicked articles)
    # if fewer non-clicked articles than npratio, pad with zeros
    Input to model: [positive, neg_1, neg_2, ..., neg_npratio]
    Label: [1, 0, 0, ..., 0]  # one-hot with positive at index 0
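A minimal sketch of this sampling scheme, assuming a hypothetical `build_candidates` helper in which article IDs stand in for full feature rows:

```python
import random

def build_candidates(clicked, non_clicked, npratio, pad_id=0, rng=random):
    """Build one training sample from an impression: 1 positive + npratio
    negatives, zero-padded when the impression has too few non-clicked items."""
    if len(non_clicked) >= npratio:
        negatives = rng.sample(non_clicked, npratio)
    else:
        # fewer non-clicked articles than npratio: pad with zeros
        negatives = non_clicked + [pad_id] * (npratio - len(non_clicked))
    candidates = [clicked] + negatives
    label = [1] + [0] * npratio
    return candidates, label

# Impression with only 2 non-clicked articles and npratio = 4: padded with zeros.
print(build_candidates(7, [1, 2], 4))
```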
Training Loss
    scores = dot(candidate_news_vectors, user_vector)  # shape: (npratio + 1,)
    probabilities = softmax(scores)
    loss = categorical_cross_entropy(one_hot_label, probabilities)

The loss encourages: score(clicked_article) >> score(non_clicked_articles)
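The same computation in numpy — a sketch of the formula above, not the actual Keras training code:

```python
import numpy as np

def listwise_loss(candidate_vecs, user_vec, positive_index=0):
    """Categorical cross-entropy over the softmax of npratio + 1 scores."""
    scores = candidate_vecs @ user_vec
    scores -= scores.max()                       # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return -np.log(probs[positive_index])
```

With equal scores for all candidates the loss is `log(npratio + 1)` (chance level); as the clicked article's score dominates, the loss approaches zero, which is exactly the ranking pressure described above.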
Training Loop Pseudocode
for epoch in range(1, epochs + 1):
epoch_loss = 0
step = 0
for batch in train_iterator.load_data(train_news, train_behaviors):
if step_limit and step >= step_limit:
break
loss = model.train_on_batch(batch_input, batch_label)
epoch_loss += loss
step += 1
# Per-epoch validation
eval_results = run_eval(valid_news, valid_behaviors)
print(f"Epoch {epoch}: loss={epoch_loss/step:.4f}, AUC={eval_results['group_auc']:.4f}")