Workflow:Recommenders team Recommenders News Recommendation NRMS

Knowledge Sources	Recommenders NRMS Paper MIND Dataset
Domains	Recommendation_Systems, Content_Based_Filtering, News_Recommendation, Deep_Learning
Last Updated	2026-02-09 23:00 GMT

Overview

End-to-end process for building a neural news recommendation system using NRMS (Neural News Recommendation with Multi-Head Self-Attention) on the MIND dataset.

Description

This workflow implements a content-based neural news recommendation pipeline using the NRMS architecture. NRMS uses multi-head self-attention mechanisms to learn both news representations (from title words) and user representations (from browsing history). Unlike collaborative filtering approaches, NRMS leverages the textual content of news articles and models user reading sequences to capture evolving interests. The pipeline covers MIND dataset download, GloVe word embedding preparation, hyperparameter configuration via YAML, model training with Keras, and evaluation using news recommendation-specific metrics (group AUC, MRR, NDCG).

Usage

Execute this workflow when you need to build a news or article recommendation system that incorporates textual content understanding. It is appropriate when you have a dataset with article content (titles, bodies) and user reading history (behavior logs). NRMS is particularly suited for news platforms, content aggregators, or any scenario where items are text-heavy and user interests evolve rapidly over time. The MIND dataset provides a standardized benchmark for development and evaluation.

Execution Steps

Step 1: Dataset Download and Preparation

Download the MIND (Microsoft News Dataset) training and validation sets along with the required utility files. The MIND dataset contains news articles with titles, abstracts, categories, and subcategories, plus user behavior logs recording impressions and clicks. Also download the pre-trained GloVe word embeddings, the word dictionary used to encode title text, and the user dictionary used to index users.

Key considerations:

MIND comes in demo, small, and large variants for different scales
The utility files (embedding, word dict, user dict) are dataset-specific
The YAML configuration file loaded in Step 2 holds the model and training hyperparameters
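
The behavior logs described above are tab-separated rows with five fields: impression ID, user ID, time, click history, and the impression's candidates, where each candidate is a news ID with a `-1` (clicked) or `-0` (not clicked) suffix. A minimal parsing sketch follows. The `Impression` type, the function name, and the sample row are illustrative assumptions and are not taken from the Recommenders code base.

```python
from typing import List, NamedTuple, Tuple


class Impression(NamedTuple):
    impression_id: str
    user_id: str
    time: str
    history: List[str]                 # news IDs the user clicked earlier
    candidates: List[Tuple[str, int]]  # (news ID, click label) pairs


def parse_behavior_line(line: str) -> Impression:
    """Parse one tab-separated row of a MIND behaviors file."""
    impression_id, user_id, time, history, impressions = line.rstrip("\n").split("\t")
    candidates = []
    for item in impressions.split():
        # The label is the part after the last dash, e.g. "N55689-1".
        news_id, label = item.rsplit("-", 1)
        candidates.append((news_id, int(label)))
    return Impression(impression_id, user_id, time, history.split(), candidates)


# Made-up sample row in the behaviors layout.
row = "1\tU13740\t11/11/2019 9:05:58 AM\tN55189 N42782\tN55689-1 N35729-0"
parsed = parse_behavior_line(row)
print(parsed.history)     # ['N55189', 'N42782']
print(parsed.candidates)  # [('N55689', 1), ('N35729', 0)]
```

A user with no click history has an empty fourth field, which this sketch turns into an empty list.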

Step 2: Hyperparameter Configuration

Load and configure model hyperparameters from a YAML configuration file with runtime overrides. Key parameters include word embedding dimensions, attention head count, attention hidden dimensions, title word count limit, user history length, batch size, learning rate, and number of training epochs.

Key considerations:

Word embedding dimension must match the pre-trained GloVe vectors
head_num and head_dim control the multi-head self-attention capacity
title_size determines how many words from each title are used
his_size controls how many recent articles from browsing history are considered
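
The considerations above can be sketched in plain Python. The dictionary stands in for what a YAML loader would return, and its values are placeholders rather than recommended settings. The helper names (`apply_overrides`, `check_embedding_dim`, `pad_or_truncate`) are hypothetical, and the choice to keep the most recent history entries with left padding is an assumption of this sketch, not a statement about the library's iterator.

```python
from typing import Any, Dict, List

# Stand-in for the dictionary a YAML loader would return; values are illustrative.
yaml_config: Dict[str, Any] = {
    "word_emb_dim": 300,
    "head_num": 20,
    "head_dim": 20,
    "title_size": 30,
    "his_size": 50,
    "batch_size": 32,
    "learning_rate": 0.0001,
    "epochs": 5,
}


def apply_overrides(config: Dict[str, Any], **overrides: Any) -> Dict[str, Any]:
    """Return a copy of config with runtime overrides applied; reject unknown keys."""
    unknown = set(overrides) - set(config)
    if unknown:
        raise KeyError(f"unknown hyperparameters: {sorted(unknown)}")
    return {**config, **overrides}


def check_embedding_dim(config: Dict[str, Any], embedding_dim: int) -> None:
    """Fail early when word_emb_dim differs from the pre-trained vector width."""
    if config["word_emb_dim"] != embedding_dim:
        raise ValueError(
            f"word_emb_dim={config['word_emb_dim']} but embeddings have width {embedding_dim}"
        )


def pad_or_truncate(ids: List[int], size: int, keep: str = "head") -> List[int]:
    """Fix a sequence to `size` entries, padding with index 0.

    keep="head" keeps the first entries and pads on the right (titles).
    keep="tail" keeps the last entries and pads on the left (history).
    """
    if keep == "head":
        kept = ids[:size]
        return kept + [0] * (size - len(kept))
    kept = ids[-size:] if size > 0 else []
    return [0] * (size - len(kept)) + kept


hparams = apply_overrides(yaml_config, batch_size=64, epochs=2)
check_embedding_dim(hparams, embedding_dim=300)
print(pad_or_truncate([5, 6, 7], 5))                  # [5, 6, 7, 0, 0]
print(pad_or_truncate([1, 2, 3, 4], 2, keep="tail"))  # [3, 4]
```

Rejecting unknown override keys catches typos such as `batchsize` before they silently leave the default in place.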

Step 3: Model Initialization

Instantiate the NRMS model with the configured hyperparameters and a MIND data iterator. The model builds two sub-networks: a news encoder that processes article titles through word embeddings and multi-head self-attention to produce news representations, and a user encoder that processes a user's reading history through another multi-head self-attention layer to produce user representations.

What happens internally:

News encoder: word embedding layer, multi-head self-attention, additive attention
User encoder: multi-head self-attention over browsed news representations, additive attention
Click prediction: dot product between user and candidate news representations
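
The three components listed above can be illustrated with a small NumPy forward pass. This is a sketch with random, untrained weights and tiny dimensions, not the Keras implementation. The scaled dot-product form of self-attention, the `Encoder` class, and all sizes are assumptions made for the example, and dropout is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, wq, wk, wv, head_num):
    """x: (seq_len, d_in); wq, wk, wv: (d_in, head_num * head_dim)."""
    seq_len = x.shape[0]
    head_dim = wq.shape[1] // head_num

    def split(w):
        # (seq_len, head_num * head_dim) -> (head_num, seq_len, head_dim)
        return (x @ w).reshape(seq_len, head_num, head_dim).transpose(1, 0, 2)

    q, k, v = split(wq), split(wk), split(wv)
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(head_dim))
    heads = weights @ v  # (head_num, seq_len, head_dim)
    # Concatenate the heads back into one vector per position.
    return heads.transpose(1, 0, 2).reshape(seq_len, head_num * head_dim)


def additive_attention(h, w, b, q):
    """Pool the rows of h (seq_len, d) into a single (d,) vector."""
    alpha = softmax(np.tanh(h @ w + b) @ q)  # one weight per position
    return alpha @ h


class Encoder:
    """Self-attention followed by additive attention, with random weights."""

    def __init__(self, d_in, head_num, head_dim, att_dim):
        d_out = head_num * head_dim
        self.head_num = head_num
        self.wq, self.wk, self.wv = (
            rng.normal(scale=0.1, size=(d_in, d_out)) for _ in range(3)
        )
        self.w = rng.normal(scale=0.1, size=(d_out, att_dim))
        self.b = np.zeros(att_dim)
        self.q = rng.normal(scale=0.1, size=att_dim)

    def __call__(self, x):
        h = multi_head_self_attention(x, self.wq, self.wk, self.wv, self.head_num)
        return additive_attention(h, self.w, self.b, self.q)


emb_dim, head_num, head_dim = 8, 2, 4
embedding = rng.normal(size=(100, emb_dim))  # stand-in for the GloVe matrix
news_encoder = Encoder(emb_dim, head_num, head_dim, att_dim=6)
user_encoder = Encoder(head_num * head_dim, head_num, head_dim, att_dim=6)

history_titles = rng.integers(1, 100, size=(5, 10))    # 5 browsed titles, 10 word IDs each
candidate_titles = rng.integers(1, 100, size=(3, 10))  # 3 candidate titles

browsed = np.stack([news_encoder(embedding[t]) for t in history_titles])
user_vec = user_encoder(browsed)
candidates = np.stack([news_encoder(embedding[t]) for t in candidate_titles])
scores = candidates @ user_vec  # one click score per candidate
print(scores.shape)  # (3,)
```

The user encoder takes news vectors as its input sequence, so its input width equals the news encoder's output width (`head_num * head_dim`).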

Step 4: Baseline Evaluation

Evaluate the randomly initialized model on the validation set before any training. This establishes baseline metrics against which the gains from training can be measured. Metrics include group AUC, mean reciprocal rank (MRR), NDCG@5, and NDCG@10.

Key considerations:

Baseline scores should be near chance level for an untrained model (group AUC around 0.5)
This step validates that the data pipeline and model architecture are correctly configured
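
For reference, the four metrics can be computed per impression and then averaged, as in the pure-Python sketch below. It assumes every impression contains at least one clicked and one non-clicked candidate, uses the common `2^label - 1` gain with a `log2` discount for NDCG, and averages reciprocal ranks over all clicked items. The function names and result keys are chosen for this example, and the library's own implementation may differ in detail.

```python
import math
from typing import Dict, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Share of (clicked, non-clicked) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ranked_labels(labels: Sequence[int], scores: Sequence[float]) -> List[int]:
    """Labels reordered from highest to lowest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def mrr(labels: Sequence[int], scores: Sequence[float]) -> float:
    ranked = ranked_labels(labels, scores)
    return sum(y / rank for rank, y in enumerate(ranked, start=1)) / sum(labels)


def dcg(ranked: Sequence[int], k: int) -> float:
    return sum((2 ** y - 1) / math.log2(i + 2) for i, y in enumerate(ranked[:k]))


def ndcg(labels: Sequence[int], scores: Sequence[float], k: int) -> float:
    ideal = dcg(sorted(labels, reverse=True), k)
    return dcg(ranked_labels(labels, scores), k) / ideal


def evaluate(impressions: List[Tuple[Sequence[int], Sequence[float]]]) -> Dict[str, float]:
    """Average each metric over impressions (the 'group' in group AUC)."""
    n = len(impressions)
    return {
        "group_auc": sum(auc(y, s) for y, s in impressions) / n,
        "mean_mrr": sum(mrr(y, s) for y, s in impressions) / n,
        "ndcg@5": sum(ndcg(y, s, 5) for y, s in impressions) / n,
        "ndcg@10": sum(ndcg(y, s, 10) for y, s in impressions) / n,
    }


# Two toy impressions: (click labels, predicted scores).
toy = [([0, 1, 0], [0.1, 0.9, 0.3]), ([1, 0, 0], [0.2, 0.5, 0.1])]
print(evaluate(toy))
```

A model that scores every candidate identically gets an AUC of 0.5 under this tie rule, which is the reference point for the untrained baseline.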

Step 5: Model Training

Train the NRMS model on the MIND training set with periodic validation evaluation. The training process feeds batches of user behavior impressions through the model and computes the cross-entropy loss between predicted and actual click labels. Minimizing this loss teaches the model to score clicked articles higher than non-clicked candidates, which moves their news representations closer to the user representation.
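
The loss for a single training sample can be sketched as follows, assuming the negative-sampling setup described in the NRMS paper, in which each clicked article is grouped with K non-clicked articles from the same impression and the click scores are normalized with a softmax. The function names and the example scores are illustrative.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def click_loss(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Categorical cross-entropy over one clicked and K non-clicked candidates."""
    probs = softmax(scores)
    return -sum(y * math.log(p) for y, p in zip(labels, probs))


labels = [1, 0, 0]  # the first candidate is the clicked article, K = 2 negatives
uninformed = click_loss([0.0, 0.0, 0.0], labels)  # equals log(3)
good = click_loss([2.0, 0.0, 0.0], labels)        # clicked article scored highest
bad = click_loss([0.0, 2.0, 0.0], labels)         # a negative scored highest
print(good < uninformed < bad)  # True
```

Equal scores give a loss of log(K + 1), so an untrained model should start near that value.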

Key considerations:

Each epoch processes all training impressions
Validation metrics are computed after each epoch to track progress
Early stopping or best-checkpoint selection based on validation AUC is recommended
Training benefits significantly from GPU acceleration
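
Best-checkpoint selection with early stopping, as recommended above, can be sketched independently of any framework. The `train_one_epoch` and `validate` callables, the `patience` parameter, and the `group_auc` key are placeholders for whatever the actual training loop provides, and the validation scores in the example are made up.

```python
from typing import Callable, Dict, Tuple


def train_with_early_stopping(
    train_one_epoch: Callable[[int], object],
    validate: Callable[[], Dict[str, float]],
    max_epochs: int,
    patience: int = 2,
    metric: str = "group_auc",
) -> Tuple[int, float]:
    """Train until `metric` has not improved for `patience` epochs.

    Returns the epoch with the best validation score and that score.
    """
    best_epoch, best_score = 0, float("-inf")
    for epoch in range(1, max_epochs + 1):
        train_one_epoch(epoch)
        score = validate()[metric]
        if score > best_score:
            # A real loop would also save the model weights here.
            best_epoch, best_score = epoch, score
        elif epoch - best_epoch >= patience:
            break
    return best_epoch, best_score


# Made-up validation AUC values, one per epoch.
fake_auc = iter([0.55, 0.61, 0.64, 0.63, 0.62, 0.66])
epochs_run = []
best = train_with_early_stopping(
    train_one_epoch=epochs_run.append,
    validate=lambda: {"group_auc": next(fake_auc)},
    max_epochs=6,
)
print(best)        # (3, 0.64)
print(epochs_run)  # [1, 2, 3, 4, 5]
```

In this run, training stops after epoch 5 because epochs 4 and 5 did not beat epoch 3, so the later improvement at epoch 6 is never seen. A larger `patience` trades extra training time for a lower risk of stopping too early.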

Step 6: Final Evaluation and Prediction Export

Evaluate the trained model on the validation set to measure final performance. Optionally, use the fast evaluation mode that pre-computes news embeddings for efficient scoring. Generate prediction scores for submission or deployment, and optionally export results in the MIND competition format as a ZIP file containing ranked recommendation lists.
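
The idea behind the fast evaluation mode can be sketched as a cache of news vectors: each article is encoded once, and every impression then reuses the cached vectors for both the user history and the candidates. The function below and its toy encoders are hypothetical stand-ins and do not reproduce the library's API.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def fast_score(
    news_titles: Dict[str, Sequence[int]],
    impressions: List[Tuple[List[str], List[str]]],
    encode_news: Callable[[Sequence[int]], Vector],
    encode_user: Callable[[List[Vector]], Vector],
) -> List[List[float]]:
    """Score each impression's candidates using news vectors computed once."""
    # Encode every article exactly once, regardless of how often it appears.
    news_vecs = {news_id: encode_news(title) for news_id, title in news_titles.items()}
    results = []
    for history, candidates in impressions:
        user_vec = encode_user([news_vecs[n] for n in history])
        results.append([dot(user_vec, news_vecs[c]) for c in candidates])
    return results


# Toy encoders that make the arithmetic easy to follow.
calls = []


def toy_news_encoder(title: Sequence[int]) -> Vector:
    calls.append(tuple(title))
    return [float(sum(title)), float(len(title))]


def toy_user_encoder(vectors: List[Vector]) -> Vector:
    return [sum(col) / len(vectors) for col in zip(*vectors)]


news = {"N1": [1, 2], "N2": [3], "N3": [4, 5, 6]}
logs = [(["N1", "N2"], ["N3", "N1"]), (["N3"], ["N1", "N2"])]
print(fast_score(news, logs, toy_news_encoder, toy_user_encoder))
# [[49.5, 12.0], [51.0, 48.0]]
print(len(calls))  # 3
```

The news encoder runs three times here even though the articles are referenced seven times across the two impressions.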

Key considerations:

Fast evaluation pre-computes news vectors once and reuses them across users
The competition format requires specific output formatting (impression ID, ranked indices)
Model weights can be saved for later inference without retraining
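
The export step can be sketched with the standard library. The sketch assumes that each output line holds an impression ID followed by a bracketed list giving the rank of every candidate in its original order (rank 1 is the highest score), and that the archive contains a single file named `prediction.txt`. Both details are assumptions here and should be checked against the competition instructions.

```python
import io
import zipfile
from typing import List, Sequence


def ranks_from_scores(scores: Sequence[float]) -> List[int]:
    """Position i of the result holds the rank of candidate i (1 = highest score)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def format_prediction_lines(
    impression_ids: Sequence[int], group_scores: Sequence[Sequence[float]]
) -> List[str]:
    lines = []
    for impression_id, scores in zip(impression_ids, group_scores):
        ranks = ",".join(str(r) for r in ranks_from_scores(scores))
        lines.append(f"{impression_id} [{ranks}]")
    return lines


def write_prediction_zip(lines: Sequence[str], target) -> None:
    """Write the lines as prediction.txt inside a ZIP archive (path or file object)."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("prediction.txt", "\n".join(lines) + "\n")


lines = format_prediction_lines([1, 2], [[0.1, 0.9, 0.3], [0.2, 0.5]])
print(lines)  # ['1 [3,1,2]', '2 [2,1]']

buffer = io.BytesIO()  # in-memory target; pass a file path to write to disk
write_prediction_zip(lines, buffer)
```

Note that the list holds ranks rather than sorted candidate indices: in the first impression, the second candidate has the highest score and therefore gets rank 1.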
