Implementation:Recommenders team Recommenders LSTUR Model
| Knowledge Sources | |
|---|---|
| Domains | News Recommendation, Deep Learning, User Modeling |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
The LSTURModel implements the LSTUR (Neural News Recommendation with Long- and Short-term User Representations) model, which combines GRU-based short-term user interests with user embedding-based long-term preferences for news recommendation.
Description
LSTURModel extends BaseModel to implement the LSTUR architecture from An et al. (ACL 2019). The model is built around two core encoders: a news encoder and a user encoder that explicitly models both long-term and short-term user representations.
The news encoder takes the word indices of a news title and passes them through a pretrained word embedding layer (loaded from wordEmb_file), followed by dropout for regularization. A 1D CNN with configurable filter count, window size, and activation then produces contextual word features, which pass through a second dropout. Masking layers (ComputeMasking and OverwriteMasking) handle padding tokens, and AttLayer2 (additive attention) aggregates the sequence into a fixed-dimension news representation vector.
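The final aggregation step can be illustrated with a small NumPy sketch of additive attention pooling over a padded sequence. This is a generic formulation written for illustration, not the AttLayer2 source; the function name `additive_attention_pool`, the parameter names `W`, `b`, `q`, and the toy dimensions are all assumptions.

```python
import numpy as np

def additive_attention_pool(seq, mask, W, b, q):
    """Pool a (length, dim) sequence into a single (dim,) vector.

    seq:  per-token features, e.g. CNN outputs for one title
    mask: (length,) array, 1.0 for real tokens and 0.0 for padding
    W, b, q: hypothetical attention parameters with shapes
             (dim, hidden), (hidden,), (hidden,)
    """
    hidden = np.tanh(seq @ W + b)                 # (length, hidden)
    logits = hidden @ q                           # (length,)
    logits = np.where(mask > 0, logits, -np.inf)  # padding gets zero weight
    weights = np.exp(logits - logits.max())       # stable softmax numerator
    weights = weights / weights.sum()
    return weights @ seq, weights

rng = np.random.default_rng(0)
title_size, filter_num, attention_hidden_dim = 6, 8, 4  # toy sizes
seq = rng.normal(size=(title_size, filter_num))
mask = np.array([1, 1, 1, 1, 0, 0], dtype=float)        # last two tokens are padding
W = rng.normal(size=(filter_num, attention_hidden_dim))
b = np.zeros(attention_hidden_dim)
q = rng.normal(size=attention_hidden_dim)

news_vector, weights = additive_attention_pool(seq, mask, W, b, q)
```

The output has one entry per CNN filter regardless of title length, which is what lets the user encoder treat each clicked article as a fixed-size vector.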
The user encoder is the distinguishing component of LSTUR. It models long-term user preferences through a learnable user embedding layer indexed by user ID, and short-term interests through a GRU (Gated Recurrent Unit) that processes the sequence of recently clicked news articles (encoded via the news encoder with TimeDistributed wrapping). The model supports two modes for combining these representations, controlled by the type hyperparameter:
- ini mode: The user embedding serves as the initial hidden state of the GRU, and the final GRU output becomes the user representation.
- con mode: The GRU processes clicked news independently, and its output is concatenated with the user embedding, then projected through a dense layer to produce the final user representation.
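The difference between the two modes can be sketched in NumPy with a hand-written GRU. This is an illustrative approximation under stated assumptions, not the Keras graph built by the model: biases are omitted, the gate update follows one common GRU convention, and the helper names `gru_last_state` and `user_vector` and all dimensions are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_last_state(inputs, h0, params):
    """Run a bias-free GRU over inputs (steps, dim); return the final state."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    h = h0
    for x in inputs:
        z = sigmoid(x @ Wz + h @ Uz)              # update gate
        r = sigmoid(x @ Wr + h @ Ur)              # reset gate
        h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate state
        h = (1.0 - z) * h + z * h_tilde
    return h

def user_vector(clicked_news, user_emb, params, dense_W, mode="ini"):
    """Combine long-term (user_emb) and short-term (GRU) representations."""
    if mode == "ini":
        # Long-term embedding seeds the GRU; the final state is the user vector.
        return gru_last_state(clicked_news, user_emb, params)
    if mode == "con":
        # GRU starts from zeros; its output is concatenated with the
        # embedding and projected back to gru_unit dimensions.
        short_term = gru_last_state(clicked_news, np.zeros_like(user_emb), params)
        return np.concatenate([short_term, user_emb]) @ dense_W
    raise ValueError(f"unknown mode: {mode!r}")

rng = np.random.default_rng(0)
his_size, news_dim, gru_unit = 5, 4, 4  # toy sizes; equal dims keep dot-product scoring defined
clicked_news = rng.normal(size=(his_size, news_dim))
user_emb = rng.normal(size=gru_unit)
params = tuple(
    rng.normal(scale=0.1, size=shape)
    for shape in [(news_dim, gru_unit), (gru_unit, gru_unit)] * 3
)
dense_W = rng.normal(scale=0.1, size=(2 * gru_unit, gru_unit))

u_ini = user_vector(clicked_news, user_emb, params, dense_W, mode="ini")
u_con = user_vector(clicked_news, user_emb, params, dense_W, mode="con")
```

In both modes the result has gru_unit dimensions. Note that ini mode requires the user embedding to match the GRU state size, since it is used directly as the initial hidden state.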
Click probability is computed as a dot product between candidate news and user representations. During training, softmax is applied over the candidate set (one positive plus npratio negatives). During inference, sigmoid is applied to produce individual scores.
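The two scoring paths can be sketched as follows. The function names `training_probs` and `inference_score` and the vector size are assumptions made for this example; only the dot product, softmax, and sigmoid steps come from the description above.

```python
import numpy as np

def training_probs(candidate_vecs, user_vec):
    """Softmax over dot-product scores of one positive plus npratio negatives."""
    scores = candidate_vecs @ user_vec
    exp = np.exp(scores - scores.max())  # subtract max for numerical stability
    return exp / exp.sum()

def inference_score(candidate_vec, user_vec):
    """Sigmoid of the dot-product score for a single candidate."""
    return 1.0 / (1.0 + np.exp(-(candidate_vec @ user_vec)))

npratio = 4
rng = np.random.default_rng(0)
user_vec = rng.normal(size=8)                    # toy user representation
candidates = rng.normal(size=(npratio + 1, 8))   # toy news representations

probs = training_probs(candidates, user_vec)
single = inference_score(candidates[0], user_vec)
```

Because both softmax and sigmoid are monotonic in the underlying dot product, candidates ranked by the inference score keep the same order as their raw dot-product scores.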
Usage
Use LSTURModel when building a news recommendation system that needs to model both long-term user preferences (stable interests tied to user identity) and short-term browsing patterns (recent click sequences). It suits scenarios where users have established profiles with browsing history, and it works with the standard MINDIterator because it requires only title-level features.
Code Reference
Source Location
- Repository: Recommenders
- File: recommenders/models/newsrec/models/lstur.py
- Lines: 1-212
Signature
class LSTURModel(BaseModel):
    def __init__(self, hparams, iterator_creator, seed=None)
    def _get_input_label_from_iter(self, batch_data)
    def _get_user_feature_from_iter(self, batch_data)
    def _get_news_feature_from_iter(self, batch_data)
    def _build_graph(self)
    def _build_userencoder(self, titleencoder, type="ini")
    def _build_newsencoder(self, embedding_layer)
    def _build_lstur(self)
Import
from recommenders.models.newsrec.models.lstur import LSTURModel
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| hparams | object | Yes | Global hyper-parameters including word_emb_dim, title_size, his_size, filter_num, window_size, cnn_activation, dropout, gru_unit, attention_hidden_dim, type ("ini" or "con"), npratio, and wordEmb_file. |
| iterator_creator | object | Yes | Data iterator creator class (e.g., MINDIterator) for constructing train/test data loaders. |
| seed | int | No | Random seed for reproducibility of weight initialization. |
| batch_data | dict | Yes (at runtime) | Dictionary containing "user_index_batch" (int32), "clicked_title_batch" (int64), "candidate_title_batch" (int64), and "labels" (float32). |
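As a rough illustration of the batch_data contract, the sketch below builds a dummy training batch and splits it into model inputs and labels. The array shapes, the toy sizes, the placement of the positive candidate in the first slot, and the helper name `split_inputs_and_labels` are assumptions for this example; only the dictionary keys and dtypes come from the table above.

```python
import numpy as np

# Toy sizes; real values come from hparams.
batch_size, his_size, title_size, npratio = 2, 50, 30, 4

batch_data = {
    # Assumed shape: one user index per sample
    "user_index_batch": np.zeros((batch_size, 1), dtype=np.int32),
    # Assumed shape: word indices for each clicked title in the history
    "clicked_title_batch": np.zeros((batch_size, his_size, title_size), dtype=np.int64),
    # Assumed shape: one positive plus npratio negative candidate titles
    "candidate_title_batch": np.zeros((batch_size, npratio + 1, title_size), dtype=np.int64),
    # One label per candidate
    "labels": np.zeros((batch_size, npratio + 1), dtype=np.float32),
}
batch_data["labels"][:, 0] = 1.0  # positive in the first slot (a choice made for this sketch)

def split_inputs_and_labels(batch):
    """Hypothetical helper: order features as [user, history, candidates]."""
    inputs = [
        batch["user_index_batch"],
        batch["clicked_title_batch"],
        batch["candidate_title_batch"],
    ]
    return inputs, batch["labels"]

inputs, labels = split_inputs_and_labels(batch_data)
```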
Outputs
| Name | Type | Description |
|---|---|---|
| model | keras.Model | Training model that takes [user_indexes, his_input_title, pred_input_title] and outputs softmax probabilities over candidates. |
| scorer | keras.Model | Inference model that takes [user_indexes, his_input_title, pred_input_title_one] and outputs a sigmoid score for a single candidate. |
| self.newsencoder | keras.Model | The news encoder sub-model, exposed for separate news encoding during inference. |
| self.userencoder | keras.Model | The user encoder sub-model, exposed for separate user encoding during inference. |
Usage Examples
Basic Usage
from recommenders.models.newsrec.models.lstur import LSTURModel
from recommenders.models.newsrec.io.mind_iterator import MINDIterator
from recommenders.models.newsrec.newsrec_utils import prepare_hparams
# Prepare hyperparameters from a YAML config.
# yaml_file and the *_path variables are placeholders for your own file paths.
hparams = prepare_hparams(
    yaml_file,
    wordEmb_file=word_embedding_path,
    wordDict_file=word_dict_path,
    userDict_file=user_dict_path,
    type="ini",  # "ini" uses the user embedding as the GRU initial state; "con" concatenates
    gru_unit=400,
    npratio=4,
    his_size=50,
    batch_size=32,
)
# Create the LSTUR model
model = LSTURModel(hparams, MINDIterator, seed=42)
# Train the model
model.fit(train_news_file, train_behaviors_file, valid_news_file, valid_behaviors_file)
# Evaluate on test data
results = model.run_eval(test_news_file, test_behaviors_file)
print(results) # e.g., {"group_auc": 0.67, "ndcg@5": 0.38, ...}