Heuristic: Recommenders TensorFlow Session Ordering
| Knowledge Sources | |
|---|---|
| Domains | Deep_Learning, Debugging, TensorFlow |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Critical ordering constraint: Keras models must be built after setting the TensorFlow session, or model weights become unavailable in threaded execution.
Description
When using TensorFlow 2.x in `tf.compat.v1` graph execution mode (which all Recommenders TF models use), the Keras backend session must be configured before any model construction. The TF session is created with `GPUOptions(allow_growth=True)` for dynamic GPU memory allocation, then registered as the Keras default session via `tf.compat.v1.keras.backend.set_session()`. Only after this step should `_build_graph()` be called to construct the model. Reversing this order causes model weights to silently become unavailable in multi-threaded contexts.
Usage
Apply this heuristic whenever initializing a TensorFlow-based model in the Recommenders library (NRMS, NAML, LSTUR, NPA, or any DeepRec model). This applies to both the newsrec `BaseModel` and the deeprec `BaseModel`. If you are writing a new TF-based model class, follow the same session-then-build pattern.
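As a minimal sketch of the session-then-build pattern for a new model class (the `MyNewsModel` class, its `_build_graph` body, and the `hparams` dict are illustrative, not part of the Recommenders library):

```python
import tensorflow as tf


class MyNewsModel:
    """Illustrative sketch of a Recommenders-style TF model class.

    Follows the required ordering: configure the session first,
    register it with Keras, and only then build the model.
    """

    def __init__(self, hparams):
        self.hparams = hparams
        tf.compat.v1.disable_eager_execution()  # graph mode, as in Recommenders

        # 1. Create the session with on-demand GPU memory growth.
        gpu_options = tf.compat.v1.GPUOptions(allow_growth=True)
        self.sess = tf.compat.v1.Session(
            config=tf.compat.v1.ConfigProto(gpu_options=gpu_options)
        )

        # 2. Register it as the Keras default session BEFORE any model construction.
        tf.compat.v1.keras.backend.set_session(self.sess)

        # 3. Only now build the graph; the weights bind to the session above.
        self.model = self._build_graph()

    def _build_graph(self):
        # Placeholder architecture; a real model would use self.hparams fully.
        inputs = tf.compat.v1.keras.layers.Input(shape=(self.hparams["dim"],))
        outputs = tf.compat.v1.keras.layers.Dense(1, activation="sigmoid")(inputs)
        return tf.compat.v1.keras.Model(inputs, outputs)
```

Swapping steps 2 and 3 is exactly the bug this heuristic guards against: the model would build in whatever session Keras had defaulted to, and its weights would be unreachable from `self.sess` in worker threads.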
The Insight (Rule of Thumb)
- Action: Always call `tf.compat.v1.keras.backend.set_session(sess)` before calling `self._build_graph()`.
- Value: The session must use `GPUOptions(allow_growth=True)` to avoid pre-allocating all GPU memory.
- Trade-off: None. This is a correctness requirement, not a performance trade-off.
- Failure mode: Building the model before setting the session leaves its weights unreachable from threads once the session is set, leading to silent failures or incorrect predictions.
Reasoning
TensorFlow 1.x-style sessions bind variables to a specific session context. When Keras creates model weights in the default session and a new session is subsequently set, the original weights become unreachable from the new session's thread context. The Recommenders codebase explicitly documents this as an `IMPORTANT` comment in the source code, indicating it was discovered through debugging production failures.
The `allow_growth=True` GPU option is also critical: without it, TensorFlow pre-allocates all available GPU memory, preventing PyTorch models or other processes from using the GPU concurrently.
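For comparison only: in TF2 eager code (not the `tf.compat.v1` path the Recommenders models use), the equivalent on-demand growth setting is applied per device before any GPU operation runs:

```python
import tensorflow as tf

# TF2 eager equivalent of GPUOptions(allow_growth=True): must be called
# before the GPUs are initialized by any op. A no-op on CPU-only machines.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```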
Code evidence from `recommenders/models/newsrec/models/base_model.py:61-72`:
```python
# set GPU use with on demand growth
gpu_options = tf.compat.v1.GPUOptions(allow_growth=True)
sess = tf.compat.v1.Session(
    config=tf.compat.v1.ConfigProto(gpu_options=gpu_options)
)
# set this TensorFlow session as the default session for Keras
tf.compat.v1.keras.backend.set_session(sess)

# IMPORTANT: models have to be loaded AFTER SETTING THE SESSION for keras!
# Otherwise, their weights will be unavailable in the threads after the session there has been set
self.model, self.scorer = self._build_graph()
```
The same pattern is used in `recommenders/models/deeprec/models/base_model.py:69-72` and `recommenders/models/ncf/ncf_singlenode.py:85-89`.