Implementation:Recommenders team Recommenders Sequential Iterator

From Leeroopedia


Knowledge Sources
Domains: Recommendation Systems, Sequential Modeling, Data Loading
Last Updated: 2026-02-10 00:00 GMT

Overview

SequentialIterator is the base data iterator for sequential recommendation models, providing user behavior sequence loading with temporal feature engineering and in-batch negative sampling.

Description

The SequentialIterator class extends BaseIterator and serves as the standard data loading pipeline for all sequential recommendation models in the DeepRec framework, including A2SVD, Caser, GRU, SLI_REC, and SUM.

During initialization, the iterator loads user, item, and category vocabulary dictionaries from pickle files, and creates TensorFlow placeholders for users, items, categories, item/category history sequences, masks, and four temporal features (time, time_diff, time_from_first_action, time_to_now).
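As a rough illustration of the vocabulary step, the sketch below writes a small dictionary to a pickle file, reads it back, and maps raw identifiers to indices. The helper names `load_vocab` and `lookup`, the sample vocabulary, and the fallback to index 0 for unseen identifiers are assumptions made for this example; they are not confirmed by the description above.

```python
import os
import pickle
import tempfile


def load_vocab(path):
    """Read a pickled {raw_id: index} dictionary from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def lookup(vocab, raw_id, unknown_index=0):
    # Assumption: identifiers missing from the vocabulary map to index 0.
    return vocab.get(raw_id, unknown_index)


with tempfile.TemporaryDirectory() as tmp:
    vocab_path = os.path.join(tmp, "user_vocab.pkl")
    with open(vocab_path, "wb") as f:
        pickle.dump({"user_a": 1, "user_b": 2}, f)  # hypothetical vocabulary
    user_vocab = load_vocab(vocab_path)

known = lookup(user_vocab, "user_b")
unseen = lookup(user_vocab, "user_z")
```

The same pattern would apply to the item and category vocabularies, each loaded from its own file.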

The parser_one_line method parses tab-separated lines and computes three log-scaled temporal features: (1) time_diff -- the time difference between consecutive actions normalized by a 24-hour range with a minimum of 0.5; (2) time_from_first_action -- the elapsed time since the first action in the sequence; (3) time_to_now -- the elapsed time from each historical action to the current timestamp. All three features are log-transformed after normalization.

The _convert_data method handles two modes. In training mode (batch_num_ngs > 0), it performs in-batch negative sampling where random items from the same batch serve as negative examples. For each positive instance, batch_num_ngs negative items are sampled, ensuring none match the positive item. In evaluation mode (batch_num_ngs = 0), data is converted directly without negative sampling. History sequences are zero-padded to max_seq_length with a binary mask indicating valid positions.
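The in-batch negative sampling described above can be sketched in plain Python. This is a simplified illustration that only tracks items and labels; the function name `sample_in_batch_negatives`, the `rng` parameter, and the guard against batches with a single distinct item are assumptions added for the example.

```python
import random


def sample_in_batch_negatives(items, batch_num_ngs, rng=random):
    """Expand a batch of positive items with negatives drawn from the same batch."""
    if batch_num_ngs > 0 and len(set(items)) < 2:
        # Assumption: without a second distinct item no valid negative exists.
        raise ValueError("need at least two distinct items in the batch")
    expanded = []  # (item, label) pairs: each positive followed by its negatives
    for positive in items:
        expanded.append((positive, 1))
        drawn = 0
        while drawn < batch_num_ngs:
            candidate = rng.choice(items)
            if candidate == positive:
                continue  # a negative must differ from the positive item
            expanded.append((candidate, 0))
            drawn += 1
    return expanded


batch_items = [11, 42, 7, 99]  # hypothetical item indices
pairs = sample_in_batch_negatives(batch_items, batch_num_ngs=4, rng=random.Random(0))
```

Each positive instance is followed by `batch_num_ngs` negatives, so the expanded batch holds `len(items) * (1 + batch_num_ngs)` rows.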

Parsed data is cached per file in the iter_data dictionary, so repeated calls to load_data_from_file with the same file avoid re-parsing. When negative sampling is active, the data lines are shuffled before each epoch for randomization.
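A minimal sketch of this caching and shuffling behaviour follows. The class `CachedLineLoader`, its `load` method, and the `parse_calls` counter are hypothetical names used only for illustration; only the `iter_data` dictionary and the `batch_num_ngs` parameter come from the description above.

```python
import os
import random
import tempfile


class CachedLineLoader:
    """Hypothetical helper: parse each file once and reuse the result."""

    def __init__(self, parse_line):
        self.parse_line = parse_line
        self.iter_data = {}   # file path -> list of parsed lines
        self.parse_calls = 0  # counts real parses, for illustration only

    def load(self, path, batch_num_ngs=0):
        if path not in self.iter_data:
            self.parse_calls += 1
            with open(path, "r", encoding="utf-8") as f:
                self.iter_data[path] = [
                    self.parse_line(line.rstrip("\n")) for line in f if line.strip()
                ]
        lines = self.iter_data[path]
        if batch_num_ngs > 0:
            random.shuffle(lines)  # reshuffle in place on every training pass
        return lines


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "train_data.tsv")
    with open(path, "w", encoding="utf-8") as f:
        f.write("1\tu1\ti1\n0\tu2\ti2\n")
    loader = CachedLineLoader(lambda line: line.split("\t"))
    first = loader.load(path, batch_num_ngs=4)
    second = loader.load(path, batch_num_ngs=4)
```

Because the cached list is shuffled in place, both calls return the same list object; only its order may change between passes.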

Usage

Use SequentialIterator when training or evaluating sequential recommendation models such as A2SVD, Caser, GRU, SLI_REC, or SUM. It is also the base class for NextItNetIterator, which overrides its data conversion for sequence-level prediction.

Code Reference

Source Location

Signature

class SequentialIterator(BaseIterator):
    def __init__(self, hparams, graph, col_spliter="\t"):

    def parse_file(self, input_file):
        # Returns: list

    def parser_one_line(self, line):
        # Returns: (label, user_id, item_id, item_cate,
        #           item_history_sequence, cate_history_sequence,
        #           current_time, time_diff, time_from_first_action,
        #           time_to_now)

    def load_data_from_file(self, infile, batch_num_ngs=0, min_seq_length=1):
        # Yields: feed_dict or None

    def _convert_data(
        self,
        label_list, user_list, item_list, item_cate_list,
        item_history_batch, item_cate_history_batch,
        time_list, time_diff_list, time_from_first_action_list,
        time_to_now_list, batch_num_ngs,
    ):
        # Returns: dict

    def gen_feed_dict(self, data_dict):
        # Returns: dict

Import

from recommenders.models.deeprec.io.sequential_iterator import SequentialIterator

I/O Contract

Inputs

Name | Type | Required | Description
hparams | object | Yes | Global hyper-parameters with user_vocab, item_vocab, cate_vocab, max_seq_length, and batch_size
graph | tf.Graph | Yes | The TensorFlow graph to which all created placeholders will be added
col_spliter | str | No | Column separator in one line (default: "\t")

Inputs (load_data_from_file)

Name | Type | Required | Description
infile | str | Yes | Text input file path. Each line is a tab-separated instance
batch_num_ngs | int | No | Number of in-batch negative samples per positive instance. 0 means no negative sampling (default: 0)
min_seq_length | int | No | Minimum sequence length to include an instance (default: 1)

Outputs

Name | Type | Description
load_data_from_file() | generator | Yields a feed_dict for each mini-batch, or None if the batch is too small during training
parser_one_line() | tuple | Returns a 10-element tuple: (label, user_id, item_id, item_cate, item_history_sequence, cate_history_sequence, current_time, time_diff, time_from_first_action, time_to_now)
parse_file() | list | Returns a list of all parsed lines from the input file

TensorFlow Placeholders

Placeholder | Shape | Type | Description
labels | [None, 1] | tf.float32 | Ground-truth labels
users | [None] | tf.int32 | User indices
items | [None] | tf.int32 | Target item indices
cates | [None] | tf.int32 | Target item category indices
item_history | [None, max_seq_length] | tf.int32 | Item history sequence, zero-padded
item_cate_history | [None, max_seq_length] | tf.int32 | Category history sequence, zero-padded
mask | [None, max_seq_length] | tf.int32 | Binary mask for valid history positions (1 = valid, 0 = padding)
time | [None] | tf.float32 | Current action timestamp
time_diff | [None, max_seq_length] | tf.float32 | Log-scaled time differences between consecutive actions
time_from_first_action | [None, max_seq_length] | tf.float32 | Log-scaled time from the first action in the sequence
time_to_now | [None, max_seq_length] | tf.float32 | Log-scaled time from each historical action to current time
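To illustrate how a variable-length history could be brought to the fixed width of the history placeholders, the sketch below zero-pads a sequence and builds the matching binary mask. The function name `pad_history` is hypothetical, and two details are assumptions rather than documented behaviour: padding is appended on the right, and sequences longer than max_seq_length keep only their most recent entries.

```python
def pad_history(history, max_seq_length):
    """Return (padded_history, mask), both of length max_seq_length (>= 1)."""
    # Assumption: when the history is too long, keep the most recent entries.
    recent = history[-max_seq_length:]
    padding = max_seq_length - len(recent)
    # Assumption: padding goes on the right; index 0 marks a padded slot.
    padded = recent + [0] * padding
    mask = [1] * len(recent) + [0] * padding
    return padded, mask


short_padded, short_mask = pad_history([5, 6, 7], max_seq_length=5)
long_padded, long_mask = pad_history([1, 2, 3, 4], max_seq_length=2)
```

A model consuming these arrays can multiply by the mask to ignore the padded positions.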

Input Data Format

Each line in the input file is tab-separated with the following fields:

Position | Field | Description
0 | label | Binary label (0 or 1)
1 | user_hash | User identifier (mapped via user_vocab)
2 | item_hash | Target item identifier (mapped via item_vocab)
3 | item_cate | Target item category (mapped via cate_vocab)
4 | operation_time | Current action timestamp (float)
5 | item_history | Comma-separated item history sequence
6 | item_cate_history | Comma-separated category history sequence
7 | time_history | Comma-separated timestamp history sequence
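The field layout above can be illustrated with a small parser that splits one line into named fields. This is a sketch only: the function name `split_instance` and the sample identifiers are hypothetical, and the vocabulary lookup that the iterator applies to users, items, and categories is deliberately left out.

```python
def split_instance(line, col_spliter="\t"):
    """Split one input line into its eight documented fields."""
    fields = line.rstrip("\n").split(col_spliter)
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    return {
        "label": int(fields[0]),
        "user_hash": fields[1],
        "item_hash": fields[2],
        "item_cate": fields[3],
        "operation_time": float(fields[4]),
        "item_history": fields[5].split(","),
        "item_cate_history": fields[6].split(","),
        "time_history": [float(t) for t in fields[7].split(",")],
    }


# Hypothetical sample line; identifiers and timestamps are made up.
sample = "1\tu_42\ti_7\tbooks\t1400000300\ti_1,i_2\tbooks,music\t1400000000,1400000100"
instance = split_instance(sample)
```

The three history fields are expected to have the same number of entries, one per past action.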

Temporal Feature Engineering

The iterator computes three temporal features from the raw timestamp history:

  1. time_diff: For each pair of consecutive actions, compute (t[i+1] - t[i]) / 86400, clamp to minimum 0.5, then apply log transform
  2. time_from_first_action: For each action, compute (t[i] - t[0]) / 86400, clamp to minimum 0.5, then apply log transform
  3. time_to_now: For each historical action, compute (t_current - t[i]) / 86400, clamp to minimum 0.5, then apply log transform

All three features are normalized by a 24-hour range (86400 seconds), floored at 0.5 so that the logarithm is never taken of zero or a negative value, and then log-transformed.
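The three computations can be sketched with the standard library alone. The names `log_days` and `temporal_features` are hypothetical, and one detail is an assumption: the current action is treated as the final element of the action sequence, so that each feature has one value per history position.

```python
import math

SECONDS_PER_DAY = 86400.0  # 24-hour normalization range
FLOOR = 0.5                # minimum normalized value, keeps the log finite


def log_days(seconds):
    """Normalize a duration to days, apply the floor, and take the log."""
    return math.log(max(seconds / SECONDS_PER_DAY, FLOOR))


def temporal_features(time_history, current_time):
    # Assumption: the current action closes the sequence of actions.
    actions = list(time_history) + [current_time]
    time_diff = [log_days(b - a) for a, b in zip(actions, actions[1:])]
    time_from_first_action = [log_days(t - actions[0]) for t in actions[1:]]
    time_to_now = [log_days(current_time - t) for t in time_history]
    return time_diff, time_from_first_action, time_to_now


# Hypothetical history at day 0, day 1, and day 3; current action at day 4.
history = [0.0, 86400.0, 259200.0]
time_diff, time_from_first, time_to_now = temporal_features(history, 345600.0)
```

With this history the consecutive gaps are 1, 2, and 1 days, so time_diff is [0, log 2, 0]; a gap of zero seconds would be floored to 0.5 days and yield log 0.5.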

Usage Examples

Basic Usage

import tensorflow as tf
from recommenders.models.deeprec.io.sequential_iterator import SequentialIterator

# hparams is assumed to have been built beforehand; it must include:
#   hparams.user_vocab = "user_vocab.pkl"
#   hparams.item_vocab = "item_vocab.pkl"
#   hparams.cate_vocab = "cate_vocab.pkl"
#   hparams.max_seq_length = 50
#   hparams.batch_size = 128

graph = tf.Graph()
iterator = SequentialIterator(hparams, graph)

# Training with in-batch negative sampling (4 negatives per positive)
train_file = "train_data.tsv"
for batch_input in iterator.load_data_from_file(train_file, batch_num_ngs=4):
    if batch_input is not None:
        # batch_input is a feed_dict ready for sess.run()
        # Effective batch size is batch_size * (1 + batch_num_ngs)
        pass

# Evaluation without negative sampling
test_file = "test_data.tsv"
for batch_input in iterator.load_data_from_file(test_file, batch_num_ngs=0):
    if batch_input is not None:
        # batch_input contains labels, items, and history for evaluation
        pass

Filtering Short Sequences

import tensorflow as tf
from recommenders.models.deeprec.io.sequential_iterator import SequentialIterator

# hparams is assumed to be configured as in Basic Usage
graph = tf.Graph()
iterator = SequentialIterator(hparams, graph)

# Only include instances whose history contains at least 5 interactions
train_file = "train_data.tsv"
for batch_input in iterator.load_data_from_file(
    train_file, batch_num_ngs=4, min_seq_length=5
):
    if batch_input is not None:
        pass
