Implementation:Recommenders team Recommenders Sequential Iterator
| Knowledge Sources | |
|---|---|
| Domains | Recommendation Systems, Sequential Modeling, Data Loading |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
SequentialIterator is the base data iterator for sequential recommendation models, providing user behavior sequence loading with temporal feature engineering and in-batch negative sampling.
Description
The SequentialIterator class extends BaseIterator and serves as the standard data loading pipeline for all sequential recommendation models in the DeepRec framework, including A2SVD, Caser, GRU, SLI_REC, and SUM.
During initialization, the iterator loads user, item, and category vocabulary dictionaries from pickle files, and creates TensorFlow placeholders for users, items, categories, item/category history sequences, masks, and four temporal features (time, time_diff, time_from_first_action, time_to_now).
The parser_one_line method parses one tab-separated line and computes three log-scaled temporal features: (1) time_diff, the gap between consecutive actions; (2) time_from_first_action, the elapsed time since the first action in the sequence; (3) time_to_now, the elapsed time from each historical action to the current timestamp. Each value is divided by a 24-hour range (86400 seconds), floored at 0.5, and then log-transformed.
The _convert_data method handles two modes. In training mode (batch_num_ngs > 0), it performs in-batch negative sampling where random items from the same batch serve as negative examples. For each positive instance, batch_num_ngs negative items are sampled, ensuring none match the positive item. In evaluation mode (batch_num_ngs = 0), data is converted directly without negative sampling. History sequences are zero-padded to max_seq_length with a binary mask indicating valid positions.
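The training-mode expansion can be illustrated with a minimal sketch. This is not the library's code: the function name sample_in_batch_negatives and the (item, label) output layout are hypothetical, and the sketch covers only the item and label columns, whereas the iterator also carries user, category, history, and time fields.

```python
import random


def sample_in_batch_negatives(items, num_negatives, rng=None):
    """Expand each positive item with negatives drawn from the same batch.

    Hypothetical helper for illustration; returns a list of (item, label) pairs.
    """
    rng = rng or random.Random()
    expanded = []
    for positive in items:
        # Candidates are the batch items that differ from the positive item.
        candidates = [item for item in items if item != positive]
        if not candidates:
            raise ValueError("batch needs at least two distinct items")
        expanded.append((positive, 1))
        for _ in range(num_negatives):
            expanded.append((rng.choice(candidates), 0))
    return expanded


batch_items = [11, 42, 7, 93, 58]
pairs = sample_in_batch_negatives(batch_items, num_negatives=4, rng=random.Random(0))
# Each positive is followed by its 4 negatives, giving 5 * (1 + 4) = 25 rows.
```

Filtering the candidates up front, rather than resampling until a non-matching item appears, is a design choice of this sketch; it guarantees termination when a batch contains many duplicates of the same item.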
Parsed data is cached per file in the iter_data dictionary, so repeated calls to load_data_from_file with the same file avoid re-parsing. When negative sampling is active (batch_num_ngs > 0), the data lines are shuffled before each epoch.
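The caching behavior described above might look roughly like the following sketch. The LineCache class and its get_lines method are hypothetical names used for illustration; only the iter_data dictionary name comes from the description, and shuffling the cached list in place is an assumption.

```python
import random


class LineCache:
    """Hypothetical stand-in for the iterator's per-file parse cache."""

    def __init__(self, parse_fn):
        self.parse_fn = parse_fn
        self.iter_data = {}  # maps file path -> list of parsed lines

    def get_lines(self, infile, batch_num_ngs=0):
        # Parse the file only the first time it is requested.
        if infile not in self.iter_data:
            self.iter_data[infile] = self.parse_fn(infile)
        lines = self.iter_data[infile]
        # Shuffle only when negative sampling is active (assumed in place).
        if batch_num_ngs > 0:
            random.shuffle(lines)
        return lines


parse_calls = []


def fake_parse(path):
    # Stand-in parser that records how often it is invoked.
    parse_calls.append(path)
    return [1, 2, 3, 4]


cache = LineCache(fake_parse)
first = cache.get_lines("train_data.tsv", batch_num_ngs=4)
second = cache.get_lines("train_data.tsv", batch_num_ngs=4)
```

In this sketch the second call reuses the cached list, so fake_parse runs once even though the order of the lines may differ between calls.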
Usage
Use SequentialIterator when training or evaluating sequential recommendation models such as A2SVD, Caser, GRU, SLI_REC, or SUM. It is also the base class for NextItNetIterator, which overrides its data conversion for sequence-level prediction.
Code Reference
Source Location
- Repository: Recommenders
- File: recommenders/models/deeprec/io/sequential_iterator.py
- Lines: 1-476
Signature
class SequentialIterator(BaseIterator):
    def __init__(self, hparams, graph, col_spliter="\t"):
    def parse_file(self, input_file):
        # Returns: list
    def parser_one_line(self, line):
        # Returns: (label, user_id, item_id, item_cate,
        #           item_history_sequence, cate_history_sequence,
        #           current_time, time_diff, time_from_first_action,
        #           time_to_now)
    def load_data_from_file(self, infile, batch_num_ngs=0, min_seq_length=1):
        # Yields: feed_dict or None
    def _convert_data(
        self,
        label_list, user_list, item_list, item_cate_list,
        item_history_batch, item_cate_history_batch,
        time_list, time_diff_list, time_from_first_action_list,
        time_to_now_list, batch_num_ngs,
    ):
        # Returns: dict
    def gen_feed_dict(self, data_dict):
        # Returns: dict
Import
from recommenders.models.deeprec.io.sequential_iterator import SequentialIterator
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| hparams | object | Yes | Global hyper-parameters with user_vocab, item_vocab, cate_vocab, max_seq_length, and batch_size |
| graph | tf.Graph | Yes | The TensorFlow graph to which all created placeholders will be added |
| col_spliter | str | No | Column separator in one line (default: "\t") |
Inputs (load_data_from_file)
| Name | Type | Required | Description |
|---|---|---|---|
| infile | str | Yes | Text input file path. Each line is a tab-separated instance |
| batch_num_ngs | int | No | Number of negative samples per positive instance in-batch. 0 means no negative sampling (default: 0) |
| min_seq_length | int | No | Minimum sequence length to include an instance (default: 1) |
Outputs
| Name | Type | Description |
|---|---|---|
| load_data_from_file() | generator | Yields feed_dict objects for each mini-batch, or None if batch is too small during training |
| parser_one_line() | tuple | Returns 10-element tuple: (label, user_id, item_id, item_cate, item_history_sequence, cate_history_sequence, current_time, time_diff, time_from_first_action, time_to_now) |
| parse_file() | list | Returns a list of all parsed lines from the input file |
TensorFlow Placeholders
| Placeholder | Shape | Type | Description |
|---|---|---|---|
| labels | [None, 1] | tf.float32 | Ground-truth labels |
| users | [None] | tf.int32 | User indices |
| items | [None] | tf.int32 | Target item indices |
| cates | [None] | tf.int32 | Target item category indices |
| item_history | [None, max_seq_length] | tf.int32 | Item history sequence, zero-padded |
| item_cate_history | [None, max_seq_length] | tf.int32 | Category history sequence, zero-padded |
| mask | [None, max_seq_length] | tf.int32 | Binary mask for valid history positions (1 = valid, 0 = padding) |
| time | [None] | tf.float32 | Current action timestamp |
| time_diff | [None, max_seq_length] | tf.float32 | Log-scaled time differences between consecutive actions |
| time_from_first_action | [None, max_seq_length] | tf.float32 | Log-scaled time from the first action in the sequence |
| time_to_now | [None, max_seq_length] | tf.float32 | Log-scaled time from each historical action to current time |
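The relationship between item_history and mask can be sketched as follows. This is an illustration under stated assumptions, not the iterator's implementation: the helper name pad_with_mask is hypothetical, and both the left-aligned padding and the truncation to the most recent max_seq_length items are assumptions.

```python
def pad_with_mask(history, max_seq_length):
    """Zero-pad a history to max_seq_length and build its binary mask.

    Hypothetical helper; assumes valid items come first and zeros follow,
    and that over-long histories keep only their most recent items.
    """
    if max_seq_length <= 0:
        raise ValueError("max_seq_length must be positive")
    recent = history[-max_seq_length:]
    pad = max_seq_length - len(recent)
    padded = recent + [0] * pad
    mask = [1] * len(recent) + [0] * pad
    return padded, mask


short_padded, short_mask = pad_with_mask([3, 7], max_seq_length=4)
# short_padded == [3, 7, 0, 0], short_mask == [1, 1, 0, 0]
long_padded, long_mask = pad_with_mask([1, 2, 3, 4, 5], max_seq_length=3)
# long_padded == [3, 4, 5], long_mask == [1, 1, 1]
```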
Input Data Format
Each line in the input file is tab-separated with the following fields:
| Position | Field | Description |
|---|---|---|
| 0 | label | Binary label (0 or 1) |
| 1 | user_hash | User identifier (mapped via user_vocab) |
| 2 | item_hash | Target item identifier (mapped via item_vocab) |
| 3 | item_cate | Target item category (mapped via cate_vocab) |
| 4 | operation_time | Current action timestamp (float) |
| 5 | item_history | Comma-separated item history sequence |
| 6 | item_cate_history | Comma-separated category history sequence |
| 7 | time_history | Comma-separated timestamp history sequence |
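As a rough illustration of the field layout above, the sketch below splits one line into its eight fields. The function name parse_fields, the returned dictionary keys, and the fallback of 0 for identifiers missing from a vocabulary are assumptions made for this example; the sample line and vocabularies are made up.

```python
def parse_fields(line, user_vocab, item_vocab, cate_vocab, col_spliter="\t"):
    """Split one input line into typed fields (hypothetical helper)."""
    words = line.rstrip("\n").split(col_spliter)
    return {
        "label": int(words[0]),
        # Unknown identifiers fall back to index 0 (an assumption).
        "user_id": user_vocab.get(words[1], 0),
        "item_id": item_vocab.get(words[2], 0),
        "item_cate": cate_vocab.get(words[3], 0),
        "current_time": float(words[4]),
        "item_history": [item_vocab.get(i, 0) for i in words[5].split(",")],
        "cate_history": [cate_vocab.get(c, 0) for c in words[6].split(",")],
        "time_history": [float(t) for t in words[7].split(",")],
    }


# Made-up vocabularies and input line, for illustration only.
user_vocab = {"u1": 1}
item_vocab = {"i1": 1, "i2": 2, "i9": 9}
cate_vocab = {"c1": 1, "c2": 2}
sample_line = "1\tu1\ti9\tc2\t1000.0\ti1,i2\tc1,c2\t100.0,500.0\n"
fields = parse_fields(sample_line, user_vocab, item_vocab, cate_vocab)
```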
Temporal Feature Engineering
The iterator computes three temporal features from the raw timestamp history:
- time_diff: For each pair of consecutive actions, compute (t[i+1] - t[i]) / 86400, clamp to minimum 0.5, then apply log transform
- time_from_first_action: For each action, compute (t[i] - t[0]) / 86400, clamp to minimum 0.5, then apply log transform
- time_to_now: For each historical action, compute (t_current - t[i]) / 86400, clamp to minimum 0.5, then apply log transform
All three features are divided by a 24-hour range (86400 seconds), floored at 0.5 so that the logarithm is never taken of zero or a negative value, and then log-transformed.
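The three formulas can be written out as a short sketch. It follows the formulas as stated here and nothing more: the function names are hypothetical, and how the iterator aligns these lists with the current action or pads them to max_seq_length is not shown.

```python
import math

SECONDS_PER_DAY = 86400.0
FLOOR_DAYS = 0.5


def log_days(seconds):
    """Convert seconds to days, floor at 0.5, then take the natural log."""
    return math.log(max(seconds / SECONDS_PER_DAY, FLOOR_DAYS))


def temporal_features(time_history, current_time):
    """Compute the three features from raw timestamps (hypothetical helper)."""
    # Gap between each pair of consecutive actions.
    time_diff = [
        log_days(later - earlier)
        for earlier, later in zip(time_history, time_history[1:])
    ]
    # Elapsed time since the first action, for each action.
    time_from_first_action = [log_days(t - time_history[0]) for t in time_history]
    # Elapsed time from each historical action to the current timestamp.
    time_to_now = [log_days(current_time - t) for t in time_history]
    return time_diff, time_from_first_action, time_to_now


# Actions at day 0, day 1, and day 3; the current action is at day 4.
history = [0.0, 86400.0, 259200.0]
diff, from_first, to_now = temporal_features(history, current_time=345600.0)
# diff       -> [log(1), log(2)]
# from_first -> [log(0.5), log(1), log(3)]  (the zero gap is floored to 0.5)
# to_now     -> [log(4), log(3), log(1)]
```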
Usage Examples
Basic Usage
import tensorflow as tf
from recommenders.models.deeprec.io.sequential_iterator import SequentialIterator
# hparams must include:
#   hparams.user_vocab = "user_vocab.pkl"
#   hparams.item_vocab = "item_vocab.pkl"
#   hparams.cate_vocab = "cate_vocab.pkl"
#   hparams.max_seq_length = 50
#   hparams.batch_size = 128
graph = tf.Graph()
iterator = SequentialIterator(hparams, graph)
# Training with in-batch negative sampling (4 negatives per positive)
train_file = "train_data.tsv"
for batch_input in iterator.load_data_from_file(train_file, batch_num_ngs=4):
    if batch_input is not None:
        # batch_input is a feed_dict ready for sess.run()
        # Effective batch size is batch_size * (1 + batch_num_ngs)
        pass
# Evaluation without negative sampling
test_file = "test_data.tsv"
for batch_input in iterator.load_data_from_file(test_file, batch_num_ngs=0):
    if batch_input is not None:
        # batch_input contains labels, items, and history for evaluation
        pass
Filtering Short Sequences
import tensorflow as tf
from recommenders.models.deeprec.io.sequential_iterator import SequentialIterator
# hparams is configured as in the basic usage example
graph = tf.Graph()
iterator = SequentialIterator(hparams, graph)
# Only include instances whose history sequence has at least 5 entries
train_file = "train_data.tsv"
for batch_input in iterator.load_data_from_file(
    train_file, batch_num_ngs=4, min_seq_length=5
):
    if batch_input is not None:
        # Skip the None yielded when a training batch is too small
        pass