
Implementation:Gretelai Gretel synthetics Build Model

From Leeroopedia
Knowledge Sources
Domains Synthetic_Data, Deep_Learning, Recurrent_Neural_Networks
Last Updated 2026-02-14 19:00 GMT

Overview

Concrete tool for constructing the LSTM-based Keras Sequential model used by the gretel-synthetics library for text generation.

Description

The build_model function is the central dispatcher for constructing the Keras Sequential model used for text generation. It inspects the dp flag on the provided configuration and delegates to one of two builder functions:

  • build_default_model: Constructs a standard LSTM model with an RMSprop optimizer. The architecture consists of Embedding, three Dropout layers, two stacked stateful LSTM layers, and a Dense output layer.
  • build_dp_model: Constructs the same architecture but wraps the RMSprop optimizer with TensorFlow Privacy's make_keras_optimizer_class to enable differentially private training with per-example gradient clipping and noise injection. It also patches the Keras LSTM code paths for TF 2.4+ compatibility.
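As a sketch of how the DP path is selected: setting dp=True on the configuration routes build_model to build_dp_model. The parameter names below come from the signatures in this page; the values are illustrative only, not recommended defaults.

```python
from gretel_synthetics.config import TensorFlowConfig

# Illustrative values only; tune for your dataset and privacy budget
config = TensorFlowConfig(
    input_data_path="/path/to/data.txt",
    checkpoint_dir="/path/to/model",
    dp=True,                  # dispatches build_model to build_dp_model
    dp_l2_norm_clip=1.0,      # per-example gradient clipping bound
    dp_noise_multiplier=1.1,  # Gaussian noise scale
    dp_microbatches=1,        # microbatch count for per-example gradients
)
```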

Both builders compile the model with sparse categorical cross-entropy loss (from logits) and accuracy metrics. After building, the model summary is logged for inspection.

Usage

The build_model function is called internally during training (by train_rnn) and during model loading for inference (by load_model via _prepare_model). Users typically do not call it directly; instead, they use the train() facade or generate_text(), which handle model construction automatically.

Code Reference

Source Location

  • Repository: gretel-synthetics
  • Files:
    • src/gretel_synthetics/tensorflow/model.py (L25--39): build_model dispatcher
    • src/gretel_synthetics/tensorflow/default_model.py (L21--61): build_default_model
    • src/gretel_synthetics/tensorflow/dp_model.py (L37--102): build_dp_model

Signature

build_model (dispatcher):

def build_model(
    vocab_size: int, batch_size: int, store: BaseConfig
) -> tf.keras.Sequential:
    """
    Utilizing tf.keras.Sequential model
    """
    model = None

    if store.dp:
        model = build_dp_model(store, batch_size, vocab_size)
    else:
        model = build_default_model(store, batch_size, vocab_size)

    _print_model_summary(model)
    return model

build_default_model:

def build_default_model(store, batch_size, vocab_size) -> tf.keras.Sequential:
    optimizer = RMSprop(learning_rate=store.learning_rate)

    model = tf.keras.Sequential(
        [
            tf.keras.layers.Embedding(
                vocab_size, store.embedding_dim, batch_input_shape=[batch_size, None]
            ),
            tf.keras.layers.Dropout(store.dropout_rate),
            tf.keras.layers.LSTM(
                store.rnn_units,
                return_sequences=True,
                stateful=True,
                recurrent_initializer=store.rnn_initializer,
            ),
            tf.keras.layers.Dropout(store.dropout_rate),
            tf.keras.layers.LSTM(
                store.rnn_units,
                return_sequences=True,
                stateful=True,
                recurrent_initializer=store.rnn_initializer,
            ),
            tf.keras.layers.Dropout(store.dropout_rate),
            tf.keras.layers.Dense(vocab_size),
        ]
    )

    # `loss` is a module-level sparse categorical cross-entropy
    # (from_logits=True) helper defined alongside this builder
    model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])
    return model

build_dp_model:

def build_dp_model(store, batch_size, vocab_size) -> tf.keras.Sequential:
    optimizer = make_keras_optimizer_class(RMSprop)(
        l2_norm_clip=store.dp_l2_norm_clip,
        noise_multiplier=store.dp_noise_multiplier,
        num_microbatches=store.dp_microbatches,
        learning_rate=store.learning_rate,
    )
    # ... same layer architecture as build_default_model ...
    model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])
    return model
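Conceptually, the wrapped optimizer clips each per-example gradient to dp_l2_norm_clip and adds Gaussian noise scaled by dp_noise_multiplier before averaging. A framework-free NumPy sketch of that step (an illustration of the idea, not the tensorflow-privacy implementation):

```python
import numpy as np

def dp_average_gradient(per_example_grads, l2_norm_clip, noise_multiplier, rng):
    """Clip each per-example gradient to l2_norm_clip, sum the clipped
    gradients, add Gaussian noise, then average over the batch."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        scale = min(1.0, l2_norm_clip / (norm + 1e-12))  # shrink only if too large
        clipped.append(g * scale)
    total = np.sum(clipped, axis=0)
    # Noise standard deviation is noise_multiplier * clipping bound
    noise = rng.normal(0.0, noise_multiplier * l2_norm_clip, size=total.shape)
    return (total + noise) / len(per_example_grads)

rng = np.random.default_rng(0)
grads = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]  # norms 5.0 and 0.5
avg = dp_average_gradient(grads, l2_norm_clip=1.0, noise_multiplier=0.0, rng=rng)
# First gradient is clipped to unit norm -> [0.6, 0.8]; second is unchanged
print(avg)  # -> [0.45 0.6 ]
```

With noise_multiplier=0 this reduces to plain clipped-gradient averaging; real differentially private training requires a positive multiplier calibrated to the privacy budget.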

Import

from gretel_synthetics.tensorflow.model import build_model

I/O Contract

Inputs

  • vocab_size (int, required): Size of the token vocabulary; determines the embedding input and dense output dimensions
  • batch_size (int, required): Number of samples per batch; determines the stateful LSTM batch dimension
  • store (BaseConfig, required): Configuration object providing embedding_dim, rnn_units, dropout_rate, rnn_initializer, learning_rate, the dp flag, and DP-specific parameters

Outputs

  • model (tf.keras.Sequential): A compiled Keras Sequential model ready for training or weight loading
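The inputs above fix every layer's parameter count. A framework-free sketch of the arithmetic (the sizes used below are hypothetical, not library defaults):

```python
def lstm_param_count(input_dim, units):
    # Each of the 4 LSTM gates has an input kernel (input_dim x units),
    # a recurrent kernel (units x units), and a bias vector (units)
    return 4 * (units * (input_dim + units) + units)

def model_param_count(vocab_size, embedding_dim, rnn_units):
    embedding = vocab_size * embedding_dim          # lookup table
    lstm1 = lstm_param_count(embedding_dim, rnn_units)
    lstm2 = lstm_param_count(rnn_units, rnn_units)  # stacked on the first LSTM
    dense = rnn_units * vocab_size + vocab_size     # weights + bias
    # Dropout layers contribute no trainable parameters
    return embedding + lstm1 + lstm2 + dense

# Hypothetical sizes for illustration
print(model_param_count(vocab_size=5000, embedding_dim=256, rnn_units=256))
# -> 3615624
```

Note that batch_size does not affect the parameter count; for a stateful model it only fixes the batch dimension of the recurrent state, which is why it must match between building and training.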

Usage Examples

Basic Example

from gretel_synthetics.tensorflow.model import build_model
from gretel_synthetics.config import TensorFlowConfig

config = TensorFlowConfig(
    input_data_path="/path/to/data.txt",
    checkpoint_dir="/path/to/model",
)

# Build model with vocabulary of 5000 tokens, batch size 64
model = build_model(vocab_size=5000, batch_size=64, store=config)
model.summary()

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
