Implementation:Gretelai Gretel synthetics DGAN Generate Numpy
| Knowledge Sources | |
|---|---|
| Domains | Synthetic_Data, Time_Series, GAN |
| Last Updated | 2026-02-14 19:00 GMT |
Overview
Concrete tool for generating synthetic time series data from a trained DGAN model provided by the gretel-synthetics library.
Description
The DGAN.generate_numpy() method produces synthetic data by sampling noise, running the Generator in evaluation mode, and inverse-transforming the outputs back to the original data space. It generates data in batches of config.batch_size, concatenates results, and truncates to the requested count n.
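The batch-concatenate-truncate pattern described above can be sketched as follows. This is a minimal illustration and not the library's code: `generate_in_batches` and `fake_generator` are hypothetical names, and the stand-in generator simply returns random arrays of a fixed shape.

```python
import numpy as np


def generate_in_batches(generate_batch, n, batch_size):
    """Call generate_batch(batch_size) until at least n rows exist, then truncate.

    generate_batch is a hypothetical stand-in for one generator forward pass;
    it must return an array whose first dimension equals batch_size.
    """
    num_batches = -(-n // batch_size)  # ceiling division
    batches = [generate_batch(batch_size) for _ in range(num_batches)]
    # Concatenate along the example axis and drop the surplus rows.
    return np.concatenate(batches, axis=0)[:n]


rng = np.random.default_rng(0)


def fake_generator(size):
    # Assumed output shape: (size, max_sequence_len=20, num_features=2)
    return rng.normal(size=(size, 20, 2))


samples = generate_in_batches(fake_generator, n=2500, batch_size=1000)
print(samples.shape)  # (2500, 20, 2)
```

With n=2500 and a batch size of 1000, three batches are produced (3000 rows) and the last 500 rows are discarded.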
Internally, _generate() puts the model in eval mode, calls self.generator(attribute_noise, feature_noise), and converts each output tensor with .cpu().detach().numpy(), returning the results as a NumpyArrayTriple.
The inverse transformation pipeline then applies inverse_transform_attributes() to decode attributes from their internal representation, and inverse_transform_features() to decode features. For continuous features with per-example scaling, the additional attributes (midpoint, half-range) are used to reverse the per-example normalization before global inverse scaling. For discrete features, one-hot encoding is inverted via argmax and binary encoding is inverted via thresholding at 0.5.
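The decoding steps above can be illustrated with plain NumPy. This is a sketch under stated assumptions, not the code in transformations.py: all function names are hypothetical, global scaling is assumed to be min-max to [-1, 1], and the binary case shows only the thresholding of bits (mapping bit patterns back to category values is omitted).

```python
import numpy as np


def undo_per_example_scaling(scaled, midpoint, half_range):
    # scaled: (n, seq_len); midpoint and half_range: (n,), one value per example.
    return scaled * half_range[:, None] + midpoint[:, None]


def undo_global_minmax(x, global_min, global_max):
    # Assumption: the global scaling mapped [global_min, global_max] to [-1, 1].
    return (x + 1.0) / 2.0 * (global_max - global_min) + global_min


def decode_one_hot(probs):
    # Pick the index of the largest value along the last axis.
    return np.argmax(probs, axis=-1)


def threshold_bits(probs):
    # Values above 0.5 become 1, everything else 0.
    return (probs > 0.5).astype(int)


scaled = np.array([[-1.0, 0.0, 1.0]])
per_example = undo_per_example_scaling(
    scaled, midpoint=np.array([0.0]), half_range=np.array([0.5])
)
original = undo_global_minmax(per_example, global_min=10.0, global_max=20.0)
print(original)  # [[12.5 15.  17.5]]

categories = decode_one_hot(np.array([[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]))
bits = threshold_bits(np.array([0.2, 0.9, 0.4]))
```

The order matters for continuous features: the per-example step is undone first, using the generated midpoint and half-range, and only then is the global scaling reversed.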
The companion generate_dataframe() method calls generate_numpy() and then uses the stored data_frame_converter to reconstruct a DataFrame matching the original training format.
Usage
Call generate_numpy() after training. Pass either n (the number of examples to sample) or both attribute_noise and feature_noise for controlled generation.
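The "either n or both noise tensors" rule can be expressed as a small check. The helper below is hypothetical and exists only to illustrate the contract; in particular, letting n take precedence when everything is supplied is an assumption of this sketch, not a documented behavior.

```python
def resolve_generation_mode(n=None, attribute_noise=None, feature_noise=None):
    """Return "sample" when n is given, "noise" when both noise inputs are given.

    Hypothetical helper; raises ValueError when neither form is complete.
    """
    if n is not None:
        # Assumption of this sketch: n wins if noise is also supplied.
        return "sample"
    if attribute_noise is not None and feature_noise is not None:
        return "noise"
    raise ValueError(
        "Provide either n or both attribute_noise and feature_noise"
    )


print(resolve_generation_mode(n=100))
print(resolve_generation_mode(attribute_noise=[0.0], feature_noise=[0.0]))
```

Supplying only one of the two noise inputs is treated as an error here, because the generator needs both to produce an example.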
Code Reference
Source Location
- Repository: gretel-synthetics
- File: src/gretel_synthetics/timeseries_dgan/dgan.py
  - Lines: 538-638 (generate_numpy), 640-663 (generate_dataframe), 958-981 (_generate)
- File: src/gretel_synthetics/timeseries_dgan/transformations.py
  - Lines: 712-735 (inverse_transform_attributes), 738-812 (inverse_transform_features)
Signature
def generate_numpy(
    self,
    n: Optional[int] = None,
    attribute_noise: Optional[torch.Tensor] = None,
    feature_noise: Optional[torch.Tensor] = None,
) -> AttributeFeaturePair:

def generate_dataframe(
    self,
    n: Optional[int] = None,
    attribute_noise: Optional[torch.Tensor] = None,
    feature_noise: Optional[torch.Tensor] = None,
) -> pd.DataFrame:

def _generate(
    self, attribute_noise: torch.Tensor, feature_noise: torch.Tensor
) -> NumpyArrayTriple:
Import
from gretel_synthetics.timeseries_dgan.dgan import DGAN
I/O Contract
Inputs (generate_numpy)
| Name | Type | Required | Description |
|---|---|---|---|
| n | Optional[int] | No | Number of synthetic examples to generate; provide either n or both noise tensors |
| attribute_noise | Optional[torch.Tensor] | No | Custom attribute noise tensor of shape (n, attribute_noise_dim) |
| feature_noise | Optional[torch.Tensor] | No | Custom feature noise tensor of shape (n, max_sequence_len // sample_len, feature_noise_dim) |
Inputs (generate_dataframe)
| Name | Type | Required | Description |
|---|---|---|---|
| n | Optional[int] | No | Number of synthetic examples to generate |
| attribute_noise | Optional[torch.Tensor] | No | Custom attribute noise tensor |
| feature_noise | Optional[torch.Tensor] | No | Custom feature noise tensor |
Outputs (generate_numpy)
| Name | Type | Description |
|---|---|---|
| attributes | Optional[np.ndarray] | 2D array of shape (n, num_attributes) in original data space; None if model has no attributes |
| features | list[np.ndarray] | List of n 2D arrays, each of shape (max_sequence_len, num_features) in original data space |
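Because features come back as a list of 2D arrays, downstream code often wants a single 3D array. A minimal sketch, assuming every example has the same sequence length; the `synthetic_features` list below is a stand-in built from zeros, not real model output.

```python
import numpy as np

# Stand-in for the features list returned by generate_numpy():
# 4 examples, each of shape (max_sequence_len=20, num_features=2).
synthetic_features = [np.zeros((20, 2)) for _ in range(4)]

# Stack only when all sequences share one length; otherwise keep the list.
lengths = {f.shape[0] for f in synthetic_features}
if len(lengths) == 1:
    stacked = np.stack(synthetic_features, axis=0)
else:
    stacked = None  # ragged sequences cannot be stacked directly

print(None if stacked is None else stacked.shape)  # (4, 20, 2)
```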
Outputs (generate_dataframe)
| Name | Type | Description |
|---|---|---|
| df | pd.DataFrame | DataFrame in the same format as the training DataFrame, with attribute columns, feature columns, and optionally example_id and time columns |
Usage Examples
Basic Example
import numpy as np
from gretel_synthetics.timeseries_dgan.dgan import DGAN
from gretel_synthetics.timeseries_dgan.config import DGANConfig
# Train a small model on random data (for illustration only)
config = DGANConfig(max_sequence_len=20, sample_len=5, batch_size=1000, epochs=10)
model = DGAN(config)
attributes = np.random.rand(10000, 3)
features = np.random.rand(10000, 20, 2)
model.train_numpy(attributes=attributes, features=features)
# Generate 1000 synthetic examples
synthetic_attributes, synthetic_features = model.generate_numpy(1000)
# synthetic_attributes.shape == (1000, 3)
# len(synthetic_features) == 1000
# synthetic_features[0].shape == (20, 2)
DataFrame Generation Example
# After training with train_dataframe
synthetic_df = model.generate_dataframe(500)
Custom Noise Example
import torch
# Generate with specific noise for reproducibility
# (reuses config and model from the basic example)
torch.manual_seed(42)
attr_noise = torch.randn(100, config.attribute_noise_dim)
feat_noise = torch.randn(
    100,
    config.max_sequence_len // config.sample_len,
    config.feature_noise_dim,
)
synthetic_attributes, synthetic_features = model.generate_numpy(
    attribute_noise=attr_noise,
    feature_noise=feat_noise,
)