Implementation:Sdv dev SDV CTGANSynthesizer Init
| Knowledge Sources | |
|---|---|
| Domains | Deep_Learning, Synthetic_Data |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Concrete tool for creating a CTGAN-based synthesizer for single-table synthetic data generation, provided by the SDV library.
Description
The CTGANSynthesizer wraps the ctgan.CTGAN model. It provides configurable neural network architecture (embedding, generator, and discriminator dimensions), training hyperparameters (learning rates, epochs, batch size), and GPU support.
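The three groups of options named above (architecture, training, hardware) can be collected in a plain dictionary before the synthesizer is built. The sketch below is only an illustration: `CTGAN_DEFAULTS` and `make_ctgan_kwargs` are hypothetical names that are not part of SDV, and the default values are copied from the signature documented on this page.

```python
# Defaults copied from the constructor signature documented on this page.
# CTGAN_DEFAULTS and make_ctgan_kwargs are hypothetical, not SDV names.
CTGAN_DEFAULTS = {
    # Network architecture
    "embedding_dim": 128,
    "generator_dim": (256, 256),
    "discriminator_dim": (256, 256),
    # Training hyperparameters
    "generator_lr": 2e-4,
    "generator_decay": 1e-6,
    "discriminator_lr": 2e-4,
    "discriminator_decay": 1e-6,
    "batch_size": 500,
    "discriminator_steps": 1,
    "log_frequency": True,
    "verbose": False,
    "epochs": 300,
    "pac": 10,
    # Hardware
    "enable_gpu": True,
}


def make_ctgan_kwargs(**overrides):
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = sorted(set(overrides) - set(CTGAN_DEFAULTS))
    if unknown:
        raise TypeError(f"Unknown CTGANSynthesizer option(s): {unknown}")
    return {**CTGAN_DEFAULTS, **overrides}


# Example: wider networks, fewer epochs, CPU only.
kwargs = make_ctgan_kwargs(
    generator_dim=(512, 512),
    discriminator_dim=(512, 512),
    epochs=100,
    enable_gpu=False,
)

# With SDV installed and a Metadata object available, the result could be
# passed on as: CTGANSynthesizer(metadata, **kwargs)
```

Rejecting unknown keys up front catches typos such as `epoch=100` before any training time is spent.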
Usage
Import this class when you need to generate single-table synthetic data using a deep learning approach, particularly for datasets with complex inter-column relationships.
Code Reference
Source Location
- Repository: SDV
- File: sdv/single_table/ctgan.py
- Lines: L109-229
Signature
class CTGANSynthesizer(LossValuesMixin, MissingModuleMixin, BaseSingleTableSynthesizer):
    def __init__(
        self,
        metadata,
        enforce_min_max_values=True,
        enforce_rounding=True,
        locales=['en_US'],
        embedding_dim=128,
        generator_dim=(256, 256),
        discriminator_dim=(256, 256),
        generator_lr=2e-4,
        generator_decay=1e-6,
        discriminator_lr=2e-4,
        discriminator_decay=1e-6,
        batch_size=500,
        discriminator_steps=1,
        log_frequency=True,
        verbose=False,
        epochs=300,
        pac=10,
        enable_gpu=True,
        cuda=None,
    ):
        """
        Args:
            metadata (Metadata): Single table metadata.
            enforce_min_max_values (bool): Clip to min/max. Defaults to True.
            enforce_rounding (bool): Round as original. Defaults to True.
            locales (list): Locale(s) for AnonymizedFaker. Defaults to ['en_US'].
            embedding_dim (int): Random sample size for Generator. Defaults to 128.
            generator_dim (tuple): Residual layer sizes. Defaults to (256, 256).
            discriminator_dim (tuple): Discriminator layer sizes. Defaults to (256, 256).
            generator_lr (float): Generator learning rate. Defaults to 2e-4.
            generator_decay (float): Generator weight decay. Defaults to 1e-6.
            discriminator_lr (float): Discriminator learning rate. Defaults to 2e-4.
            discriminator_decay (float): Discriminator weight decay. Defaults to 1e-6.
            batch_size (int): Batch size. Defaults to 500.
            discriminator_steps (int): Discriminator updates per generator update. Defaults to 1.
            log_frequency (bool): Use log frequency for conditional sampling. Defaults to True.
            verbose (bool): Print progress. Defaults to False.
            epochs (int): Training epochs. Defaults to 300.
            pac (int): PacGAN group size. Defaults to 10.
            enable_gpu (bool): Use GPU if available. Defaults to True.
            cuda (bool or str): Deprecated. Use enable_gpu instead.
        """
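Because `cuda` is documented as deprecated in favor of `enable_gpu`, calling code that still carries a `cuda` setting may want to translate it before constructing the synthesizer. The following is a caller-side sketch only: `resolve_enable_gpu` is a hypothetical helper, it does not reproduce SDV's internal handling, and treating any truthy `cuda` value (including a device string) as "use the GPU" is an assumption.

```python
import warnings


def resolve_enable_gpu(enable_gpu=True, cuda=None):
    """Hypothetical caller-side shim: fold a legacy `cuda` setting into `enable_gpu`.

    Not SDV code. Assumes any truthy `cuda` value means "use the GPU".
    """
    if cuda is None:
        # No legacy value supplied, so the new flag is used unchanged.
        return bool(enable_gpu)
    warnings.warn(
        "`cuda` is deprecated; pass `enable_gpu` instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return bool(cuda)


# A legacy configuration that disabled CUDA maps to enable_gpu=False.
legacy_flag = resolve_enable_gpu(cuda=False)
```

The returned value could then be passed as `enable_gpu=legacy_flag`, leaving `cuda` at its default of `None`.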
Import
from sdv.single_table import CTGANSynthesizer
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| metadata | Metadata | Yes | Single table metadata object |
| epochs | int | No | Training epochs (default: 300) |
| batch_size | int | No | Training batch size (default: 500) |
| embedding_dim | int | No | Generator input dimension (default: 128) |
| generator_dim | tuple | No | Generator layer sizes (default: (256, 256)) |
| discriminator_dim | tuple | No | Discriminator layer sizes (default: (256, 256)) |
| enable_gpu | bool | No | Attempt GPU computation (default: True) |
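When changing `batch_size`, it can help to check it against `pac` (listed in the signature) before training starts. The sketch below assumes, based on the behavior of the underlying `ctgan` package rather than anything stated on this page, that the batch size should be even and divisible by `pac`. `check_batch_settings` is a hypothetical helper, not an SDV function.

```python
def check_batch_settings(batch_size=500, pac=10):
    """Hypothetical pre-flight check; returns a list of problems (empty if none).

    The two constraints are assumptions about the underlying ctgan model:
    an even batch_size, and a batch_size divisible by pac.
    """
    if pac < 1:
        raise ValueError("pac must be a positive integer")
    problems = []
    if batch_size % 2 != 0:
        problems.append(f"batch_size={batch_size} is not even")
    if batch_size % pac != 0:
        problems.append(f"batch_size={batch_size} is not divisible by pac={pac}")
    return problems


# The documented defaults, followed by a combination that breaks the pac rule.
print(check_batch_settings())
print(check_batch_settings(batch_size=250, pac=8))
```

Returning a list instead of raising on the first problem lets a caller report every issue at once.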
Outputs
| Name | Type | Description |
|---|---|---|
| instance | CTGANSynthesizer | Unfitted synthesizer ready for .fit() call |
Usage Examples
Basic Usage
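from sdv.datasets.demo import download_demo
from sdv.single_table import CTGANSynthesizer

# Download a demo table together with its metadata.
data, metadata = download_demo(modality='single_table', dataset_name='fake_hotel_guests')

# Train for 100 epochs instead of the default 300, printing progress.
synthesizer = CTGANSynthesizer(metadata, epochs=100, verbose=True)
synthesizer.fit(data)

# Draw 500 synthetic rows from the fitted model.
synthetic_data = synthesizer.sample(num_rows=500)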
Related Pages
Implements Principle
Requires Environment
Uses Heuristic