Principle:Sdv dev SDV CTGAN Synthesis

Knowledge Sources	CTGAN SDV Documentation SDV
Domains	Deep_Learning, Synthetic_Data, Generative_Adversarial_Networks
Last Updated	2026-02-14 00:00 GMT

Overview

A deep learning-based technique that uses conditional generative adversarial networks to synthesize realistic tabular data with mixed column types.

Description

CTGAN (Conditional Tabular GAN) addresses the challenges of applying GANs to tabular data, where columns have mixed types (continuous and categorical), imbalanced categories, and multi-modal continuous distributions. It introduces mode-specific normalization for continuous columns and a conditional generator with training-by-sampling to handle imbalanced categorical columns.

Unlike copula-based approaches, CTGAN can capture complex non-linear relationships between columns. The trade-off is that it typically needs more training data and considerably longer training times than a Gaussian copula model.

Usage

Use CTGAN synthesis when the dataset has complex non-linear inter-column relationships that a Gaussian copula cannot capture. It is particularly useful for larger datasets where the additional training cost is justified by improved fidelity.
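As a rough illustration of this workflow, the sketch below fits a CTGAN synthesizer with SDV and samples synthetic rows. It assumes the SDV 1.x single-table API (`SingleTableMetadata`, `CTGANSynthesizer`); newer SDV releases favor a unified `Metadata` class, so adjust to the installed version. The helper name `fit_and_sample_ctgan` and its parameters are hypothetical and not part of SDV.

```python
def fit_and_sample_ctgan(real_data, num_rows, epochs=300):
    """Fit an SDV CTGAN synthesizer on a pandas DataFrame and return synthetic rows.

    Hypothetical helper for illustration; assumes the SDV 1.x single-table API.
    """
    # Imported lazily so this module still loads where sdv is not installed.
    from sdv.metadata import SingleTableMetadata
    from sdv.single_table import CTGANSynthesizer

    # Infer column types from the data; review the detected metadata before
    # training, since a mis-detected column type changes how it is modeled.
    metadata = SingleTableMetadata()
    metadata.detect_from_dataframe(real_data)

    # More epochs means longer training; the value here is only a starting point.
    synthesizer = CTGANSynthesizer(metadata, epochs=epochs)
    synthesizer.fit(real_data)
    return synthesizer.sample(num_rows=num_rows)
```

Calling `fit_and_sample_ctgan(df, num_rows=1000)` on a pandas DataFrame `df` would return a new DataFrame with the same columns. Training time grows with both the number of rows and the number of epochs, which is the cost this section refers to.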

Theoretical Basis

CTGAN consists of a generator and discriminator trained adversarially:

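1. Mode-Specific Normalization: Continuous values are normalized using a variational Gaussian mixture model to handle multi-modal distributions. Each value is encoded as a one-hot indicator of the mixture mode it is assigned to, plus a scalar normalized within that mode.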

2. Conditional Generator: During training, a categorical column and specific value are sampled as a condition. The generator must produce rows matching that condition.

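3. Training-by-Sampling: The condition for each training row is drawn using the log-frequency of each category rather than its raw frequency, so rare categories are seen often enough during training. This addresses class imbalance.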

4. PacGAN Discriminator: Multiple samples are packed together as input to the discriminator to prevent mode collapse.
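The components above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the SDV or CTGAN implementation: the mixture parameters are assumed to be already fitted and are passed in by hand, the `log(n + 1)` smoothing is a choice made for this sketch, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def mode_specific_normalize(value, weights, means, stds, rng):
    """Encode one continuous value as (mode index, scalar within that mode).

    The mixture parameters are assumed to come from an already fitted model.
    """
    # Weighted Gaussian density of the value under each mixture mode.
    densities = [
        w * math.exp(-0.5 * ((value - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in zip(weights, means, stds)
    ]
    # Sample a mode in proportion to those densities.
    mode = rng.choices(range(len(means)), weights=densities, k=1)[0]
    # Scale relative to the chosen mode; the CTGAN paper divides by
    # four standard deviations of that mode.
    alpha = (value - means[mode]) / (4 * stds[mode])
    return mode, alpha


def log_frequency_probs(column):
    """Probability of choosing each category as the training condition."""
    counts = Counter(column)
    # log(n + 1) is an assumption of this sketch; it keeps categories that
    # occur once from getting zero probability.
    log_freq = {cat: math.log(n + 1) for cat, n in counts.items()}
    total = sum(log_freq.values())
    return {cat: lf / total for cat, lf in log_freq.items()}


def sample_condition(column, rng):
    """Draw one category to condition on, using log-frequency probabilities."""
    probs = log_frequency_probs(column)
    cats = list(probs)
    return rng.choices(cats, weights=[probs[c] for c in cats], k=1)[0]


def pack(rows, pac):
    """Concatenate every `pac` consecutive rows into one discriminator input."""
    if len(rows) % pac != 0:
        raise ValueError("number of rows must be a multiple of pac")
    return [
        [x for row in rows[i:i + pac] for x in row]
        for i in range(0, len(rows), pac)
    ]
```

With a column of 99 `"a"` values and one `"b"`, `log_frequency_probs` gives `"b"` a probability above 0.1, compared with a raw frequency of 0.01. This is how rare categories receive more training signal. `pack` shows only the reshaping step: the discriminator judges several rows jointly, which makes a collapsed generator easier to detect.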

Related Pages

Implemented By

Implementation:Sdv_dev_SDV_CTGANSynthesizer_Init

Uses Heuristic

Heuristic:Sdv_dev_SDV_CTGAN_Column_Performance
