
Principle:Sdv dev SDV Gaussian Copula Synthesis

From Leeroopedia
Knowledge Sources
Domains Statistics, Synthetic_Data, Probabilistic_Modeling
Last Updated 2026-02-14 00:00 GMT

Overview

A statistical modeling technique that captures multivariate dependencies between columns using a Gaussian copula to generate synthetic tabular data.

Description

Gaussian copula synthesis separates the modeling of individual column distributions (marginals) from the modeling of inter-column dependencies (the copula). Each column is first mapped to a standard normal scale: its fitted univariate distribution (e.g., beta, truncated normal, gamma) converts values to cumulative probabilities, and the inverse standard normal CDF converts those probabilities to normal scores. The correlations between the transformed columns are then captured by a multivariate Gaussian distribution. During sampling, correlated normal samples are drawn and inverse-transformed through each column's marginal distribution to produce synthetic data that follows the fitted marginals and correlations.

This approach is computationally efficient, interpretable, and works well for datasets with moderate complexity. It is the default and recommended synthesizer in SDV for most single-table use cases.
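The separation of marginals from dependence described above can be illustrated with a minimal NumPy/SciPy sketch. This is not SDV's implementation: the correlation value, the gamma and beta parameters, and the variable names (`income`, `ratio`) are assumptions chosen for illustration. It shows one latent Gaussian correlation structure being pushed through two different marginals, with the rank correlation unchanged by the monotone transforms.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Dependence structure: correlation matrix of the latent normal variables
# (0.8 is an illustrative value, not taken from any dataset).
corr = np.array([[1.0, 0.8],
                 [0.8, 1.0]])
z = rng.multivariate_normal(mean=np.zeros(2), cov=corr, size=5000)

# Map the latent normals to uniforms, then through two different marginals.
u = stats.norm.cdf(z)
income = stats.gamma.ppf(u[:, 0], a=2.0, scale=1000.0)  # skewed, positive
ratio = stats.beta.ppf(u[:, 1], a=2.0, b=5.0)           # bounded in (0, 1)

# The transforms are monotone increasing, so ranks (and hence the
# Spearman correlation) carry over from the latent space to the output.
rho_latent, _ = stats.spearmanr(z[:, 0], z[:, 1])
rho_output, _ = stats.spearmanr(income, ratio)
print(f"latent Spearman: {rho_latent:.3f}, output Spearman: {rho_output:.3f}")
```

Swapping in other marginals changes the shape of each column without touching the dependence, which is why the two parts can be fitted independently.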

Usage

Use Gaussian copula synthesis when generating single-table synthetic data where statistical fidelity of column distributions and correlations is important. It is preferred over deep learning approaches (e.g., CTGAN) when the dataset is small to medium-sized, when training speed matters, or when interpretable learned distributions are needed (e.g., via get_parameters or get_learned_distributions).

Theoretical Basis

The Gaussian copula model works in three stages:

1. Marginal Fitting: Each column $X_i$ is fitted with a univariate distribution $F_i$ (e.g., Beta, Gaussian KDE, Truncated Normal).

2. Copula Estimation: Transform each column to uniform via the probability integral transform: $U_i = F_i(X_i)$

Then transform to standard normal: $Z_i = \Phi^{-1}(U_i)$

Estimate the correlation matrix $\Sigma$ of the $Z_i$ vectors.

3. Sampling: Draw samples from $\mathcal{N}(0, \Sigma)$, then invert through $\Phi$ and $F_i^{-1}$: $X_i^{\text{synth}} = F_i^{-1}(\Phi(Z_i^{\text{synth}}))$
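The three stages can be sketched end to end with NumPy and SciPy. This is a minimal illustration under stated assumptions, not the SDV implementation: the toy data, the choice of a gamma and a normal marginal, and the clipping constant `eps` are all hypothetical and chosen only to keep the example small.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Toy "real" data (assumed for illustration): two dependent columns,
# one gamma-distributed and one normally distributed.
n = 2000
latent = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.7], [0.7, 1.0]], size=n)
real = np.column_stack([
    stats.gamma.ppf(stats.norm.cdf(latent[:, 0]), a=3.0, scale=2.0),
    50.0 + 5.0 * latent[:, 1],
])

# Stage 1: fit a univariate distribution F_i to each column.
marginals = [
    stats.gamma(*stats.gamma.fit(real[:, 0], floc=0)),
    stats.norm(*stats.norm.fit(real[:, 1])),
]

# Stage 2: U_i = F_i(X_i), then Z_i = Phi^{-1}(U_i), then estimate Sigma.
eps = 1e-6  # keep U strictly inside (0, 1) so Phi^{-1} stays finite
u = np.column_stack([m.cdf(real[:, i]) for i, m in enumerate(marginals)])
z = stats.norm.ppf(np.clip(u, eps, 1 - eps))
sigma = np.corrcoef(z, rowvar=False)

# Stage 3: draw Z ~ N(0, Sigma), then X_i = F_i^{-1}(Phi(Z_i)).
z_synth = rng.multivariate_normal(np.zeros(2), sigma, size=n)
u_synth = stats.norm.cdf(z_synth)
synth = np.column_stack(
    [m.ppf(u_synth[:, i]) for i, m in enumerate(marginals)]
)

print("estimated latent correlation:", round(float(sigma[0, 1]), 3))
print("real column means: ", real.mean(axis=0).round(2))
print("synth column means:", synth.mean(axis=0).round(2))
```

In this sketch the marginal families are fixed by hand; a practical synthesizer would also need to select a family per column and handle categorical, datetime, and missing values, none of which is shown here.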

Related Pages

Implemented By

Uses Heuristic
