Implementation:Sdv dev SDV GaussianCopulaSynthesizer Init
| Knowledge Sources | |
|---|---|
| Domains | Statistics, Synthetic_Data |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Concrete tool for creating a Gaussian copula-based synthesizer for single-table synthetic data generation, provided by the SDV library.
Description
The GaussianCopulaSynthesizer wraps the copulas.multivariate.GaussianMultivariate model. It supports configurable per-column univariate distributions (norm, beta, truncnorm, uniform, gamma, gaussian_kde) and a default distribution. The synthesizer inherits the standard SDV lifecycle: init, fit, sample, save, load.
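To make the Gaussian copula idea concrete, here is a minimal two-column sketch using only the Python standard library. It is not the code used by SDV or by copulas.multivariate.GaussianMultivariate: it assumes empirical (rank-based) marginals instead of the parametric distributions listed above, and the names `to_normal_scores`, `from_normal_score`, `pearson`, and `fit_and_sample` are hypothetical. The steps are: map each column to standard-normal scores, measure their correlation, draw correlated normal pairs, then map the pairs back to the data scale.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def to_normal_scores(column):
    """Map each value to a standard-normal score via its empirical CDF rank."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1).
        scores[i] = STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def from_normal_score(z, sorted_column):
    """Map a normal score back to the data scale via the empirical quantile."""
    u = STD_NORMAL.cdf(z)
    idx = min(int(u * len(sorted_column)), len(sorted_column) - 1)
    return sorted_column[idx]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def fit_and_sample(x, y, num_rows, seed=0):
    """Fit a two-column Gaussian copula and draw num_rows synthetic pairs."""
    rng = random.Random(seed)
    # "Fit": the correlation of the normal scores is the copula parameter.
    rho = pearson(to_normal_scores(x), to_normal_scores(y))
    # Guard against tiny negative values caused by floating-point error.
    residual_scale = max(0.0, 1.0 - rho * rho) ** 0.5
    xs, ys = sorted(x), sorted(y)
    rows = []
    for _ in range(num_rows):
        # "Sample": draw a correlated normal pair, then invert the marginals.
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + residual_scale * rng.gauss(0, 1)
        rows.append((from_normal_score(z1, xs), from_normal_score(z2, ys)))
    return rho, rows
```

Because the marginals are empirical here, every sampled value is one of the observed values; a parametric marginal such as 'beta' or 'gamma' would instead produce new values drawn from the fitted distribution.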
Usage
Import this class to generate single-table synthetic data with a statistical (non-deep-learning) model. It is a common starting point in SDV and is suitable for most tabular datasets.
Code Reference
Source Location
- Repository: SDV
- File: sdv/single_table/copulas.py
- Lines: L27-121
Signature
class GaussianCopulaSynthesizer(BaseSingleTableSynthesizer):
    def __init__(
        self,
        metadata,
        enforce_min_max_values=True,
        enforce_rounding=True,
        locales=['en_US'],
        numerical_distributions=None,
        default_distribution=None,
    ):
        """
        Args:
            metadata (Metadata): Single table metadata.
            enforce_min_max_values (bool): Clip to min/max seen during fit. Defaults to True.
            enforce_rounding (bool): Round as in original data. Defaults to True.
            locales (list): Locale(s) for AnonymizedFaker. Defaults to ['en_US'].
            numerical_distributions (dict): Per-column distribution overrides.
                Keys are column names, values are distribution strings.
            default_distribution (str): Default univariate distribution.
                Options: 'norm', 'beta', 'truncnorm', 'uniform', 'gamma', 'gaussian_kde'.
                Defaults to 'beta'.
        """
Import
from sdv.single_table import GaussianCopulaSynthesizer
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| metadata | Metadata | Yes | Single table metadata object |
| enforce_min_max_values | bool | No | Clip numerical values to observed range (default: True) |
| enforce_rounding | bool | No | Round numerical values as in original data (default: True) |
| locales | list | No | Locale(s) for fake data generation (default: ['en_US']) |
| numerical_distributions | dict | No | Per-column distribution overrides (column_name -> distribution_string) |
| default_distribution | str | No | Default univariate distribution (default: 'beta') |
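The effect of enforce_min_max_values and enforce_rounding can be pictured as a post-processing step on sampled numbers. The sketch below is an illustration only, not SDV's implementation: `learn_constraints` and `post_process` are hypothetical helpers, and the decimal-place detection via `Decimal` is an assumption made for the example.

```python
from decimal import Decimal


def learn_constraints(train_values):
    """Record the observed min, max, and largest number of decimal places."""
    decimals = max(
        max(0, -Decimal(str(v)).normalize().as_tuple().exponent)
        for v in train_values
    )
    return min(train_values), max(train_values), decimals


def post_process(sampled, constraints,
                 enforce_min_max_values=True, enforce_rounding=True):
    """Optionally clip sampled values to the observed range and round them."""
    low, high, decimals = constraints
    result = []
    for value in sampled:
        if enforce_min_max_values:
            value = min(max(value, low), high)  # clip into [low, high]
        if enforce_rounding:
            value = round(value, decimals)  # match the training precision
        result.append(value)
    return result
```

With both flags left at True, a sampled value outside the training range is pulled back to the nearest bound and every value carries at most as many decimal places as the training data; turning a flag off skips the corresponding step.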
Outputs
| Name | Type | Description |
|---|---|---|
| instance | GaussianCopulaSynthesizer | Unfitted synthesizer ready for .fit() call |
Usage Examples
Basic Usage
from sdv.datasets.demo import download_demo
from sdv.single_table import GaussianCopulaSynthesizer
# Load demo data
data, metadata = download_demo(modality='single_table', dataset_name='fake_hotel_guests')
# Create synthesizer with default settings
synthesizer = GaussianCopulaSynthesizer(metadata)
# Fit and sample
synthesizer.fit(data)
synthetic_data = synthesizer.sample(num_rows=500)
Custom Distributions
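from sdv.single_table import GaussianCopulaSynthesizer
# Reuses `metadata` from the Basic Usage example (fake_hotel_guests).
# Keys must be names of numerical columns that exist in that metadata.
synthesizer = GaussianCopulaSynthesizer(
    metadata,
    numerical_distributions={
        'amenities_fee': 'truncnorm',
        'room_rate': 'gamma',
    },
    default_distribution='norm',
)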
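Before passing overrides to the constructor, it can help to check them against the options listed in the signature. The helper below is a hypothetical pre-flight check written for this page, not an SDV function; SDV performs its own validation, and the exact errors it raises are not reproduced here. `ALLOWED_DISTRIBUTIONS` simply mirrors the option list documented above.

```python
# Distribution names taken from the options documented for default_distribution.
ALLOWED_DISTRIBUTIONS = {
    'norm', 'beta', 'truncnorm', 'uniform', 'gamma', 'gaussian_kde',
}


def check_numerical_distributions(numerical_distributions, column_names,
                                  default_distribution='beta'):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if default_distribution not in ALLOWED_DISTRIBUTIONS:
        problems.append(f"unknown default distribution: {default_distribution!r}")
    for column, distribution in (numerical_distributions or {}).items():
        if column not in column_names:
            problems.append(f"column not in table: {column!r}")
        if distribution not in ALLOWED_DISTRIBUTIONS:
            problems.append(
                f"unknown distribution for {column!r}: {distribution!r}"
            )
    return problems
```

A call such as `check_numerical_distributions({'room_rate': 'gamma'}, ['room_rate', 'amenities_fee'], 'norm')` returns an empty list, while a misspelled column or distribution name shows up as one entry per problem.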