Implementation:Sdv dev SDV GaussianCopulaSynthesizer Init

From Leeroopedia
Knowledge Sources
Domains: Statistics, Synthetic_Data
Last Updated: 2026-02-14 00:00 GMT

Overview

Concrete tool for creating a Gaussian copula-based synthesizer for single-table synthetic data generation, provided by the SDV library.

Description

The GaussianCopulaSynthesizer wraps the copulas.multivariate.GaussianMultivariate model. Each numerical column can be assigned its own univariate distribution (norm, beta, truncnorm, uniform, gamma, or gaussian_kde) through numerical_distributions; columns without an override use the default distribution, which is 'beta' unless default_distribution is set. The synthesizer follows the standard SDV lifecycle: init, fit, sample, save, load.
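To make the copula idea concrete, here is a minimal standard-library sketch for two numerical columns. It is not the code used by SDV or the copulas package: it assumes empirical (rank-based) marginals rather than fitted parametric distributions such as beta, and all function names are hypothetical. It shows the three steps a Gaussian copula relies on: map each column to normal scores, estimate the correlation between those scores, then sample correlated normals and map them back to each column's own scale.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def to_normal_scores(values):
    """Map a column to standard-normal scores through its empirical CDF."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # rank / (n + 1) stays strictly inside (0, 1), so inv_cdf is defined.
        scores[idx] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences, clamped to [-1, 1]."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    rho = cov / math.sqrt(var_x * var_y)
    return max(-1.0, min(1.0, rho))  # guard against floating-point overshoot


def empirical_quantile(sorted_values, u):
    """Inverse empirical CDF: the observed value at probability level u."""
    idx = min(int(u * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


def fit_and_sample(col_a, col_b, num_rows, seed=0):
    """Fit a two-column Gaussian copula and draw num_rows synthetic pairs."""
    rho = correlation(to_normal_scores(col_a), to_normal_scores(col_b))
    sorted_a, sorted_b = sorted(col_a), sorted(col_b)
    rng = random.Random(seed)
    rows = []
    for _ in range(num_rows):
        # Draw correlated standard normals (2x2 Cholesky factor written out).
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map each normal draw back through the column's marginal.
        a = empirical_quantile(sorted_a, STD_NORMAL.cdf(z1))
        b = empirical_quantile(sorted_b, STD_NORMAL.cdf(z2))
        rows.append((a, b))
    return rho, rows
```

In this sketch the choice of univariate distribution corresponds to the marginal step: replacing the empirical CDF with a fitted parametric one is the role the per-column distribution setting plays in the synthesizer.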

Usage

Import this class when you need to generate single-table synthetic data using a statistical (non-deep-learning) approach. It is a common first choice among SDV's single-table synthesizers and suits many tabular datasets.

Code Reference

Source Location

  • Repository: SDV
  • File: sdv/single_table/copulas.py
  • Lines: L27-121

Signature

class GaussianCopulaSynthesizer(BaseSingleTableSynthesizer):
    def __init__(
        self,
        metadata,
        enforce_min_max_values=True,
        enforce_rounding=True,
        locales=['en_US'],
        numerical_distributions=None,
        default_distribution=None,
    ):
        """
        Args:
            metadata (Metadata): Single table metadata.
            enforce_min_max_values (bool): Clip to min/max seen during fit. Defaults to True.
            enforce_rounding (bool): Round as in original data. Defaults to True.
            locales (list): Locale(s) for AnonymizedFaker. Defaults to ['en_US'].
            numerical_distributions (dict): Per-column distribution overrides.
                Keys are column names, values are distribution strings.
            default_distribution (str): Default univariate distribution.
                Options: 'norm', 'beta', 'truncnorm', 'uniform', 'gamma', 'gaussian_kde'.
                Defaults to 'beta'.
        """

Import

from sdv.single_table import GaussianCopulaSynthesizer

I/O Contract

Inputs

Name | Type | Required | Description
metadata | Metadata | Yes | Single table metadata object
enforce_min_max_values | bool | No | Clip numerical values to observed range (default: True)
enforce_rounding | bool | No | Round numerical values as in original data (default: True)
locales | list | No | Locale(s) for fake data generation (default: ['en_US'])
numerical_distributions | dict | No | Per-column distribution overrides (column_name -> distribution_string)
default_distribution | str | No | Default univariate distribution (default: 'beta')
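The effect of enforce_min_max_values and enforce_rounding can be illustrated with a small standard-library sketch. This is an assumption about the semantics described in the inputs above, not SDV's internal code, and the helper names learn_column_stats and postprocess are hypothetical: statistics are recorded from the real column, then sampled values are clipped to the observed range and rounded to the observed number of decimal places.

```python
from decimal import Decimal


def learn_column_stats(values):
    """Record the min, max, and most decimal places seen in a column."""
    decimals = max(
        max(0, -Decimal(str(v)).normalize().as_tuple().exponent) for v in values
    )
    return {'min': min(values), 'max': max(values), 'decimals': decimals}


def postprocess(sampled, stats, enforce_min_max_values=True, enforce_rounding=True):
    """Apply the two optional constraints to raw sampled values."""
    result = []
    for value in sampled:
        if enforce_min_max_values:
            # Clip to the range observed in the real data.
            value = min(max(value, stats['min']), stats['max'])
        if enforce_rounding:
            # Round to the precision observed in the real data.
            value = round(value, stats['decimals'])
        result.append(value)
    return result


stats = learn_column_stats([10.5, 99.25, 42.0])
print(postprocess([5.1234, 57.98765, 120.0], stats))  # [10.5, 57.99, 99.25]
```

Setting either flag to False skips the corresponding step, so sampled values may fall outside the observed range or carry more decimal places than the original column.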

Outputs

Name | Type | Description
instance | GaussianCopulaSynthesizer | Unfitted synthesizer ready for .fit() call

Usage Examples

Basic Usage

from sdv.datasets.demo import download_demo
from sdv.single_table import GaussianCopulaSynthesizer

# Load demo data
data, metadata = download_demo(modality='single_table', dataset_name='fake_hotel_guests')

# Create synthesizer with default settings
synthesizer = GaussianCopulaSynthesizer(metadata)

# Fit and sample
synthesizer.fit(data)
synthetic_data = synthesizer.sample(num_rows=500)

Custom Distributions

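from sdv.datasets.demo import download_demo
from sdv.single_table import GaussianCopulaSynthesizer

# Load the demo table so that metadata is defined
data, metadata = download_demo(modality='single_table', dataset_name='fake_hotel_guests')

# Keys must name columns that exist in the metadata
synthesizer = GaussianCopulaSynthesizer(
    metadata,
    numerical_distributions={
        'amenities_fee': 'truncnorm',
        'room_rate': 'gamma',
    },
    default_distribution='norm',
)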
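Distribution overrides like these can be sanity-checked before the synthesizer is constructed. The helper below is a hypothetical, standard-library sketch and not part of SDV, which performs its own validation; the set of distribution names is taken from the signature documented on this page.

```python
SUPPORTED_DISTRIBUTIONS = {
    'norm', 'beta', 'truncnorm', 'uniform', 'gamma', 'gaussian_kde',
}


def check_distribution_overrides(numerical_distributions, column_names,
                                 default_distribution=None):
    """Return a list of problems; an empty list means the arguments look consistent."""
    problems = []
    known = set(column_names)
    for column, distribution in (numerical_distributions or {}).items():
        if column not in known:
            problems.append(f"column '{column}' is not in the table")
        if distribution not in SUPPORTED_DISTRIBUTIONS:
            problems.append(
                f"'{distribution}' is not a supported distribution for '{column}'"
            )
    if (default_distribution is not None
            and default_distribution not in SUPPORTED_DISTRIBUTIONS):
        problems.append(
            f"'{default_distribution}' is not a supported default distribution"
        )
    return problems
```

A call such as check_distribution_overrides({'room_rate': 'gamma'}, list(data.columns), 'norm') would return an empty list when the column exists in the table, while a misspelled column or distribution name produces one message per problem.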

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
