Principle:Sdv dev SDV Single Table Model Fitting

Knowledge Sources	SDV Documentation SDV
Domains	Machine_Learning, Synthetic_Data
Last Updated	2026-02-14 00:00 GMT

Overview

A data preprocessing and model training pipeline that transforms raw tabular data and fits a statistical or neural model to learn its distributions.

Description

Single-table model fitting is the core training step in any SDV synthesis workflow. It takes a raw DataFrame, applies a series of preprocessing transformations (type conversion, anonymization, numerical formatting via the DataProcessor and HyperTransformer), and then fits the underlying statistical or neural model on the transformed data.

The fitting pipeline handles constraint transformations (if any constraints are registered), column type conversions, and missing value imputation. Its output is a fitted model that can subsequently generate synthetic data.

Usage

Use model fitting after initializing a synthesizer and before sampling. The fit method must be called with the complete real dataset. After fitting, the synthesizer is ready to generate synthetic data via the sample method.

Theoretical Basis

The fitting pipeline follows these stages:

Preprocessing: Raw data passes through the DataProcessor, which applies the HyperTransformer to convert columns to model-compatible formats.
Constraint transformation: If constraints are registered, the data is transformed into a representation in which those constraints can be enforced.
Model fitting: The preprocessed data is passed to the synthesizer's _fit method, which trains the underlying model (e.g., GaussianMultivariate.fit or CTGAN.fit).
State update: The synthesizer marks itself as fitted and records metadata about the fit operation.
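The four stages above can be illustrated with a toy, standard-library-only sketch. Everything in it is hypothetical: ToySynthesizer, gap_constraint, and the attribute names are invented for this example and are not SDV classes. The "model" is just a per-column mean and standard deviation, standing in for a real statistical or neural model.

```python
from datetime import datetime, timezone
from statistics import mean, pstdev


def gap_constraint(table):
    """Hypothetical constraint: encode high >= low by storing the gap high - low."""
    out = dict(table)
    out["high"] = [h - l for h, l in zip(table["high"], table["low"])]
    return out


class ToySynthesizer:
    """Illustrative stand-in for the four stages; not SDV's implementation."""

    def __init__(self, constraints=None):
        self.constraints = list(constraints or [])
        self.fitted = False
        self.fitted_at = None
        self.params = {}

    def _preprocess(self, table):
        # Stage 1: cast values to float and impute missing entries with the column mean.
        processed = {}
        for name, values in table.items():
            fill = mean(float(v) for v in values if v is not None)
            processed[name] = [fill if v is None else float(v) for v in values]
        return processed

    def _transform_constraints(self, table):
        # Stage 2: apply each registered constraint transformation in order.
        for constraint in self.constraints:
            table = constraint(table)
        return table

    def _fit(self, table):
        # Stage 3: the toy "model" is a per-column mean and population std.
        self.params = {
            name: {"mean": mean(values), "std": pstdev(values)}
            for name, values in table.items()
        }

    def fit(self, table):
        processed = self._preprocess(table)
        processed = self._transform_constraints(processed)
        self._fit(processed)
        # Stage 4: mark the synthesizer as fitted and record when it happened.
        self.fitted = True
        self.fitted_at = datetime.now(timezone.utc)


table = {"low": [1, 2, None, 4], "high": [3, 5, 6, 4]}
synth = ToySynthesizer(constraints=[gap_constraint])
synth.fit(table)
print(synth.fitted, synth.params)
```

Keeping the public fit method separate from the private _fit step mirrors the split described above: the shared pipeline owns preprocessing and state, while each model only needs to supply its own training routine.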

Related Pages

Implemented By

Implementation:Sdv_dev_SDV_BaseSynthesizer_Fit
