Principle:Sdv dev SDV Single Table Model Fitting

From Leeroopedia
Knowledge Sources
Domains: Machine_Learning, Synthetic_Data
Last Updated: 2026-02-14 00:00 GMT

Overview

A data preprocessing and model training pipeline that transforms raw tabular data and fits a statistical or neural model to learn its distributions.

Description

Single-table model fitting is the core training step in any SDV synthesis workflow. It takes a raw pandas DataFrame, preprocesses it with the DataProcessor (which relies on the HyperTransformer for type conversion, anonymization, and numerical formatting), and then fits the underlying statistical or neural model on the transformed data.

The fitting pipeline handles constraint transformations (if any constraints are registered), column type conversions, and missing value imputation. Its output is a fitted model that can subsequently generate synthetic data.

Usage

Fit the synthesizer after initializing it and before sampling. Call the fit method with the complete real dataset; once fitting finishes, the synthesizer is ready to generate synthetic data via the sample method.

Theoretical Basis

The fitting pipeline follows these stages:

  1. Preprocessing: Raw data passes through DataProcessor, which applies HyperTransformer to convert columns to model-compatible formats
  2. Constraint transformation: If constraints are registered, the data is transformed into a representation in which those constraints are enforced
  3. Model fitting: The preprocessed data is passed to the synthesizer's _fit method, which fits the underlying model (e.g., GaussianMultivariate.fit or CTGAN.fit)
  4. State update: The synthesizer marks itself as fitted and records metadata about the fit operation
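
The four stages above can be sketched with a toy, standard-library-only class. This is not SDV code: ToySynthesizer, its method names, and the as_gap constraint are hypothetical, the "model" is just an independent Gaussian per column, and the stage order simply mirrors the list above rather than asserting how SDV orders its internals.

```python
import random
from datetime import datetime, timezone
from statistics import mean, pstdev


class ToySynthesizer:
    """Hypothetical stand-in for a single-table synthesizer (not SDV code)."""

    def __init__(self, constraints=()):
        # Each constraint is a hypothetical callable: row dict -> row dict.
        self.constraints = list(constraints)
        self.fitted = False
        self.fit_info = {}
        self._params = {}

    def _preprocess(self, rows):
        # Stage 1: cast values to float and impute missing entries
        # with the column mean.
        columns = list(rows[0])
        fill = {
            c: mean(float(r[c]) for r in rows if r[c] is not None)
            for c in columns
        }
        return [
            {c: fill[c] if r[c] is None else float(r[c]) for c in columns}
            for r in rows
        ]

    def _apply_constraints(self, rows):
        # Stage 2: run each registered constraint transform over every row.
        for constraint in self.constraints:
            rows = [constraint(dict(row)) for row in rows]
        return rows

    def _fit(self, rows):
        # Stage 3: fit the toy model (mean and std dev per column).
        for col in rows[0]:
            values = [r[col] for r in rows]
            self._params[col] = (mean(values), pstdev(values))

    def fit(self, rows):
        processed = self._apply_constraints(self._preprocess(rows))
        self._fit(processed)
        # Stage 4: mark as fitted and record metadata about the fit.
        self.fitted = True
        self.fit_info = {
            "num_rows": len(rows),
            "fitted_at": datetime.now(timezone.utc),
        }

    def sample(self, num_rows, seed=None):
        if not self.fitted:
            raise RuntimeError("Call fit() before sample().")
        rng = random.Random(seed)
        return [
            {c: rng.gauss(mu, sd) for c, (mu, sd) in self._params.items()}
            for _ in range(num_rows)
        ]


def as_gap(row):
    # Hypothetical constraint transform: store "high" as its gap above "low".
    row["high"] = row["high"] - row["low"]
    return row


rows = [
    {"low": 1, "high": 4},
    {"low": 2, "high": None},  # missing value, imputed in stage 1
    {"low": 3, "high": 8},
]
synth = ToySynthesizer(constraints=[as_gap])
synth.fit(rows)
samples = synth.sample(2, seed=0)
```

Samples here stay in the transformed space; a fuller sketch would also reverse the constraint and preprocessing transforms before returning rows.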

Related Pages

Implemented By