Implementation:Sdv dev SDV BaseSynthesizer Fit
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Synthetic_Data |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Concrete tool for fitting a single-table synthesizer on real data, provided by the SDV library.
Description
The BaseSynthesizer.fit method is the primary training entry point for all single-table synthesizers. It first preprocesses the raw DataFrame through the DataProcessor pipeline, then passes the processed data to the model-specific _fit method. Once fitting completes, the synthesizer is marked as fitted (self._fitted = True) and is ready for sampling.
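The flow described above (preprocess, delegate to a model-specific _fit, then mark the object as fitted) can be illustrated with a minimal sketch. The classes ToyBaseSynthesizer and ToyMeanSynthesizer and the _preprocess helper are hypothetical stand-ins invented for illustration. They are not SDV code, and the real method does considerably more work.

```python
class ToyBaseSynthesizer:
    """Hypothetical stand-in for a base synthesizer (not SDV code)."""

    def __init__(self):
        self._fitted = False

    def _preprocess(self, data):
        # Assumption: stands in for the DataProcessor pipeline.
        # Here it only drops missing values from a list of numbers.
        return [value for value in data if value is not None]

    def _fit(self, processed_data):
        # Each concrete model supplies its own training logic.
        raise NotImplementedError

    def fit(self, data):
        # Shared entry point: preprocess, train, then flag as fitted.
        self._fitted = False
        processed = self._preprocess(data)
        self._fit(processed)
        self._fitted = True


class ToyMeanSynthesizer(ToyBaseSynthesizer):
    """Hypothetical model that learns only the mean of the data."""

    def _fit(self, processed_data):
        self.mean_ = sum(processed_data) / len(processed_data)


toy = ToyMeanSynthesizer()
result = toy.fit([1.0, None, 3.0])  # returns None; the state lives on the object
```

Keeping the shared steps in the base class means every subclass only has to implement _fit, which is why a single fit entry point can serve all single-table synthesizers.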
Usage
Call this method after initializing any single-table synthesizer (GaussianCopulaSynthesizer, CTGANSynthesizer, etc.) with a metadata object. The data must be a pandas DataFrame matching the metadata schema.
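To make "matching the metadata schema" concrete, the sketch below compares a set of data column names against the column names a schema expects. The helper find_schema_mismatches and the column names are hypothetical and chosen only for illustration. SDV performs its own validation internally, and this is not a reproduction of it.

```python
def find_schema_mismatches(data_columns, metadata_columns):
    """Return (missing, unexpected) column names, each as a sorted list.

    Hypothetical helper for illustration; not part of SDV.
    """
    data_set = set(data_columns)
    metadata_set = set(metadata_columns)
    missing = sorted(metadata_set - data_set)      # in the schema, absent from the data
    unexpected = sorted(data_set - metadata_set)   # in the data, absent from the schema
    return missing, unexpected


# Illustrative column names only (assumed, not taken from any SDV dataset).
metadata_columns = ['id', 'age', 'city']
data_columns = ['id', 'age', 'signup_date']

missing, unexpected = find_schema_mismatches(data_columns, metadata_columns)
```

With a real pandas DataFrame, data.columns could be passed as the first argument. A check like this before calling fit can surface column mismatches early.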
Code Reference
Source Location
- Repository: SDV
- File: sdv/single_table/base.py
- Lines: L675-701
Signature
def fit(self, data):
    """Fit this model to the original data.
    Args:
        data (pandas.DataFrame):
            The raw data (before any transformations) to fit the model to.
    """
Import
from sdv.single_table import GaussianCopulaSynthesizer
# fit is called as: synthesizer.fit(data)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| data | pd.DataFrame | Yes | Raw data before any transformations |
Outputs
| Name | Type | Description |
|---|---|---|
| (mutates self) | None | Returns nothing; preprocesses the data, trains the model, then sets self._fitted = True |
Usage Examples
from sdv.datasets.demo import download_demo
from sdv.single_table import GaussianCopulaSynthesizer
data, metadata = download_demo(modality='single_table', dataset_name='fake_hotel_guests')
synthesizer = GaussianCopulaSynthesizer(metadata)
# Fit the synthesizer on real data
synthesizer.fit(data)