Principle:Sdv dev SDV Multi Table Model Fitting
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Synthetic_Data, Relational_Data |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
A multi-table training pipeline that preprocesses relational data, augments parent tables with child statistics, and fits per-table models.
Description
Multi-table model fitting extends single-table fitting to relational datasets. Each table is first preprocessed independently. Parent tables are then augmented with extension columns that summarize the rows of their child tables, so that parent-child dependencies are carried in the parent's own columns. Finally, each augmented table is fitted with its own single-table synthesizer. For HMA, the extension columns hold means, standard deviations, and frequency distributions of child columns, and each augmented table is fitted with a GaussianCopulaSynthesizer.
Usage
Initialize an HMASynthesizer with the multi-table metadata, then call fit on it with the training data. The data must be a dictionary mapping table names to pandas DataFrames, and the table names must match those declared in the metadata.
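The call sequence above can be sketched as a small wrapper. This is a minimal illustration, not SDV code: fit_hma is a hypothetical helper name, and the metadata object is assumed to have been built beforehand to describe the tables and their relationships. Only the HMASynthesizer constructor and its fit method are taken from the library.

```python
def fit_hma(data, metadata):
    """Fit an HMASynthesizer on a dict of table name -> DataFrame.

    `fit_hma` is a hypothetical helper for illustration; it is not part of SDV.
    """
    # The data must be a dictionary keyed by table name.
    if not isinstance(data, dict):
        raise TypeError("data must be a dict mapping table names to DataFrames")
    if not all(isinstance(name, str) for name in data):
        raise TypeError("every key in data must be a table name (str)")

    # Imported lazily so the input checks above run even without SDV installed.
    from sdv.multi_table import HMASynthesizer

    # Metadata is supplied at initialization; the data is supplied to fit().
    synthesizer = HMASynthesizer(metadata)
    synthesizer.fit(data)
    return synthesizer
```

A caller would pass something like {"users": users_df, "orders": orders_df} together with the matching metadata. Keeping the input check separate from the SDV import makes a malformed input fail early, before any preprocessing starts.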
Theoretical Basis
- Preprocessing: Each table is independently preprocessed via its DataProcessor
- Augmentation: Parent tables are augmented with statistical summaries of child columns
- Per-table fitting: Each augmented table is fitted with GaussianCopulaSynthesizer
- State tracking: The synthesizer records fitting metadata and marks itself as fitted
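The augmentation step in the list above can be illustrated with a simplified, standard-library sketch. It is not SDV's internal implementation: augment_parent, its parameters, and the extension column names are all hypothetical, and rows are plain dictionaries rather than DataFrames. For each parent row it adds the child row count, the mean and standard deviation of one numeric child column, and the relative frequencies of one categorical child column.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev


def augment_parent(parent_rows, child_rows, parent_key, foreign_key,
                   numeric_col, categorical_col):
    """Return copies of parent rows with child summary columns added.

    Hypothetical helper; column naming is illustrative only.
    """
    # Group child rows by the parent they reference.
    groups = defaultdict(list)
    for child in child_rows:
        groups[child[foreign_key]].append(child)

    augmented = []
    for parent in parent_rows:
        children = groups.get(parent[parent_key], [])
        row = dict(parent)  # copy so the input is left unchanged
        row["child__num_rows"] = len(children)

        # Mean and population standard deviation of the numeric child column.
        values = [c[numeric_col] for c in children]
        row[f"child__{numeric_col}__mean"] = mean(values) if values else None
        row[f"child__{numeric_col}__std"] = pstdev(values) if values else None

        # Relative frequency of each category among this parent's children.
        counts = Counter(c[categorical_col] for c in children)
        for category, count in sorted(counts.items()):
            row[f"child__{categorical_col}__freq__{category}"] = count / len(children)

        augmented.append(row)
    return augmented
```

In the described pipeline, each augmented parent table would then be passed to a single-table synthesizer, so the summary columns are modeled jointly with the parent's own columns. Parents without children receive a count of zero and empty summaries here; how a real implementation fills such gaps is outside this sketch.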