Principle:Sdv dev SDV HMA Synthesis

From Leeroopedia
Knowledge Sources
Domains Synthetic_Data, Relational_Data, Hierarchical_Modeling
Last Updated 2026-02-14 00:00 GMT

Overview

A hierarchical modeling algorithm that synthesizes multi-table relational data by augmenting parent tables with statistical summaries of their child tables.

Description

HMA (Hierarchical Modeling Algorithm) addresses the core challenge of multi-table synthesis: preserving both referential integrity and the statistical relationships between tables. It augments each parent table with extension columns that summarize its child tables: the number of child rows per parent, the mean and standard deviation of each numerical child column, and the frequency distribution of each categorical child column. A single-table synthesizer (GaussianCopulaSynthesizer) is then fitted to each augmented table. During sampling, parent rows are generated first, and the extension column values of each parent row parameterize the generation of its child rows. Because every child row is generated for an already-sampled parent row, its foreign key always refers to an existing parent, which preserves referential integrity.

Usage

Use HMA synthesis for multi-table relational datasets where the relationships between tables must be preserved. It is the default and primary multi-table synthesizer in SDV, where it is available as HMASynthesizer.
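As a rough illustration of typical usage, the sketch below wraps the fit-and-sample workflow in one function. It assumes a recent SDV 1.x release in which `sdv.metadata.Metadata` provides `detect_from_dataframes` (older releases use `MultiTableMetadata` instead). The function name `synthesize_with_hma` and its arguments are hypothetical. Auto-detected metadata should be reviewed before fitting, since detected keys and relationships may need correction.

```python
def synthesize_with_hma(tables, scale=1.0):
    """Fit HMA on a dict of pandas DataFrames and return synthetic tables.

    `tables` maps table names to DataFrames, e.g. {"users": ..., "orders": ...}.
    This helper is illustrative and not part of SDV.
    """
    # Imports are deferred so the function can be defined even where SDV
    # is not installed; they run only when the function is called.
    from sdv.metadata import Metadata  # assumption: recent SDV 1.x release
    from sdv.multi_table import HMASynthesizer

    # Detect column types, keys, and relationships from the data.
    # The detected metadata should be checked before relying on it.
    metadata = Metadata.detect_from_dataframes(tables)

    synthesizer = HMASynthesizer(metadata)
    synthesizer.fit(tables)

    # scale=1.0 requests synthetic tables of roughly the original size.
    return synthesizer.sample(scale=scale)
```

Calling the function with a dictionary of related DataFrames returns a dictionary of synthetic DataFrames keyed by the same table names.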

Theoretical Basis

HMA operates in two phases:

Fitting Phase:

  1. For each parent-child relationship, compute extension columns on the parent:
    • Count of child rows per parent
    • Mean and standard deviation of each numerical child column
    • Frequency distributions of categorical child columns
  2. Fit a GaussianCopulaSynthesizer on each augmented parent table
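
The fitting steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not SDV's implementation: the example tables, the `add_extension_columns` helper, and the `__`-prefixed column names are hypothetical. Only counts, means, and standard deviations are computed; categorical frequencies and the final GaussianCopulaSynthesizer fit are left out.

```python
from collections import defaultdict
from statistics import mean, pstdev


def add_extension_columns(parents, children, parent_key, foreign_key, numeric_cols):
    """Return copies of the parent rows augmented with child summaries."""
    # Group child rows by the parent they reference.
    groups = defaultdict(list)
    for child in children:
        groups[child[foreign_key]].append(child)

    augmented = []
    for parent in parents:
        kids = groups.get(parent[parent_key], [])
        row = dict(parent)
        # Count of child rows for this parent.
        row["__num_rows"] = len(kids)
        for col in numeric_cols:
            values = [kid[col] for kid in kids]
            # Parents without children get None so the gap stays visible.
            row[f"__{col}__mean"] = mean(values) if values else None
            # Population standard deviation, defined even for a single child.
            row[f"__{col}__std"] = pstdev(values) if values else None
        augmented.append(row)
    return augmented


# Hypothetical two-table dataset: users (parent) and orders (child).
users = [{"user_id": 1, "country": "US"}, {"user_id": 2, "country": "FR"}]
orders = [
    {"order_id": 10, "user_id": 1, "amount": 10.0},
    {"order_id": 11, "user_id": 1, "amount": 30.0},
    {"order_id": 12, "user_id": 2, "amount": 5.0},
]

augmented_users = add_extension_columns(
    users, orders, parent_key="user_id", foreign_key="user_id", numeric_cols=["amount"]
)
```

Each augmented parent row now carries its original columns plus the summaries of its children, which is the table a single-table synthesizer would be fitted on.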

Sampling Phase:

  1. Sample root table rows (including extension columns)
  2. For each parent row, use extension column values to parameterize child generation
  3. Recurse down the hierarchy, sampling children and then grandchildren in the same way
  4. Drop extension columns from final output
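
The sampling steps above can be sketched for one parent-child level as follows. This is a minimal sketch, not SDV's implementation: the sampled parent rows are hard-coded stand-ins for the output of a fitted parent synthesizer, the helper names and `__`-prefixed columns are hypothetical, and an independent normal draw is swapped in for the per-parent child model.

```python
import random


def sample_children(augmented_parents, parent_key, foreign_key, numeric_col, seed=0):
    """Generate child rows for each parent from its extension columns."""
    rng = random.Random(seed)
    children = []
    next_id = 1
    for parent in augmented_parents:
        # A synthesized count is continuous, so round it to a valid row count.
        num_rows = max(0, round(parent["__num_rows"]))
        mu = parent[f"__{numeric_col}__mean"]
        sigma = parent[f"__{numeric_col}__std"]
        for _ in range(num_rows):
            children.append({
                "order_id": next_id,
                # The foreign key is copied from the parent row, so it
                # always references a parent that exists.
                foreign_key: parent[parent_key],
                numeric_col: rng.gauss(mu, sigma),
            })
            next_id += 1
    return children


def drop_extension_columns(rows):
    """Remove the hypothetical '__'-prefixed extension columns."""
    return [{k: v for k, v in row.items() if not k.startswith("__")} for row in rows]


# Stand-in for step 1: parent rows as a parent synthesizer might emit them.
sampled_users = [
    {"user_id": 1, "country": "US", "__num_rows": 2.2,
     "__amount__mean": 20.0, "__amount__std": 10.0},
    {"user_id": 2, "country": "FR", "__num_rows": 2.8,
     "__amount__mean": 5.0, "__amount__std": 0.0},
]

synthetic_orders = sample_children(sampled_users, "user_id", "user_id", "amount")
synthetic_users = drop_extension_columns(sampled_users)
```

For deeper hierarchies, the same procedure would be repeated with each sampled child table acting as the parent of its own children.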
