Implementation:Gretelai Gretel synthetics ACTGAN Fit

From Leeroopedia
Knowledge Sources
Domains Synthetic_Data, GAN, Tabular_Data
Last Updated 2026-02-14 19:00 GMT

Overview

Concrete tool for fitting the ACTGAN model to a tabular dataset, including datetime detection, metadata transformation, and data encoding, provided by the gretel-synthetics library.

Description

The ACTGAN.fit(data) method orchestrates a multi-stage data preparation and model fitting pipeline. It performs the following steps in order:

  1. Datetime auto-detection: If auto_transform_datetimes=True, calls SDVTableMetadata.fit_datetime(data) to scan columns for datetime patterns and automatically configure UnixTimestampEncoder transformers.
  2. Empty column handling: Calls SDVTableMetadata.fit_empty_columns(data) to identify all-NaN columns and assign an EmptyFieldTransformer.
  3. SDV metadata transform: Delegates to the SDV BaseTabularModel.fit(), which applies the configured field transformers via the RDT package. For efficiency, the DataFrame is converted to a ColumnarDF representation during this phase to avoid expensive pd.concat operations.
  4. Categorical column detection: In _fit(), columns are classified as categorical or continuous by inspecting their dtype and value patterns. Float columns containing only 0.0 and 1.0 are treated as boolean.
  5. DataTransformer.fit(): Fits a ClusterBasedNormalizer (Bayesian GMM) for each continuous column and either a OneHotEncoder or a BinaryEncodingTransformer for each discrete column, depending on the binary_encoder_cutoff threshold. This call is made from within the pre-fit transform described in step 6.
  6. Pre-fit transform: ACTGANSynthesizer._pre_fit_transform() creates the DataTransformer, runs the DataTransformer.fit() call from step 5, transforms the data into a decoded TrainData representation, and sets up activation functions and conditional loss column ranges.
  7. GAN training: Calls ACTGANSynthesizer._actual_fit() to run the adversarial training loop.
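
The empty-column and categorical-detection steps above can be illustrated with a small, dependency-free sketch. It is not the library's code: columns are modeled as plain Python lists rather than pandas Series, and the helper names is_missing and classify_columns are hypothetical. The only rules taken from the description are that all-NaN columns are set aside as empty and that float columns containing only 0.0 and 1.0 are treated as boolean (and therefore categorical); the rest of the dtype logic is a simplifying assumption.

```python
import math
from typing import Dict, Sequence


def is_missing(value: object) -> bool:
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def classify_columns(table: Dict[str, Sequence[object]]) -> Dict[str, str]:
    """Label each column 'empty', 'categorical', or 'continuous' (hypothetical helper)."""
    labels: Dict[str, str] = {}
    for name, values in table.items():
        present = [v for v in values if not is_missing(v)]
        if not present:
            # Every value is missing: analogous to an all-NaN column.
            labels[name] = "empty"
        elif all(isinstance(v, float) for v in present) and set(present) <= {0.0, 1.0}:
            # Floats restricted to 0.0 and 1.0 are treated as booleans.
            labels[name] = "categorical"
        elif all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
            # Assumption: any other purely numeric column is continuous.
            labels[name] = "continuous"
        else:
            # Strings, booleans, and mixed types fall back to categorical.
            labels[name] = "categorical"
    return labels


example_table = {
    "age": [31, 45, 27],
    "is_member": [1.0, 0.0, float("nan")],
    "notes": [None, float("nan"), None],
    "city": ["Oslo", "Lima", "Oslo"],
    "score": [0.5, 1.0, 0.0],
}
labels = classify_columns(example_table)
print(labels)
```

In this sketch, score stays continuous because it contains 0.5 in addition to 0.0 and 1.0, while is_member is classified as categorical once its missing value is ignored.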

Usage

Call model.fit(data) after creating an ACTGAN instance and before calling sample(). The data argument can be a pandas.DataFrame or a path to a CSV file.

Code Reference

Source Location

  • Repository: gretel-synthetics
  • Files: src/gretel_synthetics/actgan/actgan_wrapper.py (lines 57-141), src/gretel_synthetics/actgan/data_transformer.py (lines 151-194), src/gretel_synthetics/detectors/sdv.py (lines 164-180)

Signature

# actgan_wrapper.py - _ACTGANModel.fit
def fit(self, data: Union[pd.DataFrame, str]) -> None:

# actgan_wrapper.py - ACTGAN.fit (overrides to apply float formatter patch)
def fit(self, *args, **kwargs):

# data_transformer.py - DataTransformer.fit
def fit(
    self, raw_data: DFLike, discrete_columns: Optional[Sequence[str]] = None
) -> None:

# detectors/sdv.py - SDVTableMetadata.fit_datetime
def fit_datetime(
    self,
    data: pd.DataFrame,
    sample_size: Optional[int] = None,
    with_suffix: bool = False,
    must_match_all: bool = False,
) -> None:
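
To give a feel for what the sample_size and must_match_all parameters of fit_datetime might control, here is a minimal sketch of format-based datetime detection using only the standard library. Everything in it is an assumption made for illustration: the candidate format list, the helper names, taking the first sample_size values instead of a random sample, and the rule that the most common matching format wins unless must_match_all rejects the column. The with_suffix parameter is not modeled. The final helper shows the kind of conversion a Unix timestamp encoder performs, interpreting naive values as UTC.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Optional, Sequence

# Hypothetical candidate formats; the real detector may check a different set.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S", "%m/%d/%Y")


def match_format(value: str) -> Optional[str]:
    """Return the first candidate format that parses the value, if any."""
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return fmt
        except ValueError:
            continue
    return None


def detect_datetime_format(
    values: Sequence[str],
    sample_size: Optional[int] = None,
    must_match_all: bool = False,
) -> Optional[str]:
    """Guess a column's datetime format from a sample of its values."""
    # Assumption: take the first sample_size values to keep the sketch deterministic.
    sample = list(values) if sample_size is None else list(values)[:sample_size]
    counts: Counter = Counter()
    for value in sample:
        fmt = match_format(value)
        if fmt is None:
            if must_match_all:
                return None  # a single non-matching value rejects the column
            continue
        counts[fmt] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def to_unix_seconds(value: str, fmt: str) -> float:
    """Convert a datetime string to seconds since the epoch, assuming UTC."""
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc).timestamp()


dates = ["2024-01-15", "2024-02-01", "n/a"]
lenient = detect_datetime_format(dates)
strict = detect_datetime_format(dates, must_match_all=True)
print(lenient, strict)
```

With the lenient setting the stray "n/a" value is ignored and the column is still recognized as a date column; with must_match_all=True the same column is left untouched.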

Import

from gretel_synthetics.actgan.actgan_wrapper import ACTGAN

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | Union[pd.DataFrame, str] | Yes | Training data as a pandas DataFrame or a path string to a CSV file. When a DataFrame is provided, datetime auto-detection and empty column handling are performed. |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| (none) | None | The method modifies the ACTGAN instance in place. After fitting, the internal _model attribute holds a trained ACTGANSynthesizer with a fitted DataTransformer and trained generator/discriminator networks. |

Internal Pipeline Details

The fit pipeline involves six layers spanning four classes (ACTGAN, _ACTGANModel, ACTGANSynthesizer, and DataTransformer):

| Layer | Class | Method | Responsibility |
| --- | --- | --- | --- |
| 1 | ACTGAN | fit(*args, **kwargs) | Applies a patch for a float formatter rounding bug, then delegates to the parent class |
| 2 | _ACTGANModel | fit(data) | Runs datetime and empty-column detection via SDVTableMetadata, converts the DataFrame to ColumnarDF for efficiency, and delegates to SDV BaseTabularModel.fit() |
| 3 | _ACTGANModel | _fit(table_data) | Converts ColumnarDF back to a DataFrame, builds the ACTGANSynthesizer via _build_model(), identifies categorical columns, and calls synthesizer.fit() |
| 4 | ACTGANSynthesizer | fit(train_data, discrete_columns) | Calls _pre_fit_transform(), then _actual_fit() |
| 5 | ACTGANSynthesizer | _pre_fit_transform(train_data, discrete_columns) | Creates the DataTransformer, fits and transforms the data, and sets up activation functions |
| 6 | DataTransformer | fit(raw_data, discrete_columns) | Fits a ClusterBasedNormalizer per continuous column and a one-hot or binary encoder per discrete column |
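
The choice between one-hot and binary encoding in the last layer can be sketched as a simple cardinality check. This is an illustration under stated assumptions, not the library's logic: the helper name choose_discrete_encoding is hypothetical, the comparison is assumed to be "strictly more unique values than binary_encoder_cutoff", and the binary width is taken as the minimal number of bits needed to give each category a distinct code, which may differ from what the actual encoder emits.

```python
from typing import Sequence, Tuple


def choose_discrete_encoding(
    values: Sequence[object], binary_encoder_cutoff: int
) -> Tuple[str, int]:
    """Return (encoding name, number of output columns) for a discrete column."""
    n_categories = len(set(values))
    if n_categories > binary_encoder_cutoff:
        # Assumption: minimal bit width that can distinguish all categories.
        width = max(1, (n_categories - 1).bit_length())
        return "binary", width
    # One output column per category.
    return "one_hot", n_categories


low_cardinality = choose_discrete_encoding(["a", "b", "c", "a"], binary_encoder_cutoff=500)
high_cardinality = choose_discrete_encoding(list(range(1000)), binary_encoder_cutoff=500)
print(low_cardinality, high_cardinality)
```

The point of the threshold is output width: under these assumptions a column with 1000 distinct values needs 1000 one-hot columns but only 10 binary columns, which keeps the transformed training matrix manageable for high-cardinality fields.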

Usage Examples

Basic Example

import pandas as pd
from gretel_synthetics.actgan.actgan_wrapper import ACTGAN

# Load tabular data
data = pd.read_csv("customers.csv")

# Create and fit the model
model = ACTGAN(epochs=100, verbose=True, auto_transform_datetimes=True)
model.fit(data)

Example with Explicit Field Types

import pandas as pd
from gretel_synthetics.actgan.actgan_wrapper import ACTGAN

data = pd.read_csv("transactions.csv")

model = ACTGAN(
    field_types={
        "transaction_date": {"type": "datetime", "format": "%Y-%m-%d"},
        "category": {"type": "categorical"},
    },
    binary_encoder_cutoff=300,
    cbn_sample_size=100_000,
    epochs=200,
    verbose=True,
)
model.fit(data)
