Implementation:Gretelai Gretel synthetics ACTGAN Fit
| Knowledge Sources | |
|---|---|
| Domains | Synthetic_Data, GAN, Tabular_Data |
| Last Updated | 2026-02-14 19:00 GMT |
Overview
Concrete tool for fitting the ACTGAN model to a tabular dataset, including datetime detection, metadata transformation, and data encoding, provided by the gretel-synthetics library.
Description
The ACTGAN.fit(data) method orchestrates a multi-stage data preparation and model fitting pipeline. It performs the following steps in order:
- Datetime auto-detection: If auto_transform_datetimes=True, calls SDVTableMetadata.fit_datetime(data) to scan columns for datetime patterns and automatically configure UnixTimestampEncoder transformers.
- Empty column handling: Calls SDVTableMetadata.fit_empty_columns(data) to identify all-NaN columns and assign an EmptyFieldTransformer.
- SDV metadata transform: Delegates to the SDV BaseTabularModel.fit(), which applies the configured field transformers via the RDT package. For efficiency, the DataFrame is converted to a ColumnarDF representation during this phase to avoid expensive pd.concat operations.
- Categorical column detection: In _fit(), columns are classified as categorical or continuous by inspecting their dtype and value patterns. Float columns containing only 0.0 and 1.0 are treated as boolean.
- DataTransformer.fit(): Fits a ClusterBasedNormalizer (Bayesian GMM) for each continuous column and a OneHotEncoder or BinaryEncodingTransformer for each discrete column, depending on the binary_encoder_cutoff threshold.
- Pre-fit transform: Calls ACTGANSynthesizer._pre_fit_transform(), which runs DataTransformer.fit(), transforms the data into a decoded TrainData representation, and sets up activation functions and conditional loss column ranges.
- GAN training: Calls ACTGANSynthesizer._actual_fit() to run the adversarial training loop.
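The column classification and empty-column steps above can be illustrated with a small, library-free sketch. It is not the gretel-synthetics code: `classify_column` and `is_missing` are hypothetical helpers, and the rules (strings and booleans are categorical, floats holding only 0.0 and 1.0 are treated as boolean, everything else is continuous) are a simplified reading of the description above.

```python
import math
from typing import Any, Sequence


def is_missing(value: Any) -> bool:
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def classify_column(values: Sequence[Any]) -> str:
    """Return 'empty', 'categorical', or 'continuous' for one column.

    Hypothetical helper: a simplified stand-in for the dtype inspection
    described above, not the library's implementation.
    """
    present = [v for v in values if not is_missing(v)]
    if not present:
        # All-NaN column: the real pipeline assigns an EmptyFieldTransformer.
        return "empty"
    # Float columns containing only 0.0 and 1.0 are modelled as boolean.
    if all(isinstance(v, float) for v in present) and set(present) == {0.0, 1.0}:
        return "categorical"
    # Strings and booleans are discrete; other numeric values are continuous.
    if all(isinstance(v, (bool, str)) for v in present):
        return "categorical"
    return "continuous"


columns = {
    "is_active": [0.0, 1.0, 1.0, float("nan")],
    "balance": [10.5, 99.0, 3.25],
    "segment": ["a", "b", "a"],
    "unused": [None, float("nan")],
}
for name, column_values in columns.items():
    print(name, classify_column(column_values))
```

Columns reported as categorical here correspond to the discrete_columns passed on to DataTransformer.fit(); the continuous ones would each receive a ClusterBasedNormalizer.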
Usage
Call model.fit(data) after creating an ACTGAN instance and before calling sample(). The data argument can be a pandas.DataFrame or a path to a CSV file.
Code Reference
Source Location
- Repository: gretel-synthetics
- File: src/gretel_synthetics/actgan/actgan_wrapper.py (lines 57-141), src/gretel_synthetics/actgan/data_transformer.py (lines 151-194), src/gretel_synthetics/detectors/sdv.py (lines 164-180)
Signature
# actgan_wrapper.py - _ACTGANModel.fit
def fit(self, data: Union[pd.DataFrame, str]) -> None:

# actgan_wrapper.py - ACTGAN.fit (overrides to apply float formatter patch)
def fit(self, *args, **kwargs):

# data_transformer.py - DataTransformer.fit
def fit(
    self, raw_data: DFLike, discrete_columns: Optional[Sequence[str]] = None
) -> None:

# detectors/sdv.py - SDVTableMetadata.fit_datetime
def fit_datetime(
    self,
    data: pd.DataFrame,
    sample_size: Optional[int] = None,
    with_suffix: bool = False,
    must_match_all: bool = False,
) -> None:
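To make the sample_size and must_match_all parameters of fit_datetime more concrete, here is a rough standard-library sketch of format detection on a single column. It is an illustration under assumptions, not the detector shipped in gretel-synthetics: `CANDIDATE_FORMATS`, `detect_format`, and `to_unix_seconds` are hypothetical names, the sample is simply the first sample_size values, and with must_match_all=False the sketch accepts the format that parses the most sampled values.

```python
from datetime import datetime, timezone
from typing import Optional, Sequence

# Hypothetical candidate list; the real detector's format set is not shown here.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y-%m-%dT%H:%M:%S")


def _parses(value: object, fmt: str) -> bool:
    """Return True if value is a string that strptime accepts for fmt."""
    try:
        datetime.strptime(value, fmt)  # type: ignore[arg-type]
        return True
    except (ValueError, TypeError):
        return False


def detect_format(
    values: Sequence[object],
    sample_size: Optional[int] = None,
    must_match_all: bool = False,
) -> Optional[str]:
    """Pick the candidate format that parses the most sampled values."""
    sample = list(values)[:sample_size] if sample_size else list(values)
    best_fmt, best_hits = None, 0
    for fmt in CANDIDATE_FORMATS:
        hits = sum(_parses(v, fmt) for v in sample)
        if hits > best_hits:
            best_fmt, best_hits = fmt, hits
    # Assumption: must_match_all rejects a column with any non-matching value.
    if must_match_all and best_hits < len(sample):
        return None
    return best_fmt


def to_unix_seconds(value: str, fmt: str) -> float:
    """Encode a datetime string as seconds since the epoch (UTC assumed)."""
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc).timestamp()


column = ["2024-01-02", "2024-03-04", "n/a"]
fmt = detect_format(column)
print(fmt, detect_format(column, must_match_all=True))
```

A column whose format is detected this way would then be encoded numerically, which is the role the UnixTimestampEncoder plays in the pipeline described earlier.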
Import
from gretel_synthetics.actgan.actgan_wrapper import ACTGAN
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| data | Union[pd.DataFrame, str] | Yes | Training data as a pandas DataFrame or a path string to a CSV file. When a DataFrame is provided, datetime auto-detection and empty column handling are performed. |
Outputs
| Name | Type | Description |
|---|---|---|
| (none) | None | The method modifies the ACTGAN instance in place. After fitting, the internal _model attribute holds a trained ACTGANSynthesizer with a fitted DataTransformer and trained generator/discriminator networks. |
Internal Pipeline Details
The fit pipeline passes through six method calls across four classes:
| Layer | Class | Method | Responsibility |
|---|---|---|---|
| 1 | ACTGAN | fit(*args, **kwargs) | Applies float formatter rounding bug patch, delegates to parent |
| 2 | _ACTGANModel | fit(data) | Runs datetime/empty-column detection via SDVTableMetadata, converts DataFrame to ColumnarDF for efficiency, delegates to SDV BaseTabularModel.fit() |
| 3 | _ACTGANModel | _fit(table_data) | Converts ColumnarDF back to DataFrame, builds ACTGANSynthesizer via _build_model(), identifies categorical columns, calls synthesizer.fit() |
| 4 | ACTGANSynthesizer | fit(train_data, discrete_columns) | Calls _pre_fit_transform() then _actual_fit() |
| 5 | ACTGANSynthesizer | _pre_fit_transform(train_data, discrete_columns) | Creates DataTransformer, fits and transforms data, sets up activation functions |
| 6 | DataTransformer | fit(raw_data, discrete_columns) | Fits ClusterBasedNormalizer per continuous column and OHE/Binary encoder per discrete column |
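The delegation order in the table can be traced with stub classes. This is only a sketch of the call chain: every class below is a hypothetical stand-in named after the real one, the method bodies do nothing except record that they were called, and the inheritance (ACTGAN over _ACTGANModel over the SDV base model) is inferred from the "delegates to parent" wording in the table.

```python
from typing import List


def trace_fit() -> List[str]:
    """Run the stub pipeline once and return the order of calls."""
    calls: List[str] = []

    class DataTransformerSketch:
        def fit(self, raw_data, discrete_columns=None):
            calls.append("DataTransformer.fit")

    class SynthesizerSketch:
        def fit(self, train_data, discrete_columns):
            calls.append("ACTGANSynthesizer.fit")
            self._pre_fit_transform(train_data, discrete_columns)
            self._actual_fit()

        def _pre_fit_transform(self, train_data, discrete_columns):
            calls.append("ACTGANSynthesizer._pre_fit_transform")
            DataTransformerSketch().fit(train_data, discrete_columns)

        def _actual_fit(self):
            calls.append("ACTGANSynthesizer._actual_fit")

    class BaseTabularModelSketch:
        def fit(self, data):
            # Stand-in for the SDV base class, which calls back into _fit().
            calls.append("BaseTabularModel.fit")
            self._fit(data)

    class ModelSketch(BaseTabularModelSketch):
        def fit(self, data):
            calls.append("_ACTGANModel.fit")
            super().fit(data)

        def _fit(self, table_data):
            calls.append("_ACTGANModel._fit")
            SynthesizerSketch().fit(table_data, discrete_columns=[])

    class ACTGANSketch(ModelSketch):
        def fit(self, *args, **kwargs):
            calls.append("ACTGAN.fit")
            return super().fit(*args, **kwargs)

    ACTGANSketch().fit([{"a": 1}])
    return calls


for step in trace_fit():
    print(step)
```

The trace makes one point explicit: DataTransformer.fit() runs inside _pre_fit_transform(), before the adversarial loop in _actual_fit() starts.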
Usage Examples
Basic Example
import pandas as pd
from gretel_synthetics.actgan.actgan_wrapper import ACTGAN
# Load tabular data
data = pd.read_csv("customers.csv")
# Create and fit the model
model = ACTGAN(epochs=100, verbose=True, auto_transform_datetimes=True)
model.fit(data)
Example with Explicit Field Types
import pandas as pd
from gretel_synthetics.actgan.actgan_wrapper import ACTGAN
data = pd.read_csv("transactions.csv")
model = ACTGAN(
    field_types={
        "transaction_date": {"type": "datetime", "format": "%Y-%m-%d"},
        "category": {"type": "categorical"},
    },
    binary_encoder_cutoff=300,
    cbn_sample_size=100_000,
    epochs=200,
    verbose=True,
)
model.fit(data)
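The effect of binary_encoder_cutoff in the example above can be estimated with a short calculation. The sketch below assumes a discrete column switches from one-hot to binary encoding once its number of distinct values exceeds the cutoff; whether the library's comparison is strict, and the exact number of bits its binary encoder emits, are assumptions. `choose_encoder` is a hypothetical helper, not part of gretel-synthetics.

```python
from typing import Tuple


def choose_encoder(n_unique: int, binary_encoder_cutoff: int) -> Tuple[str, int]:
    """Return the assumed encoder name and the resulting column width.

    One-hot encoding needs one output column per distinct value, while a
    binary code needs only enough bits to number the distinct values.
    """
    if n_unique > binary_encoder_cutoff:
        # Bits needed to represent ordinal codes 0 .. n_unique - 1.
        width = max(1, (n_unique - 1).bit_length())
        return "binary", width
    return "one_hot", n_unique


for n_unique in (10, 300, 500, 50_000):
    encoder, width = choose_encoder(n_unique, binary_encoder_cutoff=300)
    print(f"{n_unique} distinct values: {encoder}, {width} columns")
```

Under these assumptions a high-cardinality column shrinks from tens of thousands of one-hot columns to a handful of binary ones, which is the motivation for raising or lowering the cutoff.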