Principle:Sdv dev SDV Synthesizer Persistence
| Knowledge Sources | |
|---|---|
| Domains | Serialization, Synthetic_Data |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
A serialization mechanism that saves fitted synthesizer state to disk and restores it for later sampling without re-training.
Description
Synthesizer persistence saves a fitted synthesizer to a file and loads it back later, so additional synthetic data can be generated without re-fitting the model. The mechanism serializes with cloudpickle, which handles complex Python objects including trained models, data processors, and metadata.
Usage
Use save/load when you need to persist a fitted synthesizer for later use, share it across environments, or deploy it in production pipelines.
Theoretical Basis
The persistence mechanism follows standard Python object serialization:
- Save: The entire synthesizer object (model weights, preprocessor state, metadata) is serialized via cloudpickle.dump to a binary file.
- Load: The file is deserialized via cloudpickle.load, with version compatibility checks and synthesizer type validation.
- Version safety: The loaded synthesizer is checked against the current SDV version to warn about potential incompatibilities.
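The save, load, and version-check steps above can be sketched with a toy example. This is a minimal illustration under stated assumptions, not SDV's implementation: it uses the standard-library pickle module as a stand-in for cloudpickle (whose dump and load take the same arguments), and it stores a plain dictionary of fitted state rather than a full synthesizer object. The names save_synthesizer, load_synthesizer, CURRENT_VERSION, and the payload keys are hypothetical.

```python
import pickle  # stand-in for cloudpickle in this sketch
import tempfile
import warnings
from pathlib import Path

CURRENT_VERSION = "1.2.0"  # hypothetical library version, for illustration only


def save_synthesizer(payload, filepath):
    """Serialize the fitted state to a binary file."""
    with open(filepath, "wb") as output:
        pickle.dump(payload, output)


def load_synthesizer(filepath, expected_type, current_version=CURRENT_VERSION):
    """Deserialize the file, validate its type, and warn on a version mismatch."""
    with open(filepath, "rb") as source:
        payload = pickle.load(source)

    # Type validation: refuse a file that was saved by a different synthesizer type.
    if payload["synthesizer_type"] != expected_type:
        raise TypeError(
            f"Expected {expected_type}, found {payload['synthesizer_type']}."
        )

    # Version safety: a mismatch only warns, since the file may still be usable.
    if payload["fitted_version"] != current_version:
        warnings.warn(
            f"Saved with version {payload['fitted_version']}, "
            f"loaded with version {current_version}.",
            UserWarning,
        )
    return payload


def demo_roundtrip():
    """Save a toy fitted state, load it back, and return the restored payload."""
    fitted = {
        "synthesizer_type": "ToySynthesizer",  # hypothetical type name
        "fitted_version": CURRENT_VERSION,
        "parameters": {"mean": 2.0},           # stands in for learned model state
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "synthesizer.pkl"
        save_synthesizer(fitted, path)
        return load_synthesizer(path, expected_type="ToySynthesizer")
```

In this sketch a type mismatch raises an error while a version mismatch only warns, which mirrors the distinction in the list above between validation and a compatibility warning. As with any pickle-based format, such files should only be loaded from trusted sources, because deserialization can execute arbitrary code.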