Heuristic:Sdv dev SDV Version Compatibility
| Knowledge Sources | |
|---|---|
| Domains | Debugging, Synthetic_Data |
| Last Updated | 2026-02-14 19:00 GMT |
Overview
Always retrain synthesizers when upgrading SDV versions; loading synthesizers saved with a different SDV version may produce warnings or errors.
Description
SDV records the library version used to fit each synthesizer. When a synthesizer is loaded via `.load()`, SDV compares the installed version against the fitted version. If the installed version is older than the fitted version, a `VersionError` is raised (downgrading is not supported). If the versions differ in any other way, an `SDVVersionWarning` is issued recommending retraining. SDV Enterprise versions are tracked separately (`_fitted_sdv_enterprise_version`), and a mismatch in either the community or the enterprise version triggers the warning.
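The load-time rule described above can be sketched in a few lines. This is a minimal illustration, not SDV's implementation: `check_version_on_load` and `parse_version` are hypothetical helpers, the two classes are local stand-ins for the ones of the same name in `sdv/errors.py`, and versions are assumed to be plain `X.Y.Z` strings with no pre-release suffixes.

```python
import warnings


class SDVVersionWarning(UserWarning):
    """Local stand-in for the warning class of the same name in sdv.errors."""


class VersionError(ValueError):
    """Local stand-in for the error class of the same name in sdv.errors."""


def parse_version(text):
    """Turn 'X.Y.Z' into a tuple of ints so that 1.21.0 sorts after 1.9.0."""
    return tuple(int(part) for part in text.split('.'))


def check_version_on_load(current, fitted):
    """Hypothetical helper: raise on downgrade, warn on any other mismatch."""
    if parse_version(current) < parse_version(fitted):
        # The installed library is older than the one used for fitting.
        raise VersionError(
            f'Synthesizer was fit on version {fitted}, '
            f'but the installed version is {current}.'
        )
    if current != fitted:
        # Loading still works, but retraining is recommended.
        warnings.warn(
            f'Synthesizer was fit on version {fitted}; '
            f'the installed version is {current}. Consider retraining.',
            SDVVersionWarning,
        )
```

Comparing parsed tuples rather than raw strings matters here: as strings, `'1.9.0'` sorts after `'1.21.0'`, which would misclassify an upgrade as a downgrade.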
Usage
Apply this heuristic whenever upgrading or downgrading the SDV library. If you have saved synthesizers from a previous version:
- Upgrading: Synthesizers still load, but with a version mismatch warning. Retrain to get the latest bug fixes and features.
- Downgrading: Synthesizers from a newer version cannot be loaded in an older version (raises `VersionError`).
The Insight (Rule of Thumb)
- Action: After upgrading SDV, retrain every synthesizer: create a new synthesizer and call `.fit()` on the original data, as the warning message advises. Do not rely on `.load()` for production synthesizers across version changes.
- Value: Retraining ensures you benefit from bug fixes, performance improvements, and new features in the upgraded version.
- Trade-off: Retraining time grows with dataset size. For large datasets with expensive training, consider maintaining version-locked environments instead.
- Recommendation: Always save metadata using `Metadata.save_to_json()` for replicability across SDV versions. Metadata format is more stable than pickled synthesizer format.
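One way to enforce the action above in a production pipeline is to treat the version mismatch warning as a hard failure, so that a stale synthesizer is never used silently. The sketch below relies only on the standard `warnings` module. `load_strict` and `fake_load` are hypothetical names, and `SDVVersionWarning` is a local stand-in for the class in `sdv/errors.py`.

```python
import warnings


class SDVVersionWarning(UserWarning):
    """Local stand-in; with SDV installed, use the class from sdv.errors."""


def load_strict(loader, filepath, warning_category=SDVVersionWarning):
    """Call loader(filepath), turning a version mismatch warning into an exception."""
    with warnings.catch_warnings():
        # Inside this block, warnings of this category are raised as exceptions.
        warnings.simplefilter('error', category=warning_category)
        return loader(filepath)


def fake_load(filepath):
    """Hypothetical loader that behaves like a load with mismatched versions."""
    warnings.warn('Synthesizer was fit on a different version.', SDVVersionWarning)
    return object()


try:
    load_strict(fake_load, 'synthesizer.pkl')
    status = 'loaded'
except SDVVersionWarning:
    # The pipeline can now trigger retraining instead of sampling.
    status = 'retrain required'
```

With SDV installed, `loader` would presumably be a synthesizer class's `.load` method and `warning_category` the real warning class. The filter is restored when the `with` block exits, so other warnings in the process are unaffected.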
Reasoning
SDV uses `cloudpickle` for serialization, which captures the full Python object graph including internal state, model weights, and library references. Changes to internal APIs between versions can silently break deserialized objects. The version check prevents subtle data corruption from using a synthesizer whose internal state is incompatible with the current codebase.
The warning message specifically states: "The latest bug fixes and features may not be available for this synthesizer. To see these enhancements, create and train a new synthesizer on this version."
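The failure mode described in this section can be reproduced without SDV. The sketch below simulates a class whose internals changed between two releases: it pickles only the instance state, then restores that state onto the newer class without running `__init__`, which is how default unpickling rebuilds an instance. `ModelV1` and `ModelV2` are hypothetical and unrelated to SDV's actual classes.

```python
import pickle


class ModelV1:
    """Class layout at the time the object was saved."""

    def __init__(self):
        self.scale = 2.0


class ModelV2:
    """The same class after an upgrade: it now expects an extra attribute."""

    def __init__(self):
        self.scale = 2.0
        self.offset = 1.0

    def transform(self, value):
        return value * self.scale + self.offset


# Serialize the instance state, which is what a pickle of the object carries.
saved_state = pickle.dumps(vars(ModelV1()))

# Rebuild the object on the new class without calling __init__.
restored = ModelV2.__new__(ModelV2)
restored.__dict__.update(pickle.loads(saved_state))

try:
    restored.transform(3.0)
    outcome = 'ok'
except AttributeError as error:
    # The old state has no 'offset', so the new method cannot run.
    outcome = f'broken: {error}'
```

This particular mismatch fails loudly with an `AttributeError`. A renamed attribute with a changed meaning, or a changed default, would not raise at all, which is the silent case the version check is meant to flag.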
Code Evidence
Version mismatch check from `sdv/_utils.py:262-310`:
def check_sdv_versions_and_warn(synthesizer):
    current_community_version = getattr(version, 'community', None)
    current_enterprise_version = getattr(version, 'enterprise', None)
    if getattr(synthesizer, '_fitted', False):
        fitted_community_version = getattr(synthesizer, '_fitted_sdv_version', None)
        fitted_enterprise_version = getattr(synthesizer, '_fitted_sdv_enterprise_version', None)
        community_mismatch = current_community_version != fitted_community_version
        enterprise_mismatch = current_enterprise_version != fitted_enterprise_version
        if community_mismatch or enterprise_mismatch:
            # Construction of `message` and `static_message` is omitted from this excerpt.
            message = f'{message} {static_message}'
            warnings.warn(message, SDVVersionWarning)
Metadata save recommendation from `sdv/single_table/base.py:134-137`:
warnings.warn(
    "We strongly recommend saving the metadata using 'save_to_json' for "
    'replicability in future SDV versions.',
    FutureWarning,
)
Version warning and error classes from `sdv/errors.py:69-82`:
class SDVVersionWarning(UserWarning):
    """Warning to be raised if there is a version mismatch."""


class VersionError(ValueError):
    """Raised when loading a synthesizer from a newer version into an older one."""