Environment:Sdv dev SDV Python Runtime
| Knowledge Sources | |
|---|---|
| Domains | Synthetic_Data, Infrastructure |
| Last Updated | 2026-02-14 19:00 GMT |
Overview
Python 3.9–3.14 environment with pandas, numpy, copulas, ctgan, deepecho, rdt, sdmetrics, and supporting libraries for synthetic data generation.
Description
This environment provides the full runtime context for the SDV (Synthetic Data Vault) library. It is a CPU-based Python environment by default, with optional GPU acceleration for GAN-based synthesizers (CTGAN, CopulaGAN). The dependency matrix is Python-version-aware: different minimum versions of numpy, pandas, copulas, ctgan, deepecho, rdt, and sdmetrics are required depending on the Python interpreter version. System-level packages (graphviz, pandoc) are needed for metadata visualization and documentation generation.
Usage
Use this environment for all SDV workflows: single-table synthesis, multi-table synthesis, sequential data synthesis, constrained synthesis, and data quality evaluation. Every Implementation page in this wiki requires this environment as the base runtime.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, Windows | Cross-platform; Linux recommended for production |
| Python | >= 3.9, < 3.15 | Supports 3.9, 3.10, 3.11, 3.12, 3.13, 3.14 |
| Disk | 500 MB+ | For package installation and model caching |
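The Python range in the table can be checked before installing anything. The following is a minimal sketch using only the standard library; the helper name `is_supported_python` is illustrative and is not part of SDV.

```python
import sys

# Bounds taken from the requirements table: >= 3.9 and < 3.15.
MIN_PYTHON = (3, 9)
MAX_PYTHON_EXCLUSIVE = (3, 15)


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair lies inside the supported range.

    `version_info` defaults to the running interpreter; any tuple whose first
    two items are (major, minor) is accepted, which keeps the check testable.
    """
    info = sys.version_info if version_info is None else version_info
    major_minor = (info[0], info[1])
    return MIN_PYTHON <= major_minor < MAX_PYTHON_EXCLUSIVE


# Report on the interpreter that is running this snippet.
status = "supported" if is_supported_python() else "outside the supported range"
print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status}")
```

Comparing `(major, minor)` tuples rather than version strings avoids the lexical trap where `"3.10"` sorts before `"3.9"`.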
Dependencies
System Packages
- `graphviz` — Required for metadata visualization (graph rendering)
- `pandoc` — Required for documentation generation
Python Packages (Core)
- `boto3` >= 1.28, < 2.0.0
- `botocore` >= 1.31, < 2.0.0
- `cloudpickle` >= 2.1.0 (Python < 3.14) or >= 3.1.1 (Python >= 3.14)
- `graphviz` >= 0.13.2
- `numpy` >= 1.22.2 (Python 3.9) / >= 1.24.0 (3.10–3.11) / >= 1.26.0 (3.12) / >= 2.1.0 (3.13) / >= 2.3.2 (3.14)
- `pandas` >= 1.4.0 (Python < 3.11) / >= 1.5.0 (3.11) / >= 2.1.1 (3.12) / >= 2.2.3 (3.13) / >= 2.3.3 (3.14), < 3
- `tqdm` >= 4.29
- `copulas` >= 0.12.1 (Python < 3.14) or >= 0.14.0 (Python >= 3.14)
- `ctgan` >= 0.11.1 (Python < 3.14) or >= 0.12.0 (Python >= 3.14)
- `deepecho` >= 0.7.0 (Python < 3.14) or >= 0.8.0 (Python >= 3.14)
- `rdt` >= 1.18.2 (Python < 3.14) or >= 1.20.0 (Python >= 3.14)
- `sdmetrics` >= 0.21.0 (Python < 3.14) or >= 0.26.0 (Python >= 3.14)
- `platformdirs` >= 4.0
- `pyyaml` >= 6.0.1
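The version-dependent lower bounds listed above can be turned into a small lookup and compared with what is actually installed. This is a sketch under stated assumptions: the `FLOORS` table copies only three packages from the list, the helper names are hypothetical, the version parser looks only at the leading digits of each dotted segment, and upper bounds such as pandas `< 3` are not checked.

```python
import re
import sys
from importlib import metadata

# Lower bounds copied from the dependency list. Each entry is
# (first Python (major, minor) where the bound applies, minimum package version).
FLOORS = {
    "numpy": [((3, 9), "1.22.2"), ((3, 10), "1.24.0"), ((3, 12), "1.26.0"),
              ((3, 13), "2.1.0"), ((3, 14), "2.3.2")],
    "pandas": [((3, 9), "1.4.0"), ((3, 11), "1.5.0"), ((3, 12), "2.1.1"),
               ((3, 13), "2.2.3"), ((3, 14), "2.3.3")],
    "rdt": [((3, 9), "1.18.2"), ((3, 14), "1.20.0")],
}


def floor_for(package, python_version):
    """Return the minimum version of `package` for a (major, minor) interpreter."""
    floor = None
    for starts_at, minimum in FLOORS[package]:
        if tuple(python_version[:2]) >= starts_at:
            floor = minimum  # later entries override earlier ones
    if floor is None:
        raise ValueError(f"Python {python_version[0]}.{python_version[1]} is below 3.9")
    return floor


def release_tuple(version):
    """Turn '2.1.0rc1' into (2, 1, 0); stops at the first non-numeric segment."""
    parts = []
    for segment in version.split("."):
        match = re.match(r"\d+", segment)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_floor(installed, minimum):
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_installed(package, python_version=None):
    """Return 'ok', 'too old', or 'missing' for one package in FLOORS."""
    py = sys.version_info if python_version is None else python_version
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "missing"
    return "ok" if meets_floor(installed, floor_for(package, py)) else "too old"


# Only report when the interpreter itself is at least 3.9.
if sys.version_info[:2] >= (3, 9):
    for name in FLOORS:
        print(name, check_installed(name))
```

In practice `pip` resolves these bounds automatically; a check like this is mainly useful for diagnosing an environment that was assembled by hand.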
Python Packages (Optional)
- `pomegranate` >= 0.15, < 1 — For Bayesian network distributions
- `pandas[excel]` — For Excel I/O support
Credentials
No credentials are required for core SDV functionality. However:
- AWS credentials (via `boto3`): Only needed if loading demo datasets from non-public S3 buckets. The `download_demo` function uses boto3 for S3 access but connects to a public bucket by default.
Quick Install
# Install SDV with all core dependencies
pip install sdv
# Install with Excel support
pip install "sdv[excel]"
# Install with Bayesian network support
pip install "sdv[pomegranate]"
# System packages (Ubuntu/Debian)
sudo apt-get install graphviz pandoc
Code Evidence
Python version constraint from `pyproject.toml:22`:
requires-python = '>=3.9,<3.15'
Python-version-conditional numpy dependency from `pyproject.toml:30-34`:
"numpy>=1.22.2;python_version<'3.10'",
"numpy>=1.24.0;python_version>='3.10' and python_version<'3.12'",
"numpy>=1.26.0;python_version>='3.12' and python_version<'3.13'",
"numpy>=2.1.0;python_version>='3.13' and python_version<'3.14'",
"numpy>=2.3.2;python_version>='3.14'",
Optional CTGAN import handling from `sdv/single_table/ctgan.py:15-23`:
try:
    from ctgan import CTGAN, TVAE
    from ctgan.synthesizers._utils import get_enable_gpu_value
    import_error = None
except ModuleNotFoundError as e:
    CTGAN = None
    TVAE = None
    import_error = e
Optional deepecho import handling from `sdv/sequential/par.py:28-36`:
try:
    from deepecho import PARModel
    from deepecho.sequences import assemble_sequences
    import_error = None
except ModuleNotFoundError as e:
    PARModel = None
    assemble_sequences = None
    import_error = e
Custom ModuleNotFoundError from `sdv/utils/mixins.py:4-12`:
class MissingModuleMixin:
    @classmethod
    def raise_module_not_found_error(cls, error):
        raise ModuleNotFoundError(
            f"{error.msg}. Please install {error.name} in order to use the '{cls.__name__}'."
        )
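The guarded import and the mixin fit together roughly as follows. This is a self-contained sketch, not SDV's code: the package name `optional_backend_that_is_not_installed` and the class `DemoSynthesizer` are hypothetical, and raising from `__init__` is an assumption about where the check runs.

```python
# Guarded import: remember the failure instead of crashing at import time.
# The package name is hypothetical and chosen so that the import fails.
try:
    import optional_backend_that_is_not_installed as backend
    import_error = None
except ModuleNotFoundError as e:
    backend = None
    import_error = e


class MissingModuleMixin:
    """Same shape as the mixin quoted above."""

    @classmethod
    def raise_module_not_found_error(cls, error):
        raise ModuleNotFoundError(
            f"{error.msg}. Please install {error.name} in order to use the '{cls.__name__}'."
        )


class DemoSynthesizer(MissingModuleMixin):
    """Hypothetical class that needs the optional backend."""

    def __init__(self):
        # Fail only when the feature is actually used (assumed placement).
        if backend is None:
            self.raise_module_not_found_error(import_error)


# Importing this module succeeds; constructing the class reports what is missing.
try:
    DemoSynthesizer()
except ModuleNotFoundError as exc:
    print(exc)
```

Deferring the error this way lets the rest of a package import cleanly, and the message names both the missing package and the class that needs it.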
System packages from `apt.txt:1-3`:
# apt-get requirements for development and mybinder environment
graphviz
pandoc
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ModuleNotFoundError: No module named 'ctgan'. Please install ctgan in order to use the 'CTGANSynthesizer'.` | ctgan package not installed | `pip install ctgan` or `pip install sdv` |
| `ModuleNotFoundError: No module named 'deepecho'. Please install deepecho in order to use the 'PARSynthesizer'.` | deepecho package not installed | `pip install deepecho` or `pip install sdv` |
| `VersionError` when loading a saved synthesizer | Current SDV version is older than the version that created the synthesizer | Upgrade SDV to the version shown in the error message |
| `SDVVersionWarning` on load | The current SDV version differs from the version used to fit the synthesizer | Refit the synthesizer with the current version to get the latest features |
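The two version-related rows can be read as one comparison between the running version and the version stored with the synthesizer. The sketch below only illustrates that logic as the table describes it. `VersionError` and `SDVVersionWarning` are local stand-ins defined here, not imports from SDV, and `check_saved_version` is a hypothetical helper that assumes plain `X.Y.Z` version strings.

```python
import warnings


class VersionError(Exception):
    """Local stand-in for illustration; not imported from SDV."""


class SDVVersionWarning(UserWarning):
    """Local stand-in for illustration; not imported from SDV."""


def _as_tuple(version):
    # Assumes plain numeric 'X.Y.Z' strings with no pre-release suffix.
    return tuple(int(part) for part in version.split("."))


def check_saved_version(current, fitted):
    """Raise if `current` is older than `fitted`; warn if they merely differ."""
    if _as_tuple(current) < _as_tuple(fitted):
        raise VersionError(
            f"Synthesizer was fitted with version {fitted}, "
            f"but version {current} is installed. Upgrade to {fitted} or later."
        )
    if current != fitted:
        warnings.warn(
            f"Synthesizer was fitted with version {fitted}; "
            f"running version is {current}.",
            SDVVersionWarning,
        )


# Older runtime than the saved synthesizer: an error.
try:
    check_saved_version(current="1.10.0", fitted="1.12.0")
except VersionError as exc:
    print("error:", exc)

# Newer runtime than the saved synthesizer: only a warning.
check_saved_version(current="1.12.0", fitted="1.10.0")
```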
Compatibility Notes
- Python 3.14: Requires newer versions of all SDV ecosystem packages (copulas >= 0.14.0, ctgan >= 0.12.0, deepecho >= 0.8.0, rdt >= 1.20.0, sdmetrics >= 0.26.0, cloudpickle >= 3.1.1).
- Python 3.9: Minimum supported version. Uses older numpy (>= 1.22.2) and pandas (>= 1.4.0).
- Windows: Fully supported, but the graphviz system package may require manual installation.
- re module: SDV handles Python version differences in regex internals via fallback import (`re._parser` → `sre_parse`).
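The `re._parser` to `sre_parse` fallback mentioned in the last note can be sketched as follows. This shows the general import pattern only; how SDV uses the parser afterwards is not covered here, and the sample pattern is illustrative.

```python
try:
    # Python 3.11+ keeps the regex parser as a private submodule of `re`.
    from re import _parser as sre_parse
except ImportError:
    # Older interpreters (3.9, 3.10) ship it as the top-level `sre_parse` module.
    import sre_parse

# Both modules expose the same `parse` function, so the rest of the code
# does not need to know which import succeeded.
parsed = sre_parse.parse("ab")

# Each item is an (opcode, argument) pair; for a literal the argument
# is the character's code point.
for opcode, argument in parsed:
    print(opcode.name, argument)
```

Trying the newer location first avoids the `DeprecationWarning` that importing `sre_parse` directly emits on Python 3.11 and later.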
Related Pages
- Implementation:Sdv_dev_SDV_Download_Demo
- Implementation:Sdv_dev_SDV_Metadata_Detect_From_Dataframes
- Implementation:Sdv_dev_SDV_Metadata_Load_From_Json
- Implementation:Sdv_dev_SDV_GaussianCopulaSynthesizer_Init
- Implementation:Sdv_dev_SDV_CTGANSynthesizer_Init
- Implementation:Sdv_dev_SDV_BaseSynthesizer_Fit
- Implementation:Sdv_dev_SDV_BaseSingleTableSynthesizer_Sample
- Implementation:Sdv_dev_SDV_BaseSynthesizer_Save_Load
- Implementation:Sdv_dev_SDV_HMASynthesizer_Init
- Implementation:Sdv_dev_SDV_BaseMultiTableSynthesizer_Fit
- Implementation:Sdv_dev_SDV_BaseMultiTableSynthesizer_Sample
- Implementation:Sdv_dev_SDV_PARSynthesizer_Init
- Implementation:Sdv_dev_SDV_PARSynthesizer_Fit
- Implementation:Sdv_dev_SDV_PARSynthesizer_Sample
- Implementation:Sdv_dev_SDV_Simplify_Schema
- Implementation:Sdv_dev_SDV_Inequality_Init
- Implementation:Sdv_dev_SDV_FixedCombinations_Init
- Implementation:Sdv_dev_SDV_Range_Init
- Implementation:Sdv_dev_SDV_ProgrammableConstraint_Interface
- Implementation:Sdv_dev_SDV_BaseSynthesizer_Add_Constraints
- Implementation:Sdv_dev_SDV_BaseConstraint_Is_Valid
- Implementation:Sdv_dev_SDV_Evaluate_Quality_Single_Table
- Implementation:Sdv_dev_SDV_Evaluate_Quality_Multi_Table
- Implementation:Sdv_dev_SDV_Run_Diagnostic
- Implementation:Sdv_dev_SDV_Get_Column_Plot
- Implementation:Sdv_dev_SDV_Get_Column_Pair_Plot
- Implementation:Sdv_dev_SDV_Get_Cardinality_Plot