Implementation:Sdv dev SDV Download Demo

Knowledge Sources	SDV SDV Documentation
Domains	Data_Science, Synthetic_Data
Last Updated	2026-02-14 00:00 GMT

Overview

A concrete tool, provided by the SDV library, for downloading demo datasets from the SDV public S3 bucket.

Description

The download_demo function fetches pre-curated datasets from the SDV public S3 bucket. It downloads both the raw data files and the metadata definition, and returns them as a (data, metadata) tuple. The function supports three modalities: single_table (data is a single DataFrame), multi_table (data is a dictionary mapping table names to DataFrames), and sequential (data is a single DataFrame that includes the sequence key columns).
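Because the type of the returned data depends on the modality, downstream code often has to branch on it. The sketch below shows one way to normalize both shapes into a single mapping. The helper names as_table_dict and row_counts, and the default table name "table", are hypothetical and not part of SDV.

```python
def as_table_dict(data, default_name="table"):
    """Return a {table_name: table} mapping for data of any modality.

    Hypothetical helper, not part of SDV.
    """
    if isinstance(data, dict):
        # multi_table: already maps table names to DataFrames
        return dict(data)
    # single_table / sequential: wrap the lone DataFrame under a placeholder name
    return {default_name: data}


def row_counts(data):
    """Count rows per table, whatever the modality."""
    return {name: len(table) for name, table in as_table_dict(data).items()}
```

Calling row_counts(data) on the first element of the tuple returned by download_demo should then work the same way for all three modalities.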

Usage

Import this function when you need sample data for testing or prototyping an SDV synthesis pipeline. It is the standard entry point for all SDV demo workflows and tutorials.
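As an illustration of such a prototype, the following minimal sketch downloads a single-table demo and fits a synthesizer on it. The wrapper function prototype_single_table and its num_rows guard are hypothetical. The choice of GaussianCopulaSynthesizer is an assumption made for the example, and running it requires SDV to be installed and network access to the bucket.

```python
def prototype_single_table(dataset_name, num_rows=5):
    """Download a single-table demo and sample synthetic rows from it.

    Hypothetical wrapper for illustration; not part of SDV.
    """
    if num_rows <= 0:
        raise ValueError("num_rows must be a positive integer")

    # Imported lazily so this module loads even where SDV is not installed.
    from sdv.datasets.demo import download_demo
    from sdv.single_table import GaussianCopulaSynthesizer

    # Fetch the demo data and its metadata in one call.
    data, metadata = download_demo(
        modality="single_table",
        dataset_name=dataset_name,
    )

    # Fit a synthesizer on the real data, then draw synthetic rows.
    synthesizer = GaussianCopulaSynthesizer(metadata)
    synthesizer.fit(data)
    return synthesizer.sample(num_rows=num_rows)
```

For example, prototype_single_table('fake_hotel_guests') would return a small DataFrame of synthetic rows with the same columns as the demo table.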

Code Reference

Source Location

Repository: SDV
File: sdv/datasets/demo.py
Lines: L428-478

Signature

def download_demo(
    modality,
    dataset_name,
    output_folder_name=None,
    s3_bucket_name='sdv-datasets-public',
    credentials=None
):
    """Download a demo dataset.

    Args:
        modality (str):
            The modality of the dataset: 'single_table', 'multi_table', 'sequential'.
        dataset_name (str):
            Name of the dataset to be downloaded from the S3 bucket.
        output_folder_name (str or None):
            The name of the local folder where the metadata and data should be stored.
            If None the data is not saved locally and is loaded as a Python object.
            Defaults to None.
        s3_bucket_name (str):
            The name of the bucket to download from.
        credentials (dict):
            Dictionary containing DataCebo license key and username.

    Returns:
        tuple (data, metadata):
            If data is single table or sequential, it is a DataFrame.
            If data is multi table, it is a dictionary mapping table name to DataFrame.
            metadata is of class Metadata.
    """

Import

from sdv.datasets.demo import download_demo

I/O Contract

Inputs

Name	Type	Required	Description
modality	str	Yes	One of 'single_table', 'multi_table', 'sequential'
dataset_name	str	Yes	Name of dataset in the S3 bucket
output_folder_name	str or None	No	Local folder to save data; None = in-memory only
s3_bucket_name	str	No	S3 bucket name (default: 'sdv-datasets-public')
credentials	dict or None	No	License credentials for enterprise buckets
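A typo in modality or an empty dataset_name is cheap to catch before any network request is made. The sketch below wraps download_demo in a pre-check. The wrapper fetch_demo and the constant VALID_MODALITIES are hypothetical names introduced here; the three allowed values are the ones listed in the table above.

```python
VALID_MODALITIES = ("single_table", "multi_table", "sequential")


def fetch_demo(modality, dataset_name, output_folder_name=None):
    """Validate the arguments locally, then delegate to download_demo.

    Hypothetical wrapper for illustration; not part of SDV.
    """
    if modality not in VALID_MODALITIES:
        raise ValueError(
            f"modality must be one of {VALID_MODALITIES}, got {modality!r}"
        )
    if not isinstance(dataset_name, str) or not dataset_name:
        raise ValueError("dataset_name must be a non-empty string")

    # Imported lazily so the checks above run without SDV installed.
    from sdv.datasets.demo import download_demo

    return download_demo(
        modality=modality,
        dataset_name=dataset_name,
        output_folder_name=output_folder_name,
    )
```

With this wrapper, fetch_demo('multi-table', 'fake_hotels') raises a ValueError immediately because of the hyphen, without contacting the bucket.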

Outputs

Name	Type	Description
data	pd.DataFrame or dict[str, pd.DataFrame]	Single DataFrame for single_table/sequential; dict for multi_table
metadata	Metadata	Metadata object describing the dataset schema

Usage Examples

Single Table Demo

from sdv.datasets.demo import download_demo

# Download a single-table demo dataset
data, metadata = download_demo(
    modality='single_table',
    dataset_name='fake_hotel_guests'
)

print(data.head())
print(metadata)

Multi Table Demo

from sdv.datasets.demo import download_demo

# Download a multi-table demo dataset
data, metadata = download_demo(
    modality='multi_table',
    dataset_name='fake_hotels'
)

# data is a dict of DataFrames
for table_name, df in data.items():
    print(f"{table_name}: {len(df)} rows")

Sequential Demo

from sdv.datasets.demo import download_demo

# Download a sequential demo dataset
data, metadata = download_demo(
    modality='sequential',
    dataset_name='nasdaq100_2019'
)

print(data.head())

Related Pages

Implements Principle

Principle:Sdv_dev_SDV_Demo_Data_Loading

Requires Environment

Environment:Sdv_dev_SDV_Python_Runtime
