Implementation:Sdv dev SDV Download Demo

From Leeroopedia
Knowledge Sources
Domains Data_Science, Synthetic_Data
Last Updated 2026-02-14 00:00 GMT

Overview

download_demo is a function in the SDV library that downloads demo datasets from the SDV public S3 bucket.

Description

The download_demo function fetches pre-curated datasets from the SDV public S3 bucket. It downloads both the raw data files and the metadata definition and returns them as a (data, metadata) tuple. The function supports three modalities: single_table (data is a single DataFrame), multi_table (data is a dictionary mapping table names to DataFrames), and sequential (data is a single DataFrame with sequence key columns).

Usage

Import this function when you need sample data for testing or prototyping an SDV synthesis pipeline. It is the standard entry point for all SDV demo workflows and tutorials.
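
As a minimal sketch of the prototyping workflow described above, the helper below downloads a single-table demo and fits a synthesizer on it. The helper name prototype_single_table and its default num_rows are illustrative assumptions, and GaussianCopulaSynthesizer is simply one SDV synthesizer chosen for the example; this page does not prescribe it. Calling the helper requires the sdv package and network access.

```python
def prototype_single_table(dataset_name='fake_hotel_guests', num_rows=10):
    """Download a single-table demo, fit a synthesizer, and sample rows.

    Hypothetical helper for illustration only. Imports are deferred so the
    function can be defined even where sdv is not installed.
    """
    from sdv.datasets.demo import download_demo
    from sdv.single_table import GaussianCopulaSynthesizer

    # Fetch the demo data and its metadata in memory (no local folder).
    data, metadata = download_demo(
        modality='single_table',
        dataset_name=dataset_name,
    )

    # Fit a synthesizer on the real data, then draw synthetic rows.
    synthesizer = GaussianCopulaSynthesizer(metadata)
    synthesizer.fit(data)
    return synthesizer.sample(num_rows=num_rows)
```

Calling prototype_single_table() should return a DataFrame of synthetic rows with the same columns as the demo table.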

Code Reference

Source Location

  • Repository: SDV
  • File: sdv/datasets/demo.py
  • Lines: L428-478

Signature

def download_demo(
    modality,
    dataset_name,
    output_folder_name=None,
    s3_bucket_name='sdv-datasets-public',
    credentials=None
):
    """Download a demo dataset.

    Args:
        modality (str):
            The modality of the dataset: 'single_table', 'multi_table', 'sequential'.
        dataset_name (str):
            Name of the dataset to be downloaded from the S3 bucket.
        output_folder_name (str or None):
            The name of the local folder where the metadata and data should be stored.
            If None the data is not saved locally and is loaded as a Python object.
            Defaults to None.
        s3_bucket_name (str):
            The name of the bucket to download from.
        credentials (dict):
            Dictionary containing DataCebo license key and username.

    Returns:
        tuple (data, metadata):
            If data is single table or sequential, it is a DataFrame.
            If data is multi table, it is a dictionary mapping table name to DataFrame.
            metadata is of class Metadata.
    """

Import

from sdv.datasets.demo import download_demo

I/O Contract

Inputs

Name                Type          Required  Description
modality            str           Yes       One of 'single_table', 'multi_table', 'sequential'
dataset_name        str           Yes       Name of the dataset in the S3 bucket
output_folder_name  str or None   No        Local folder to save data; None = in-memory only
s3_bucket_name      str           No        S3 bucket name (default: 'sdv-datasets-public')
credentials         dict or None  No        License credentials for enterprise buckets
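
The inputs above can be exercised with a small wrapper that checks the modality before downloading and saves the files to a local folder. This is a sketch under stated assumptions: check_modality, download_to_folder, and VALID_MODALITIES are hypothetical names introduced here, and the validation is done by the wrapper itself rather than describing how download_demo handles bad input.

```python
# The three modality strings listed in the inputs table.
VALID_MODALITIES = ('single_table', 'multi_table', 'sequential')


def check_modality(modality):
    """Return modality unchanged if it is supported, else raise ValueError."""
    if modality not in VALID_MODALITIES:
        raise ValueError(
            f"modality must be one of {VALID_MODALITIES}, got {modality!r}"
        )
    return modality


def download_to_folder(modality, dataset_name, output_folder_name):
    """Download a demo dataset and also store it in output_folder_name.

    Hypothetical wrapper; the import is deferred so this module can be
    loaded without sdv installed.
    """
    from sdv.datasets.demo import download_demo

    return download_demo(
        modality=check_modality(modality),
        dataset_name=dataset_name,
        output_folder_name=output_folder_name,
    )
```

For example, download_to_folder('multi_table', 'fake_hotels', 'fake_hotels_data') would return the (data, metadata) tuple and store the files in the named folder; the folder name here is an arbitrary choice.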

Outputs

Name      Type                                     Description
data      pd.DataFrame or dict[str, pd.DataFrame]  Single DataFrame for single_table/sequential; dict for multi_table
metadata  Metadata                                 Metadata object describing the dataset schema
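
Because data is either a single DataFrame or a dictionary of DataFrames depending on the modality, downstream code often needs to normalize it. The sketch below shows one way to do that; row_counts and its single_name parameter are hypothetical names, and the function relies only on len(), which returns the number of rows for a pandas DataFrame.

```python
def row_counts(data, single_name='data'):
    """Map each table name to its row count.

    Accepts either return shape of download_demo: a dict of tables
    (multi_table) or one table (single_table / sequential). A single
    table is reported under the placeholder key single_name.
    """
    if isinstance(data, dict):
        return {name: len(table) for name, table in data.items()}
    return {single_name: len(data)}
```

With this helper, the same reporting code can be used for all three modalities without branching at each call site.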

Usage Examples

Single Table Demo

from sdv.datasets.demo import download_demo

# Download a single-table demo dataset
data, metadata = download_demo(
    modality='single_table',
    dataset_name='fake_hotel_guests'
)

print(data.head())
print(metadata)

Multi Table Demo

from sdv.datasets.demo import download_demo

# Download a multi-table demo dataset
data, metadata = download_demo(
    modality='multi_table',
    dataset_name='fake_hotels'
)

# data is a dict of DataFrames
for table_name, df in data.items():
    print(f"{table_name}: {len(df)} rows")

Sequential Demo

from sdv.datasets.demo import download_demo

# Download a sequential demo dataset
data, metadata = download_demo(
    modality='sequential',
    dataset_name='nasdaq100_2019'
)

print(data.head())
