Implementation:Sdv dev SDV CSVHandler Read Write

Knowledge Sources
Domains Data_IO, File_Handling
Last Updated 2026-02-14 19:00 GMT

Overview

Concrete tool, provided by the SDV library, for reading CSV and Excel files into multi-table dictionaries of pandas DataFrames and writing them back to disk.

Description

The sdv.io.local.local module provides BaseLocalHandler, CSVHandler, and ExcelHandler classes for reading tabular data from local files (CSV or Excel) into dictionaries of pandas DataFrames, and writing synthetic data back to files. CSVHandler supports leading-zero preservation, configurable read/write parameters, and multiple write modes (create, overwrite, append). ExcelHandler supports multi-sheet Excel files with similar read/write semantics.
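
To illustrate why leading-zero preservation matters, the following is a minimal sketch that uses only Python's standard csv module. It is not SDV's implementation, and the sample columns (customer_id, zip_code) are hypothetical. It only shows what is lost when an identifier-like column is parsed as a number instead of being kept as text.

```python
import csv
import io

# Hypothetical sample data: ZIP codes that begin with zero.
raw = "customer_id,zip_code\n1,02139\n2,00501\n"

rows = list(csv.DictReader(io.StringIO(raw)))

# Kept as text, the values survive unchanged.
as_text = [row["zip_code"] for row in rows]

# Parsed as integers, the leading zeros are dropped.
as_numbers = [int(row["zip_code"]) for row in rows]

print(as_text)     # ['02139', '00501']
print(as_numbers)  # [2139, 501]
```

Columns such as ZIP codes, phone numbers, or padded IDs are the typical cases where keeping the text form is the safer default.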

Usage

Import CSVHandler when loading multi-table CSV datasets from a folder or when saving synthesized data to CSV files. Import ExcelHandler for Excel file workflows. Both handlers integrate with SDV's Metadata detection via the inherited create_metadata method.

Code Reference

Source Location

Signature

class BaseLocalHandler:
    def __init__(self, decimal='.', float_format=None): ...
    def create_metadata(self, data) -> Metadata: ...
    def read(self): ...
    def write(self): ...

class CSVHandler(BaseLocalHandler):
    def __init__(self): ...
    def read(
        self,
        folder_name: str,
        file_names: list = None,
        read_csv_parameters: dict = None,
        keep_leading_zeros: bool = True,
    ) -> dict: ...
    def write(
        self,
        synthetic_data: dict,
        folder_name: str,
        file_name_suffix: str = None,
        mode: str = 'x',
        to_csv_parameters: dict = None,
    ) -> None: ...

class ExcelHandler(BaseLocalHandler):
    def __init__(self, decimal='.', float_format=None): ...
    def read(self, filepath: str, sheet_names: list = None) -> dict: ...
    def write(
        self,
        synthetic_data: dict,
        filepath: str,
        sheet_name_suffix: str = None,
        mode: str = 'w',
    ) -> None: ...

Import

from sdv.io.local import CSVHandler, ExcelHandler

I/O Contract

Inputs (CSVHandler.read)

Name | Type | Required | Description
folder_name | str | Yes | Path to the folder containing the CSV files
file_names | list[str] | No | Specific CSV files to read; if None, all .csv files are read
read_csv_parameters | dict | No | Extra parameters passed to pandas.read_csv
keep_leading_zeros | bool | No | Preserve leading zeros in numeric-looking columns (default True)

Outputs (CSVHandler.read)

Name | Type | Description
data | dict[str, DataFrame] | Maps table names (file stems) to pandas DataFrames
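
The mapping from file stems to table names can be sketched with pathlib alone. The helper discover_tables below is hypothetical and is not part of SDV. It only illustrates how a folder of CSV files could yield the dictionary keys described above, under the assumption that non-CSV files are ignored.

```python
import tempfile
from pathlib import Path


def discover_tables(folder_name):
    """Map each CSV file's stem to its path (hypothetical helper, not an SDV function)."""
    folder = Path(folder_name)
    return {path.stem: path for path in sorted(folder.glob("*.csv"))}


with tempfile.TemporaryDirectory() as folder:
    # Create two CSV files and one unrelated file in a scratch folder.
    for name in ("users.csv", "orders.csv", "notes.txt"):
        (Path(folder) / name).write_text("id\n1\n")
    table_names = list(discover_tables(folder))

print(table_names)  # ['orders', 'users']
```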

Inputs (CSVHandler.write)

Name | Type | Required | Description
synthetic_data | dict[str, DataFrame] | Yes | Table name to DataFrame mapping
folder_name | str | Yes | Output folder path
file_name_suffix | str | No | Suffix appended to each file name
mode | str | No | Write mode: 'x' (new only, the default), 'w' (overwrite), 'a' (append)
to_csv_parameters | dict | No | Extra parameters passed to pandas.DataFrame.to_csv
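
The three mode values match Python's standard file-open modes. The sketch below assumes they carry the same meaning in CSVHandler.write and demonstrates that meaning with plain file writes. The helper write_text_file is hypothetical and does not call SDV.

```python
import tempfile
from pathlib import Path


def write_text_file(path, text, mode="x"):
    """Hypothetical helper: write text using a standard Python file mode."""
    with open(path, mode, newline="") as handle:
        handle.write(text)


with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "users.csv"

    # 'x' creates the file and refuses to touch an existing one.
    write_text_file(target, "id\n1\n", mode="x")
    try:
        write_text_file(target, "id\n2\n", mode="x")
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

    # 'w' replaces the existing contents.
    write_text_file(target, "id\n3\n", mode="w")
    after_overwrite = target.read_text()

    # 'a' adds to the end of the existing contents.
    write_text_file(target, "4\n", mode="a")
    after_append = target.read_text()

print(second_create_failed)  # True
```

When appending, note that pandas.DataFrame.to_csv writes a header row by default, so an appended file may contain a repeated header unless the header is disabled, which to_csv_parameters may allow.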

Usage Examples

Reading CSV Files

from sdv.io.local import CSVHandler

handler = CSVHandler()

# Read all CSV files in a folder
data = handler.read(folder_name='./my_data/')

# Read specific files with custom parameters
data = handler.read(
    folder_name='./my_data/',
    file_names=['users.csv', 'orders.csv'],
    read_csv_parameters={'encoding': 'utf-8'},
)

# Create metadata from loaded data
metadata = handler.create_metadata(data)

Writing Synthetic Data

from sdv.io.local import CSVHandler

handler = CSVHandler()

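# Write synthetic data to CSV files, overwriting any files that already exist.
# synthetic_users_df and synthetic_orders_df are pandas DataFrames created earlier,
# for example by sampling from a fitted synthesizer.
# The default mode='x' only creates new files.
handler.write(
    synthetic_data={'users': synthetic_users_df, 'orders': synthetic_orders_df},
    folder_name='./synthetic_output/',
    mode='w',
)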

Excel Handler

from sdv.io.local import ExcelHandler

handler = ExcelHandler(decimal='.', float_format='%.4f')

# Read all sheets from Excel file
data = handler.read(filepath='./data.xlsx')

# Write synthetic data to Excel, one sheet per table
# (df1 and df2 are pandas DataFrames created earlier)
handler.write(
    synthetic_data={'Sheet1': df1, 'Sheet2': df2},
    filepath='./synthetic.xlsx',
)
