Overview
Concrete tool, provided by the SDV library, for reading and writing CSV and Excel files as multi-table dictionaries.
Description
The sdv.io.local.local module provides BaseLocalHandler, CSVHandler, and ExcelHandler classes for reading tabular data from local files (CSV or Excel) into dictionaries of pandas DataFrames, and writing synthetic data back to files. CSVHandler supports leading-zero preservation, configurable read/write parameters, and multiple write modes (create, overwrite, append). ExcelHandler supports multi-sheet Excel files with similar read/write semantics.
Usage
Import CSVHandler when loading multi-table CSV datasets from a folder or when saving synthesized data to CSV files. Import ExcelHandler for Excel file workflows. Both handlers integrate with SDV's Metadata detection via the inherited create_metadata method.
Code Reference
Source Location
Signature
class BaseLocalHandler:
    def __init__(self, decimal='.', float_format=None): ...
    def create_metadata(self, data) -> Metadata: ...
    def read(self): ...
    def write(self): ...

class CSVHandler(BaseLocalHandler):
    def __init__(self): ...
    def read(
        self,
        folder_name: str,
        file_names: list = None,
        read_csv_parameters: dict = None,
        keep_leading_zeros: bool = True,
    ) -> dict: ...
    def write(
        self,
        synthetic_data: dict,
        folder_name: str,
        file_name_suffix: str = None,
        mode: str = 'x',
        to_csv_parameters: dict = None,
    ) -> None: ...

class ExcelHandler(BaseLocalHandler):
    def __init__(self, decimal='.', float_format=None): ...
    def read(self, filepath: str, sheet_names: list = None) -> dict: ...
    def write(
        self,
        synthetic_data: dict,
        filepath: str,
        sheet_name_suffix: str = None,
        mode: str = 'w',
    ) -> None: ...
Import
from sdv.io.local import CSVHandler, ExcelHandler
I/O Contract
Inputs (CSVHandler.read)
| Name | Type | Required | Description |
|---|---|---|---|
| folder_name | str | Yes | Path to folder containing CSV files |
| file_names | list[str] | No | Specific CSV files to read; if None reads all .csv files |
| read_csv_parameters | dict | No | Extra parameters passed to pandas.read_csv |
| keep_leading_zeros | bool | No | Preserve leading zeros in numeric-looking columns (default True) |
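To make the purpose of keep_leading_zeros concrete, the sketch below shows the underlying problem using plain pandas only. It does not reproduce how CSVHandler implements the option; the column names and values are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content: the zip_code column contains a leading zero.
csv_text = "zip_code,amount\n02139,10\n90210,25\n"

# Default type inference parses the column as integers and drops the zero.
inferred = pd.read_csv(io.StringIO(csv_text))

# Forcing a string dtype keeps each value exactly as written in the file.
preserved = pd.read_csv(io.StringIO(csv_text), dtype={"zip_code": str})

print(inferred["zip_code"].tolist())
print(preserved["zip_code"].tolist())
```

Identifiers such as postal codes or account numbers are the typical case where losing the leading zero silently corrupts the data.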
Outputs (CSVHandler.read)
| Name | Type | Description |
|---|---|---|
| data | dict[str, DataFrame] | Maps table names (file stems) to pandas DataFrames |
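The shape of this return value can be sketched with pandas and pathlib alone. The helper read_csv_folder below is hypothetical and is not the SDV implementation; it only illustrates the "file stem to DataFrame" mapping described above, using invented table names.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_csv_folder(folder_name):
    """Hypothetical helper: map each CSV file's stem to a DataFrame."""
    folder = Path(folder_name)
    return {
        path.stem: pd.read_csv(path)
        for path in sorted(folder.glob("*.csv"))
    }


# Build a throwaway folder with two small tables to exercise the helper.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"user_id": [1, 2]}).to_csv(Path(tmp) / "users.csv", index=False)
    pd.DataFrame({"order_id": [10], "user_id": [1]}).to_csv(
        Path(tmp) / "orders.csv", index=False
    )
    tables = read_csv_folder(tmp)

# The keys are the file names without the .csv extension.
table_names = sorted(tables)
print(table_names)
```

A dictionary in this shape is what the multi-table workflow expects: one entry per table, keyed by table name.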
Inputs (CSVHandler.write)
| Name | Type | Required | Description |
|---|---|---|---|
| synthetic_data | dict[str, DataFrame] | Yes | Table name to DataFrame mapping |
| folder_name | str | Yes | Output folder path |
| file_name_suffix | str | No | Suffix appended to each file name |
| mode | str | No | Write mode: 'x' (new only, default), 'w' (overwrite), 'a' (append) |
| to_csv_parameters | dict | No | Extra parameters passed to pandas.DataFrame.to_csv |
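The three mode values use the same letters as Python's built-in open(). The sketch below demonstrates those semantics with the standard library only; it is an analogy for the table above, not a trace of CSVHandler.write, and the exact exception the handler raises for an existing file is not stated in this page.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "users.csv"

    # 'x' creates the file and fails if it already exists.
    with open(target, "x", newline="") as f:
        f.write("user_id\n1\n")

    try:
        with open(target, "x", newline="") as f:
            f.write("user_id\n2\n")
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

    # 'a' appends to the existing content.
    with open(target, "a", newline="") as f:
        f.write("2\n")
    after_append = target.read_text()

    # 'w' truncates the file before writing.
    with open(target, "w", newline="") as f:
        f.write("user_id\n3\n")
    after_overwrite = target.read_text()
```

The default of 'x' is the conservative choice: it prevents a new batch of synthetic data from silently replacing an earlier one.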
Usage Examples
Reading CSV Files
from sdv.io.local import CSVHandler

handler = CSVHandler()

# Read all CSV files in a folder
data = handler.read(folder_name='./my_data/')

# Read specific files with custom parameters
data = handler.read(
    folder_name='./my_data/',
    file_names=['users.csv', 'orders.csv'],
    read_csv_parameters={'encoding': 'utf-8'},
)

# Create metadata from loaded data
metadata = handler.create_metadata(data)
Writing Synthetic Data
from sdv.io.local import CSVHandler

handler = CSVHandler()

# synthetic_users_df and synthetic_orders_df are DataFrames created earlier
# Write synthetic data to CSV files, overwriting any existing files (mode='w')
handler.write(
    synthetic_data={'users': synthetic_users_df, 'orders': synthetic_orders_df},
    folder_name='./synthetic_output/',
    mode='w',
)
Excel Handler
from sdv.io.local import ExcelHandler

handler = ExcelHandler(decimal='.', float_format='%.4f')

# Read all sheets from Excel file
data = handler.read(filepath='./data.xlsx')

# Write synthetic data to Excel
handler.write(
    synthetic_data={'Sheet1': df1, 'Sheet2': df2},
    filepath='./synthetic.xlsx',
)
Related Pages