Implementation:Eventual Inc Daft DataFrame Write Deltalake
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Data_Lakehouse |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool, provided by the Daft library, for writing DataFrame contents to a Delta Lake table with transactional semantics.
Description
The write_deltalake method on a Daft DataFrame writes data to a Delta Lake table, supporting four write modes: append (add new data), overwrite (replace existing data), error (fail if the table exists), and ignore (no-op if the table exists). The operation handles both creating new tables and writing to existing ones. Schema enforcement validates data compatibility, with an optional schema overwrite when using overwrite mode. Partition columns can be specified when creating a new table; for an existing table they must match its partitioning. The method supports DynamoDB-based locking for safe concurrent writes on S3, and unsafe rename for local or S3 storage when locking is not configured. The call is blocking and returns a metadata DataFrame with operation details. Requires deltalake >= 0.14.0.
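A minimal sketch of the two guard modes, which the Usage Examples below do not cover (append and overwrite are shown there); the bucket path is hypothetical:
import daft

df = daft.from_pydict({"x": [1, 2, 3]})
# "error" fails the write if a Delta table already exists at the destination.
df.write_deltalake("s3://my-bucket/events", mode="error")
# "ignore" makes the write a no-op if the table already exists.
df.write_deltalake("s3://my-bucket/events", mode="ignore")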
Usage
Use this method on a DataFrame when you need to persist processed data to a Delta Lake table. This call is blocking and will execute the DataFrame immediately.
Code Reference
Source Location
- Repository: Daft
- File: daft/dataframe/dataframe.py
- Lines: 1198-1448
Signature
def write_deltalake(
self,
table: Union[str, pathlib.Path, "DataCatalogTable", "deltalake.DeltaTable", "UnityCatalogTable"],
partition_cols: list[str] | None = None,
mode: Literal["append", "overwrite", "error", "ignore"] = "append",
schema_mode: Literal["merge", "overwrite"] | None = None,
name: str | None = None,
description: str | None = None,
configuration: Mapping[str, str | None] | None = None,
custom_metadata: dict[str, str] | None = None,
dynamo_table_name: str | None = None,
allow_unsafe_rename: bool = False,
io_config: IOConfig | None = None,
) -> "DataFrame"
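Per the signature, table also accepts an already-opened deltalake.DeltaTable (or a DataCatalogTable/UnityCatalogTable object) rather than a path. A brief sketch, assuming the deltalake package (>= 0.14.0 per the Description) and a hypothetical existing table URI:
from deltalake import DeltaTable
import daft

df = daft.from_pydict({"x": [1, 2, 3]})
# Open the existing Delta table and append to it directly.
dt = DeltaTable("s3://my-bucket/my-table")
result = df.write_deltalake(dt, mode="append")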
Import
# Method on DataFrame, no separate import needed
df.write_deltalake("s3://bucket/table")
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| table | str \| pathlib.Path \| DataCatalogTable \| DeltaTable \| UnityCatalogTable | Yes | Destination Delta Lake table URI or table object |
| partition_cols | list[str] \| None | No | Columns to partition by; must match existing table partitioning if table exists |
| mode | Literal["append", "overwrite", "error", "ignore"] | No | Write mode; defaults to "append" |
| schema_mode | Literal["merge", "overwrite"] \| None | No | Schema mode for overwrite; "merge" is not currently supported |
| name | str \| None | No | User-provided identifier for the table |
| description | str \| None | No | User-provided description for the table |
| configuration | Mapping[str, str \| None] \| None | No | Configuration options for the metadata action |
| custom_metadata | dict[str, str] \| None | No | Custom metadata to add to commit info |
| dynamo_table_name | str \| None | No | DynamoDB table name for S3 locking provider |
| allow_unsafe_rename | bool | No | Allow unsafe rename on S3 or local disk; defaults to False |
| io_config | IOConfig \| None | No | Custom IO configuration for remote storage |
Outputs
| Name | Type | Description |
|---|---|---|
| return | DataFrame | A metadata DataFrame with columns: operation (ADD/DELETE), rows (int), file_size (int), file_name (str) |
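A short sketch of consuming the returned metadata DataFrame; the column names follow the table above, and the table path is the hypothetical one used in the examples:
result = df.write_deltalake("s3://my-bucket/my-deltalake-table")
# The result is already materialized; print the per-file commit summary.
result.show()
# Or pull it into plain Python structures for logging or assertions.
meta = result.to_pydict()
rows_written = sum(meta["rows"])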
Usage Examples
Basic Usage
import daft
df = daft.from_pydict({"x": [1, 2, 3], "y": ["a", "b", "c"]})
# Write to a Delta Lake table (append mode)
result = df.write_deltalake("s3://my-bucket/my-deltalake-table")
# Overwrite existing table data
result = df.write_deltalake("s3://my-bucket/my-deltalake-table", mode="overwrite")
# Write with partitioning and DynamoDB locking for S3
result = df.write_deltalake(
"s3://my-bucket/my-table",
partition_cols=["y"],
mode="append",
dynamo_table_name="my-lock-table",
)
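The examples above assume ambient AWS credentials and a DynamoDB lock table. A hedged sketch of the alternative path noted in the Description (explicit credentials via io_config, and unsafe rename when no locking provider is configured), with hypothetical credential values:
from daft.io import IOConfig, S3Config

io_config = IOConfig(
    s3=S3Config(
        region_name="us-west-2",   # hypothetical region
        key_id="AKIA...",          # hypothetical credentials
        access_key="...",
    )
)
# With no DynamoDB-based locking configured, opt in to unsafe rename.
result = df.write_deltalake(
    "s3://my-bucket/my-table",
    mode="append",
    io_config=io_config,
    allow_unsafe_rename=True,
)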
Related Pages
Implements Principle
Requires Environment
- Environment:Eventual_Inc_Daft_Python_PyArrow_Core
- Environment:Eventual_Inc_Daft_Cloud_Storage_Credentials