
Implementation:Eventual Inc Daft DataFrame Write Deltalake

From Leeroopedia


Knowledge Sources
Domains Data_Engineering, Data_Lakehouse
Last Updated 2026-02-08 00:00 GMT

Overview

A concrete tool, provided by the Daft library, for writing DataFrame contents to a Delta Lake table with transactional semantics.

Description

The write_deltalake method on a Daft DataFrame writes data to a Delta Lake table, supporting four write modes: append (add new data), overwrite (replace with new data), error (fail if table exists), and ignore (no-op if table exists). The operation handles both creating new tables and writing to existing ones. Schema enforcement validates data compatibility, with optional schema overwrite for overwrite mode. Partition columns can be specified for new tables or must match existing table partitioning. The method supports DynamoDB-based locking for safe concurrent writes on S3, and unsafe rename for local or S3 storage when locking is not configured. The call is blocking and returns a metadata DataFrame with operation details. Requires deltalake >= 0.14.0.

Usage

Use this method on a DataFrame when you need to persist processed data to a Delta Lake table. This call is blocking and will execute the DataFrame immediately.

Code Reference

Source Location

  • Repository: Daft
  • File: daft/dataframe/dataframe.py
  • Lines: L1198-1448

Signature

def write_deltalake(
    self,
    table: Union[str, pathlib.Path, "DataCatalogTable", "deltalake.DeltaTable", "UnityCatalogTable"],
    partition_cols: list[str] | None = None,
    mode: Literal["append", "overwrite", "error", "ignore"] = "append",
    schema_mode: Literal["merge", "overwrite"] | None = None,
    name: str | None = None,
    description: str | None = None,
    configuration: Mapping[str, str | None] | None = None,
    custom_metadata: dict[str, str] | None = None,
    dynamo_table_name: str | None = None,
    allow_unsafe_rename: bool = False,
    io_config: IOConfig | None = None,
) -> "DataFrame"

Import

# Method on DataFrame, no separate import needed
df.write_deltalake("s3://bucket/table")

I/O Contract

Inputs

Name Type Required Description
table str | pathlib.Path | DataCatalogTable | DeltaTable | UnityCatalogTable Yes Destination Delta Lake table URI or table object
partition_cols list[str] | None No Columns to partition by; must match existing table partitioning if table exists
mode Literal["append","overwrite","error","ignore"] No Write mode; defaults to "append"
schema_mode Literal["merge","overwrite"] | None No Schema mode for overwrite; "merge" is not currently supported
name str | None No User-provided identifier for the table
description str | None No User-provided description for the table
configuration Mapping[str, str | None] | None No Configuration options for the metadata action
custom_metadata dict[str, str] | None No Custom metadata to add to commit info
dynamo_table_name str | None No DynamoDB table name for S3 locking provider
allow_unsafe_rename bool No Allow unsafe rename on S3 or local disk; defaults to False
io_config IOConfig | None No Custom IO configuration for remote storage

Outputs

Name Type Description
return DataFrame A metadata DataFrame with columns: operation (ADD/DELETE), rows (int), file_size (int), file_name (str)
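The returned metadata DataFrame can be used to audit what a write did. A sketch of that shape in plain Python, using hypothetical row values (in real use you would materialize the result, e.g. via `result.to_pydict()`):

```python
# Hypothetical rows mirroring the metadata DataFrame's schema:
# operation (ADD/DELETE), rows, file_size, file_name.
metadata_rows = [
    {"operation": "ADD", "rows": 3, "file_size": 1024, "file_name": "part-0.parquet"},
    {"operation": "ADD", "rows": 2, "file_size": 768, "file_name": "part-1.parquet"},
]

# Aggregate the ADD entries to see how much the commit wrote.
rows_added = sum(r["rows"] for r in metadata_rows if r["operation"] == "ADD")
bytes_written = sum(r["file_size"] for r in metadata_rows if r["operation"] == "ADD")
```

Here `rows_added` is 5 and `bytes_written` is 1792 for the sample rows above.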

Usage Examples

Basic Usage

import daft

df = daft.from_pydict({"x": [1, 2, 3], "y": ["a", "b", "c"]})

# Write to a Delta Lake table (append mode)
result = df.write_deltalake("s3://my-bucket/my-deltalake-table")

# Overwrite existing table data
result = df.write_deltalake("s3://my-bucket/my-deltalake-table", mode="overwrite")

# Write with partitioning and DynamoDB locking for S3
result = df.write_deltalake(
    "s3://my-bucket/my-table",
    partition_cols=["y"],
    mode="append",
    dynamo_table_name="my-lock-table",
)

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
