Implementation:Eventual Inc Daft DataFrame Write Iceberg

Knowledge Sources	Daft, Daft Docs
Domains	Data_Engineering, Data_Lakehouse
Last Updated	2026-02-08 00:00 GMT

Overview

A concrete tool, provided by the Daft library, for writing DataFrame contents to an Apache Iceberg table with transactional guarantees.

Description

The write_iceberg method on a Daft DataFrame writes data to an Iceberg table in either append or overwrite mode. The operation is blocking: it executes the DataFrame, produces data files, and atomically commits them to the Iceberg table metadata through a transaction. In overwrite mode, existing files are marked for deletion before the new files are appended, all within a single transaction. The method supports manifest merging for append operations and writes to partitioned tables. It returns a metadata DataFrame containing operation details (ADD/DELETE actions, row counts, file sizes, and partition values). It requires pyiceberg >= 0.6.0 (>= 0.7.0 for partitioned tables) and pyarrow >= 12.0.1.
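Because the call is blocking and only starts after the DataFrame has been planned, it can be convenient to check the version requirements up front. The following is a minimal sketch of such a pre-flight check, not part of Daft itself: the helper names parse_version and unmet_requirements are hypothetical, and the version parsing is deliberately simplistic (it reads only the leading dotted integers, so a pre-release such as 0.7.0rc1 is treated like 0.7.0).

```python
import re

# Minimum versions as stated in the description above.
MIN_PYICEBERG = (0, 6, 0)
MIN_PYICEBERG_PARTITIONED = (0, 7, 0)
MIN_PYARROW = (12, 0, 1)


def parse_version(version: str) -> tuple:
    """Parse the leading dotted integers of a version string into a 3-tuple.

    Hypothetical helper: pre-release and local suffixes are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(part) for part in match.group(0).split("."))
    # Pad short versions such as "0.7" so tuple comparison behaves as expected.
    return parts + (0,) * max(0, 3 - len(parts))


def unmet_requirements(pyiceberg_version: str, pyarrow_version: str,
                       partitioned: bool = False) -> list:
    """Return human-readable messages for each requirement that is not met."""
    problems = []
    required = MIN_PYICEBERG_PARTITIONED if partitioned else MIN_PYICEBERG
    if parse_version(pyiceberg_version) < required:
        wanted = ".".join(map(str, required))
        problems.append(f"pyiceberg >= {wanted} required, found {pyiceberg_version}")
    if parse_version(pyarrow_version) < MIN_PYARROW:
        wanted = ".".join(map(str, MIN_PYARROW))
        problems.append(f"pyarrow >= {wanted} required, found {pyarrow_version}")
    return problems
```

In practice the installed versions could be read with importlib.metadata.version("pyiceberg") and importlib.metadata.version("pyarrow") and passed to unmet_requirements before calling write_iceberg.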

Usage

Use this method on a DataFrame when you need to persist processed data to an Iceberg table with ACID guarantees. This call is blocking and will execute the DataFrame immediately.

Code Reference

Source Location

Repository: Daft
File: daft/dataframe/dataframe.py
Lines: L1035-1195

Signature

def write_iceberg(
    self,
    table: "pyiceberg.table.Table",
    mode: str = "append",
    io_config: IOConfig | None = None,
) -> "DataFrame"

Import

# Method on DataFrame, no separate import needed
df.write_iceberg(iceberg_table, mode="append")

I/O Contract

Inputs

Name	Type	Required	Description
table	pyiceberg.table.Table	Yes	Destination PyIceberg Table to write data to
mode	str	No	Operation mode: "append" to add rows or "overwrite" to replace existing data; defaults to "append"
io_config	IOConfig | None	No	Custom IO configuration; defaults to the table's file IO properties

Outputs

Name	Type	Description
return	DataFrame	A metadata DataFrame with columns: operation (ADD/DELETE), rows (int), file_size (int), file_name (str), and optionally partitioning (struct)
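The returned metadata can be aggregated to confirm what a write actually did. The sketch below assumes the column names listed in the Outputs table and that the metadata has been converted to a plain dictionary of lists (for example with the DataFrame's to_pydict() method). The function name summarize_write and the sample values are hypothetical and for illustration only.

```python
def summarize_write(metadata: dict) -> dict:
    """Aggregate file count, row count, and byte size per operation.

    `metadata` is a dict of equal-length lists keyed by column name,
    with at least "operation", "rows", and "file_size".
    """
    summary = {}
    for op, rows, size in zip(
        metadata["operation"], metadata["rows"], metadata["file_size"]
    ):
        entry = summary.setdefault(op, {"files": 0, "rows": 0, "bytes": 0})
        entry["files"] += 1
        # Treat missing values as zero rather than failing on None.
        entry["rows"] += rows or 0
        entry["bytes"] += size or 0
    return summary


# Hypothetical metadata for an overwrite: one old file deleted, two files added.
sample = {
    "operation": ["DELETE", "ADD", "ADD"],
    "rows": [3, 2, 1],
    "file_size": [1200, 900, 700],
    "file_name": ["old-0.parquet", "new-0.parquet", "new-1.parquet"],
}
summary = summarize_write(sample)
print(summary)
```

With a real write, the input would be something like summarize_write(result.to_pydict()), where result is the DataFrame returned by write_iceberg.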

Usage Examples

Basic Usage

import daft

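# `iceberg_table` is assumed to be a pyiceberg.table.Table loaded beforehand
# (for example, from a PyIceberg catalog).

# Write data to an Iceberg table (append mode)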
df = daft.from_pydict({"user_id": [1, 2, 3], "name": ["Alice", "Bob", "Charlie"]})
result = df.write_iceberg(iceberg_table, mode="append")
result.show()  # Shows ADD operations with row counts and file sizes

# Overwrite existing data in an Iceberg table
result = df.write_iceberg(iceberg_table, mode="overwrite")
result.show()  # Shows DELETE operations for old files and ADD for new files

Related Pages

Implements Principle

Principle:Eventual_Inc_Daft_Iceberg_Writing

Requires Environment

Uses Heuristic

Heuristic:Eventual_Inc_Daft_Execution_Config_Tuning
