Implementation:Eventual Inc Daft DataFrame Write Iceberg


Knowledge Sources
Domains Data_Engineering, Data_Lakehouse
Last Updated 2026-02-08 00:00 GMT

Overview

A concrete tool, provided by the Daft library, for writing the contents of a DataFrame to an Apache Iceberg table with transactional guarantees.

Description

The write_iceberg method on a Daft DataFrame writes data to an Iceberg table in either append or overwrite mode. The operation is blocking: it executes the DataFrame, produces data files, and atomically commits them to the Iceberg table metadata in a single transaction. In overwrite mode, existing files are marked for deletion and the new files are then appended within that same transaction. The method supports partitioned tables and, for append operations, manifest merging. It returns a metadata DataFrame describing the operation (ADD/DELETE actions, row counts, file sizes, and partition values). The method requires pyiceberg >= 0.6.0 and pyarrow >= 12.0.1; writing to partitioned tables requires pyiceberg >= 0.7.0.
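The append and overwrite semantics described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not Daft's or PyIceberg's implementation: ToyTable and toy_write are hypothetical names, and the "table" is only an in-memory list of (file_name, rows) pairs. It shows the order in which DELETE and ADD entries would appear and that the file list changes in one step.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical stand-in for an Iceberg table: a list of (file_name, rows)."""
    files: list = field(default_factory=list)


def toy_write(table, new_files, mode="append"):
    """Apply a write to the toy table and return one metadata row per file."""
    if mode not in ("append", "overwrite"):
        raise ValueError(f"unsupported mode: {mode!r}")

    metadata = []
    staged = list(table.files)  # work on a copy; the table changes only at the end

    if mode == "overwrite":
        # Existing files are marked for deletion first ...
        metadata += [
            {"operation": "DELETE", "file_name": name, "rows": rows}
            for name, rows in staged
        ]
        staged = []

    # ... then the new files are added.
    metadata += [
        {"operation": "ADD", "file_name": name, "rows": rows}
        for name, rows in new_files
    ]
    staged += list(new_files)

    table.files = staged  # the single "commit": swap in the new file list
    return metadata


table = ToyTable(files=[("old-0.parquet", 3)])
appended = toy_write(table, [("new-0.parquet", 2)], mode="append")
overwritten = toy_write(table, [("new-1.parquet", 5)], mode="overwrite")
```

In this sketch the append produces a single ADD row, while the overwrite produces a DELETE row for each file that was present followed by an ADD row for the replacement file.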

Usage

Use this method on a DataFrame when you need to persist processed data to an Iceberg table with ACID guarantees. This call is blocking and will execute the DataFrame immediately.
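Because the call executes immediately, it may be worth confirming beforehand that the installed packages meet the minimum versions listed in the Description. The following is a rough sketch of such a pre-flight check; the helper names (parse_version, missing_requirements, installed_versions) are hypothetical and not part of Daft, and the version parsing is deliberately simple (it reads only the leading digits of each component, so a pre-release such as 0.7.0rc1 is treated as 0.7.0).

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimums taken from the Description above.
MINIMUMS = {"pyiceberg": (0, 6, 0), "pyarrow": (12, 0, 1)}
PARTITIONED_MIN_PYICEBERG = (0, 7, 0)


def parse_version(text, width=3):
    """Turn '0.7.1' or '12.0.1.dev0' into a fixed-width tuple of integers."""
    parts = []
    for piece in text.split(".")[:width]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    parts += [0] * (width - len(parts))  # pad so '0.6' compares equal to '0.6.0'
    return tuple(parts)


def missing_requirements(installed, partitioned=False):
    """Return a list of problems; an empty list means every minimum is met."""
    required = dict(MINIMUMS)
    if partitioned:
        required["pyiceberg"] = PARTITIONED_MIN_PYICEBERG
    problems = []
    for name, minimum in required.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name} is not installed")
        elif parse_version(found) < minimum:
            wanted = ".".join(map(str, minimum))
            problems.append(f"{name} {found} is older than {wanted}")
    return problems


def installed_versions(names=MINIMUMS):
    """Look up installed versions; packages that are absent map to None."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found
```

A caller might run missing_requirements(installed_versions(), partitioned=True) before writing to a partitioned table and stop if the returned list is non-empty.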

Code Reference

Source Location

  • Repository: Daft
  • File: daft/dataframe/dataframe.py
  • Lines: L1035-1195

Signature

def write_iceberg(
    self,
    table: "pyiceberg.table.Table",
    mode: str = "append",
    io_config: IOConfig | None = None,
) -> "DataFrame"

Import

# Method on DataFrame, no separate import needed
df.write_iceberg(iceberg_table, mode="append")

I/O Contract

Inputs

Name Type Required Description
table pyiceberg.table.Table Yes Destination PyIceberg Table to write data to
mode str No Operation mode: "append" to add rows or "overwrite" to replace existing data; defaults to "append"
io_config IOConfig | None No Custom IO configuration; defaults to the table's file IO properties

Outputs

Name Type Description
return DataFrame A metadata DataFrame with columns: operation (ADD/DELETE), rows (int), file_size (int), file_name (str), and optionally partitioning (struct)
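One way to work with the returned metadata is to total the files, rows, and bytes per operation. The sketch below assumes the metadata has been converted to a column-oriented dictionary (for example with the DataFrame's to_pydict() method) and that the operation column holds the strings "ADD" and "DELETE"; summarize_write is a hypothetical helper, and the sample values are made up purely for illustration.

```python
from collections import defaultdict


def summarize_write(meta):
    """Total files, rows, and bytes per operation from a column-oriented dict."""
    totals = defaultdict(lambda: {"files": 0, "rows": 0, "bytes": 0})
    for op, rows, size in zip(meta["operation"], meta["rows"], meta["file_size"]):
        totals[op]["files"] += 1
        totals[op]["rows"] += rows
        totals[op]["bytes"] += size
    return dict(totals)


# Made-up values in the shape of the documented columns (illustration only).
sample = {
    "operation": ["DELETE", "ADD", "ADD"],
    "rows": [3, 2, 1],
    "file_size": [900, 640, 410],
    "file_name": ["old-0.parquet", "new-0.parquet", "new-1.parquet"],
}

summary = summarize_write(sample)
for op, stats in summary.items():
    print(op, stats)
```

For an overwrite, comparing the DELETE and ADD totals gives a quick check of how many rows were replaced and how many were written.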

Usage Examples

Basic Usage

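import daft
from pyiceberg.catalog import load_catalog

# Load the destination table. The catalog name "default" and the identifier
# "db.users" are placeholders; substitute your own catalog configuration.
catalog = load_catalog("default")
iceberg_table = catalog.load_table("db.users")

# Write data to an Iceberg table (append mode)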
df = daft.from_pydict({"user_id": [1, 2, 3], "name": ["Alice", "Bob", "Charlie"]})
result = df.write_iceberg(iceberg_table, mode="append")
result.show()  # Shows ADD operations with row counts and file sizes

# Overwrite existing data in an Iceberg table
result = df.write_iceberg(iceberg_table, mode="overwrite")
result.show()  # Shows DELETE operations for old files and ADD for new files

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
