Implementation:Eventual Inc Daft DataFrame Explode
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Data_Transformation |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A concrete tool, provided by the Daft library, for expanding list-typed columns into individual rows.
Description
The explode method on Daft's DataFrame class takes one or more list-typed columns and produces one row per list element, duplicating the values of all other columns. When multiple columns are exploded simultaneously, the lists within each row must have equal lengths. A null value or an empty list produces a single row whose exploded value is null. An optional index_column parameter adds a column that records the position of each element within its original list.
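The semantics described above can be sketched in plain Python. This is a minimal reference model, not Daft's implementation: `explode_rows` is a hypothetical helper that works on a list of dicts instead of a DataFrame. The source does not say what the index column holds for a null or empty list, so the sketch uses `None` there as an assumption.

```python
from typing import Any, Optional


def explode_rows(
    rows: list[dict[str, Any]],
    columns: list[str],
    index_column: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Pure-Python model of explode semantics (hypothetical helper, not Daft code)."""
    if not columns:
        raise ValueError("at least one column is required")
    out: list[dict[str, Any]] = []
    for row in rows:
        values = [row[c] for c in columns]
        # A null or empty list yields exactly one output row holding a null.
        placeholder = [v is None or len(v) == 0 for v in values]
        lists = [[None] if p else list(v) for v, p in zip(values, placeholder)]
        # Columns exploded together must have equal list lengths within a row.
        if len({len(items) for items in lists}) != 1:
            raise ValueError(f"unequal list lengths in row: {row!r}")
        for position, elements in enumerate(zip(*lists)):
            new_row = dict(row)  # every non-exploded column is duplicated
            new_row.update(zip(columns, elements))
            if index_column is not None:
                # Assumption of this sketch: no index for null/empty inputs.
                new_row[index_column] = None if all(placeholder) else position
            out.append(new_row)
    return out


rows = [{"id": 1, "a": [1, 2]}, {"id": 2, "a": []}, {"id": 3, "a": None}]
exploded = explode_rows(rows, ["a"], index_column="idx")
```

Here `exploded` contains four rows: two for `id` 1 (positions 0 and 1), and one null-valued row each for `id` 2 and `id` 3. Passing two columns whose lists differ in length within a row raises `ValueError` in this model.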
Usage
Use df.explode() when you need to flatten list columns into individual rows. This is a method on DataFrame instances and requires no additional imports beyond Daft itself.
Code Reference
Source Location
- Repository: Daft
- File: daft/dataframe/dataframe.py
- Lines: L3106-3225
Signature
def explode(self, *columns: ColumnInputType, index_column: ColumnInputType | None = None) -> DataFrame
Import
import daft
# explode is a method on DataFrame, so no separate import is needed
df = daft.from_pydict({"list_col": [[1, 2], [3]]})
df.explode("list_col")
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| *columns | ColumnInputType | Yes | One or more list-typed columns to explode |
| index_column | ColumnInputType or None | No | Optional name for an index column that tracks the position of each element within its original list |
Outputs
| Name | Type | Description |
|---|---|---|
| return | DataFrame | A new DataFrame with list columns exploded into individual rows, with all other columns duplicated |
Usage Examples
Basic Usage
import daft
df = daft.from_pydict({
"x": [[1], [2, 3]],
"y": [["a"], ["b", "c"]],
})
# Explode multiple list columns simultaneously
result = df.explode(df["x"], df["y"])
result.collect()
# Output (column values, one entry per exploded row):
# x: [1, 2, 3]
# y: ["a", "b", "c"]
With Index Column
import daft
df = daft.from_pydict({"a": [[1, 2], [3, 4, 3]]})
# Track element positions with index_column
result = df.explode("a", index_column="idx")
result.collect()
# Output (column values, one entry per exploded row):
# a: [1, 2, 3, 4, 3]
# idx: [0, 1, 0, 1, 2]