Principle:Eventual Inc Daft List Explosion
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Data_Transformation |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A technique for expanding list-typed columns in a dataframe so that each list element becomes its own row.
Description
List explosion (also known as unnesting or flattening) takes rows containing list values and produces one row per list element, duplicating the values of all other columns in that row. This is essential for normalizing nested data structures into flat tabular form suitable for aggregation, joining, and other relational operations. When multiple list columns are exploded simultaneously, the lists within any given row must have the same length so that their elements stay aligned position by position. A Null value or an empty list produces a single output row with Null in the exploded column.
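The semantics described above can be sketched in plain Python. This is an illustrative model, not Daft's implementation (Daft exposes the operation as `DataFrame.explode`); the names `explode_rows` and `Row` are hypothetical. Raising an error when list lengths differ within a row is an assumption of this sketch.

```python
from typing import Any, Dict, Iterator, Sequence

Row = Dict[str, Any]


def explode_rows(rows: Sequence[Row], columns: Sequence[str]) -> Iterator[Row]:
    """Yield one output row per list element of the exploded columns."""
    if not columns:
        raise ValueError("at least one column must be exploded")
    for row in rows:
        # Treat a null (None) list the same as an empty list.
        lists = [row[c] if row[c] is not None else [] for c in columns]
        lengths = {len(values) for values in lists}
        if len(lengths) > 1:
            # Assumption of this sketch: mismatched lengths are an error.
            raise ValueError(f"list lengths differ within a row: {sorted(lengths)}")
        n = lengths.pop()
        if n == 0:
            # Null or empty lists still produce a single row, filled with nulls.
            yield {**row, **{c: None for c in columns}}
            continue
        for i in range(n):
            # Non-exploded columns are copied unchanged into every output row.
            yield {**row, **{c: values[i] for c, values in zip(columns, lists)}}


rows = [
    {"id": 1, "tags": ["a", "b"], "scores": [10, 20]},
    {"id": 2, "tags": [], "scores": []},
    {"id": 3, "tags": None, "scores": None},
]
result = list(explode_rows(rows, ["tags", "scores"]))
```

Here `result` holds four rows: two for `id` 1 (one per aligned pair of elements) and one each for `id` 2 and `id` 3, whose exploded columns are null.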
Usage
Use list explosion when you need to flatten list columns so each element becomes its own row. Common scenarios include normalizing JSON arrays, expanding multi-valued attributes for group-by operations, or preparing nested data for joins with flat tables.
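As a rough illustration of the group-by scenario, the following plain-Python sketch flattens a multi-valued `tags` attribute and then counts rows per tag. The sample data and the helper name `explode` are hypothetical, and the code models the behavior rather than calling Daft.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Iterator


def explode(rows: Iterable[Dict[str, Any]], column: str) -> Iterator[Dict[str, Any]]:
    """Single-column explode: one output row per element of row[column]."""
    for row in rows:
        # A null or empty list is replaced by [None] so the row is kept.
        values = row[column] or [None]
        for value in values:
            yield {**row, column: value}


products = [
    {"sku": "A1", "tags": ["sale", "new"]},
    {"sku": "B2", "tags": ["sale"]},
    {"sku": "C3", "tags": []},
]

flat = list(explode(products, "tags"))

# Group by tag and count, ignoring rows whose exploded value is null.
tag_counts = Counter(row["tags"] for row in flat if row["tags"] is not None)
```

After flattening, each tag sits in its own row, so an ordinary count per key works without any list-aware logic. The product with no tags survives as a single row with a null tag, which the aggregation filters out explicitly.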
Theoretical Basis
List explosion implements the relational unnest operation. Given a relation R with a list-valued attribute L, the unnest transforms a 1:N relationship within a column into N separate rows:
Pseudocode:

    For each row r in R:
        If r.L is NULL or empty:
            emit (r.other_columns, NULL)
        Else:
            For each element e in r.L:
                emit (r.other_columns, e)
The cardinality of the result is the sum of the lengths of all lists (with nulls and empties contributing 1 each). This operation preserves the values of all non-exploded columns by duplicating them for each list element.
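The cardinality statement can be written compactly as follows. The notation $\mathrm{unnest}_L$ is introduced here for illustration, with the convention that a NULL list has length 0.

```latex
\left| \mathrm{unnest}_L(R) \right| \;=\; \sum_{r \in R} \max\bigl(1,\ |r.L|\bigr),
\qquad |r.L| = 0 \ \text{when } r.L \text{ is NULL}
```

For example, four input rows whose lists have lengths 2, 0, NULL, and 3 yield 2 + 1 + 1 + 3 = 7 output rows.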