Principle:Eventual Inc Daft Column Transformation

From Leeroopedia


Knowledge Sources
Domains: Data_Engineering, Data_Transformation
Last Updated: 2026-02-08 00:00 GMT

Overview

Technique for adding or replacing columns in a DataFrame using computed expressions.

Description

Column transformation appends a new column, or replaces an existing one, by evaluating an expression over the existing data. It is the primary mechanism for applying UDFs, built-in functions, and arithmetic operations to create derived columns. If the given name matches an existing column, that column is replaced by the expression's result; otherwise, a new column is appended to the schema. When the name is new, the operation is equivalent to a SELECT of all existing columns plus the expression aliased to that name. When the name already exists, the aliased expression takes the place of the old column instead of being selected alongside it.

Usage

Use column transformation when you need to add computed columns or apply transformations to existing data. Common scenarios include feature engineering, data cleaning (e.g., normalizing values), deriving new metrics from existing columns, and applying user-defined functions.

Theoretical Basis

Column transformation implements a projection extension operation:

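Pseudocode:
  with_column(df, name, expr):
    if name in df.schema:
      return SELECT df.* with column 'name' replaced by (expr AS name) FROM df
    else:
      return SELECT df.*, expr AS name FROM df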

Semantics:
  - If 'name' exists in df.schema:
      Replace column 'name' with evaluated 'expr'
  - If 'name' does not exist:
      Append new column 'name' with evaluated 'expr'

Expression Evaluation:
  - expr is evaluated row-wise over the existing columns
  - expr can reference any existing column, including the one being replaced
  - expr supports arithmetic, string operations, UDFs, and conditionals
  - The result must have the same number of rows as the input

This operation extends the relational projection by preserving all existing columns and adding (or replacing) exactly one derived column.
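The semantics described in this section can be modeled in a few lines of plain Python. This is a toy sketch, not Daft's implementation: the table is assumed to be a dict mapping column names to equal-length lists, expr is assumed to be a function of one row, and the helper name with_column and the sample data are hypothetical.

```python
from typing import Any, Callable, Dict, List

Table = Dict[str, List[Any]]  # column name -> list of values
Row = Dict[str, Any]          # column name -> single value


def with_column(table: Table, name: str, expr: Callable[[Row], Any]) -> Table:
    """Toy model of the append-or-replace semantics (not Daft's code)."""
    columns = list(table)
    n_rows = len(table[columns[0]]) if columns else 0

    # Evaluate expr row-wise against the ORIGINAL columns, so it can
    # read the old values of a column that is about to be replaced.
    new_values = [
        expr({c: table[c][i] for c in columns}) for i in range(n_rows)
    ]

    # Copy the existing columns so the input table is left unchanged.
    out = {c: list(values) for c, values in table.items()}

    # Assigning to an existing key replaces that column; a new key is
    # appended. Column order here is a property of this model only.
    out[name] = new_values
    return out


t = {"a": [1, 2, 3], "b": [10, 20, 30]}
appended = with_column(t, "c", lambda r: r["a"] + r["b"])  # new name
replaced = with_column(t, "a", lambda r: r["a"] * 2)       # existing name
```

In both calls the output has the same number of rows as the input, and exactly one column is added or overwritten while the others are carried through unchanged.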

Related Pages

Implemented By
