Implementation:Eventual Inc Daft Daft Func

From Leeroopedia


Knowledge Sources
Domains Data_Engineering, User_Defined_Functions
Last Updated 2026-02-08 00:00 GMT

Overview

Concrete tool, provided by the Daft library, for decorating Python functions as row-wise user-defined functions.

Description

The @daft.func decorator converts a Python function into a Daft user-defined function that operates row by row. A decorated function accepts both its original argument types and Daft Expressions. When any argument is an Expression, the call returns a Daft Expression that can be used in DataFrame operations; when no argument is an Expression, the function executes immediately. The decorator supports three variants: row-wise (the default), async row-wise (for async functions), and generator (for generator functions that produce multiple output rows per input row).
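The dispatch rule in the description can be modeled in plain Python. The following is a toy sketch under stated assumptions, not Daft's implementation: ToyExpr, toy_func, and variant_of are hypothetical names introduced for illustration, and only two behaviors are modeled: running eagerly when no argument is an expression, and picking a variant from the kind of function being decorated.

```python
import functools
import inspect


class ToyExpr:
    """Hypothetical stand-in for a lazy column expression (not daft.Expression)."""

    def __init__(self, description: str):
        self.description = description


def variant_of(fn) -> str:
    """Classify a function into one of the three variants named in the description."""
    if inspect.iscoroutinefunction(fn):
        return "async row-wise"
    if inspect.isgeneratorfunction(fn):
        return "generator"
    return "row-wise"


def toy_func(fn):
    """Toy decorator: defer when any argument is a ToyExpr, otherwise call fn now."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        values = list(args) + list(kwargs.values())
        if any(isinstance(v, ToyExpr) for v in values):
            # At least one lazy argument: return a lazy placeholder instead of a value.
            return ToyExpr(f"{fn.__name__}(...)")
        # Only plain Python values: execute immediately.
        return fn(*args, **kwargs)

    wrapper.variant = variant_of(fn)
    return wrapper


@toy_func
def my_sum(a: int, b: int) -> int:
    return a + b


async def sample_async(a: int) -> int:
    return a


def sample_generator(n: int):
    for i in range(n):
        yield i


eager = my_sum(1, 2)                # plain ints, so the function runs right away
deferred = my_sum(ToyExpr("x"), 2)  # one lazy argument, so a ToyExpr comes back
```

In this model, the same decorated name serves both as an ordinary function for quick checks and as a building block for lazy pipelines, which is the property the description attributes to @daft.func.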

Usage

Import via import daft and apply the @daft.func decorator to any Python function. Use when you need per-row custom logic in DataFrame pipelines.

Code Reference

Source Location

  • Repository: Daft
  • File: daft/udf/__init__.py
  • Lines: L21-227

Signature

@daft.func(
    *,
    return_dtype: DataTypeLike | None = None,
    unnest: bool = False,
    use_process: bool | None = None,
    max_retries: int | None = None,
    on_error: Literal["raise", "log", "ignore"] | None = None,
)

Import

import daft

@daft.func
def my_fn(a: int, b: int) -> int:
    return a + b

I/O Contract

Inputs

Name Type Required Description
return_dtype DataTypeLike | None No The data type the function returns. If not specified, it is inferred from the function's type hints.
unnest bool No Whether to unnest/flatten struct return type fields into columns. Defaults to False.
use_process bool | None No Whether to run each instance in a separate process. Daft auto-selects if unset.
max_retries int | None No Maximum number of retries on failure.
on_error Literal["raise", "log", "ignore"] | None No Error handling strategy: "raise", "log", or "ignore".

Outputs

Name Type Description
return Func wrapper A Func wrapper around the decorated function. Calling it with Expression arguments yields an Expression that can be used in DataFrame operations such as select, with_column, and filter.

Usage Examples

Basic Usage

import daft

@daft.func
def my_sum(a: int, b: int) -> int:
    return a + b

df = daft.from_pydict({"x": [1, 2, 3], "y": [4, 5, 6]})
df.select(my_sum(df["x"], df["y"])).collect()

Async Usage

import daft
import asyncio

@daft.func
async def my_async_sum(a: int, b: int) -> int:
    await asyncio.sleep(0.1)
    return a + b

df = daft.from_pydict({"x": [1], "y": [2]})
df.select(my_async_sum(df["x"], df["y"])).collect()

Generator Usage

import daft
from typing import Iterator

@daft.func
def repeat(value: str, n: int) -> Iterator[str]:
    for _ in range(n):
        yield value

df = daft.from_pydict({"value": ["hello"], "n": [3]})
df.select(repeat(df["value"], df["n"])).collect()
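What "multiple output rows per input row" means for the generator example can be modeled without Daft. This is a plain-Python sketch of the row expansion for the single output column only; expand_rows is a hypothetical helper, and how Daft treats other columns or generators that yield nothing is not modeled here.

```python
from typing import Callable, Iterator, List, Sequence


def repeat(value: str, n: int) -> Iterator[str]:
    """Same generator body as the example above, without the decorator."""
    for _ in range(n):
        yield value


def expand_rows(fn: Callable[..., Iterator[str]], *columns: Sequence) -> List[str]:
    """Hypothetical helper: call fn once per input row and flatten what it yields."""
    output: List[str] = []
    for row in zip(*columns):
        # Each input row may contribute zero, one, or many output rows.
        output.extend(fn(*row))
    return output


# Two input rows: ("hello", 3) and ("bye", 1).
rows = expand_rows(repeat, ["hello", "bye"], [3, 1])
print(rows)
```

The first input row contributes three output rows and the second contributes one, so the output has more rows than the input.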

Related Pages

Implements Principle

Requires Environment
