Implementation:Eventual Inc Daft Window Specification
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Data_Analysis |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool, provided by the Daft library, for defining the window specifications that window function computations are evaluated over.
Description
The Window class describes how to partition data and in what order to apply window functions. It provides a fluent builder API: partition_by() defines the grouping columns, order_by() specifies row ordering within each partition (with configurable ascending or descending direction and null placement), rows_between() defines row-based frame boundaries, and range_between() defines value-based frame boundaries. The class also exposes class-level constants for frame boundaries: unbounded_preceding, unbounded_following, and current_row. A window specification is applied to an expression with the .over(window) method.
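To make the partition, order, and frame steps concrete, the following plain-Python sketch models what a row-based window specification describes. It does not use Daft and is not Daft's implementation; `windowed_sum` and its parameters are hypothetical names for illustration, and `None` stands in for an unbounded boundary.

```python
from collections import defaultdict


def windowed_sum(rows, partition_key, order_key, value_key, start, end):
    """Model of a row-based window sum; returns one result per input row.

    start/end are row offsets relative to the current row (negative =
    preceding, 0 = current row, positive = following). None means
    unbounded on that side. All names here are illustrative only.
    """
    # 1. Partition: group row indices by the partition key.
    partitions = defaultdict(list)
    for idx, row in enumerate(rows):
        partitions[row[partition_key]].append(idx)

    results = [None] * len(rows)
    for indices in partitions.values():
        # 2. Order: sort the rows inside each partition.
        ordered = sorted(indices, key=lambda i: rows[i][order_key])
        n = len(ordered)
        for pos, idx in enumerate(ordered):
            # 3. Frame: clamp the boundaries to the partition.
            lo = 0 if start is None else max(0, pos + start)
            hi = n - 1 if end is None else min(n - 1, pos + end)
            # 4. Aggregate over the rows in the frame.
            results[idx] = sum(
                rows[ordered[j]][value_key] for j in range(lo, hi + 1)
            )
    return results


rows = [
    {"cat": "a", "val": 1, "x": 10},
    {"cat": "a", "val": 2, "x": 20},
    {"cat": "a", "val": 3, "x": 30},
    {"cat": "b", "val": 1, "x": 5},
]
# Cumulative frame: unbounded preceding to current row.
running = windowed_sum(rows, "cat", "val", "x", None, 0)
# Sliding frame: one preceding row plus the current row.
sliding = windowed_sum(rows, "cat", "val", "x", -1, 0)
```

In this model the frame never crosses a partition boundary, which is why the single row in partition "b" sees only itself.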
Usage
Import and use this class when you need to compute window functions such as running totals, moving averages, rankings, or lead/lag operations over partitioned and ordered data.
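As a rough model of two of the operations named above, the sketch below computes a lag and a ranking over the already-ordered values of a single partition. It is plain Python rather than Daft code; both function names are hypothetical, and the tie handling (tied values share a rank and leave a gap afterwards) is an assumption chosen for illustration.

```python
def lag_within_partition(ordered_values, offset=1):
    """Value from `offset` rows earlier in the partition, or None if absent."""
    return [
        ordered_values[i - offset] if i - offset >= 0 else None
        for i in range(len(ordered_values))
    ]


def rank_within_partition(ordered_values):
    """1-based rank; equal neighbours share a rank and leave a gap after."""
    ranks = []
    for i, value in enumerate(ordered_values):
        if i > 0 and value == ordered_values[i - 1]:
            ranks.append(ranks[-1])  # tie: reuse the previous rank
        else:
            ranks.append(i + 1)  # position in the ordered partition
    return ranks


ordered = [10, 20, 20, 40]  # one partition, already sorted by the order key
lagged = lag_within_partition(ordered)
ranked = rank_within_partition(ordered)
```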
Code Reference
Source Location
- Repository: Daft
- File: daft/window.py
- Lines: L12-259
Signature
class Window:
    # Class-level constants
    unbounded_preceding = _PyWindowBoundary.unbounded_preceding()
    unbounded_following = _PyWindowBoundary.unbounded_following()
    current_row = _PyWindowBoundary.offset(0)

    def partition_by(self, *cols: ManyColumnsInputType) -> Window
    def order_by(self, *cols: ManyColumnsInputType, desc: bool | list[bool] = False, nulls_first: bool | list[bool] | None = None) -> Window
    def rows_between(self, start: int | _PyWindowBoundary, end: int | _PyWindowBoundary, min_periods: int = 1) -> Window
    def range_between(self, start: Any, end: Any, min_periods: int = 1) -> Window
Import
from daft import Window
# or
import daft
window = daft.Window()
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| partition_by cols | ManyColumnsInputType | No | Columns or expressions to partition by; column names as strings or Expression objects |
| order_by cols | ManyColumnsInputType | No | Columns or expressions to order by within each partition |
| desc | bool or list[bool] | No | Sort descending (True) or ascending (False); a single value applies to all order_by columns, a list gives one value per column; defaults to False |
| nulls_first | bool, list[bool], or None | No | Position nulls at beginning (True) or end (False); when None (the default), matches desc |
| rows_between start | int or _PyWindowBoundary | No | Start boundary for row-based frame (negative = preceding, 0 = current, positive = following) |
| rows_between end | int or _PyWindowBoundary | No | End boundary for row-based frame (same offset convention as start) |
| min_periods | int | No | Minimum rows required in frame to compute a result (accepted by both rows_between and range_between); defaults to 1 |
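The defaults in the table can be modelled in a few lines of plain Python. The sketch below is not Daft code and both function names are hypothetical. It shows nulls_first falling back to the value of desc, and a frame with fewer rows than min_periods producing no result. Returning None in that case is an assumption about what "no result" looks like.

```python
def order_with_nulls(values, desc=False, nulls_first=None):
    """Sort values, placing None entries first or last."""
    if nulls_first is None:
        nulls_first = desc  # default: null placement follows desc
    non_null = sorted((v for v in values if v is not None), reverse=desc)
    nulls = [None] * sum(v is None for v in values)
    return nulls + non_null if nulls_first else non_null + nulls


def frame_sum(ordered, pos, start, end, min_periods=1):
    """Sum over rows pos+start .. pos+end, clamped to the partition."""
    lo = max(0, pos + start)
    hi = min(len(ordered) - 1, pos + end)
    frame = ordered[lo:hi + 1] if hi >= lo else []
    # Assumption: too few rows in the frame yields None (a null result).
    if len(frame) < min_periods:
        return None
    return sum(frame)


asc_default = order_with_nulls([3, None, 1])
desc_default = order_with_nulls([3, None, 1], desc=True)
desc_nulls_last = order_with_nulls([3, None, 1], desc=True, nulls_first=False)

values = [1, 2, 3, 4]
too_few = frame_sum(values, 0, -2, 0, min_periods=3)  # frame holds 1 row
enough = frame_sum(values, 2, -2, 0, min_periods=3)   # frame holds 3 rows
```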
Outputs
| Name | Type | Description |
|---|---|---|
| return | Window | A window specification for use with .over(window) on expressions |
Usage Examples
Basic Usage
import daft
from daft import Window, col

# Small illustrative DataFrame (sample values are made up for this example)
df = daft.from_pydict({"category": ["a", "a", "b"], "value": [1, 2, 3]})

# Basic window aggregation partitioned by category
window_spec = Window().partition_by("category")
totals = df.select(
    col("value").sum().over(window_spec).alias("category_total"),
    col("value").mean().over(window_spec).alias("category_avg"),
)

# Percent of category total, computed from the original df with the same spec
pct = df.select(
    (col("value") / col("value").sum().over(window_spec)).alias("pct_of_category")
)
# The frame examples below assume a DataFrame with columns "cat" and "val".
# Sliding window: current row and 2 preceding rows
sliding_window = (
    Window()
    .partition_by("cat")
    .order_by("val")
    .rows_between(-2, Window.current_row)
)

# Cumulative window: from beginning of partition to current row
cum_window = (
    Window()
    .partition_by("cat")
    .order_by("val")
    .rows_between(Window.unbounded_preceding, Window.current_row)
)
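The frames above are row-based. For contrast with the value-based frames that range_between() defines, the plain-Python sketch below selects frame members both ways for the same ordered column. It is not Daft code: the function names are hypothetical, and treating both range boundaries as inclusive numeric offsets from the current row's order value is an assumption borrowed from common SQL semantics, not confirmed for Daft here.

```python
def rows_frame(order_values, pos, start, end):
    """Indices selected by position: pos+start .. pos+end, clamped."""
    lo = max(0, pos + start)
    hi = min(len(order_values) - 1, pos + end)
    return list(range(lo, hi + 1))


def range_frame(order_values, pos, start, end):
    """Indices whose order value lies within [current+start, current+end]."""
    current = order_values[pos]
    return [
        i
        for i, value in enumerate(order_values)
        if current + start <= value <= current + end
    ]


order_values = [1, 2, 2, 5, 6]  # one partition, already sorted

# Row at index 3 (order value 5), frame "2 before through current":
by_rows = rows_frame(order_values, 3, -2, 0)    # counts positions
by_range = range_frame(order_values, 3, -2, 0)  # values between 3 and 5

# Row at index 2 (order value 2): ties fall inside a value-based frame.
tied_rows = rows_frame(order_values, 2, -1, 0)
tied_range = range_frame(order_values, 2, -1, 0)
```

Under these assumptions, a row-based frame always counts physical rows, while a value-based frame can be narrower (gaps in the order values) or wider (ties) than the same offsets would suggest.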