Implementation:Eventual Inc Daft DataFrame Where
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Data_Transformation |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool, provided by the Daft library, for filtering DataFrame rows with a boolean predicate expression.
Description
The where method on Daft's DataFrame class filters rows by evaluating a predicate expression and retaining only rows where the predicate is True. It accepts either a Daft Expression object or a SQL expression string (parsed via sql_expr). Rows where the predicate evaluates to False or Null are discarded. The filter method is an alias that delegates to where.
Usage
Use df.where() or equivalently df.filter() when you need to subset a DataFrame based on conditions. This is a method on DataFrame instances and requires no additional imports beyond Daft itself.
Code Reference
Source Location
- Repository: Daft
- File: daft/dataframe/dataframe.py
- Lines: L2357-2405
Signature
def where(self, predicate: Expression | str) -> DataFrame
Import
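import daft
# where is a method on DataFrame, so no separate import is needed
df = daft.from_pydict({"x": [1, 5, 10]})  # sample data so the calls below run
df.where(daft.col("x") > 5)  # Daft Expression predicate
df.where("x > 5")  # SQL expression string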
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| predicate | Expression or str | Yes | Boolean predicate expression. Can be a Daft Expression or a SQL expression string. |
Outputs
| Name | Type | Description |
|---|---|---|
| return | DataFrame | A new DataFrame containing only rows where the predicate evaluated to True |
Usage Examples
Basic Usage
import daft
df = daft.from_pydict({"x": [1, 2, 3], "y": [4, 6, 6], "z": [7, 8, 9]})
# Filter with Expression predicate
result = df.where((df["x"] > 1) & (df["y"] > 1))
result.collect()
# Output:
# x: [2, 3]
# y: [6, 6]
# z: [8, 9]
SQL Expression String
import daft
df = daft.from_pydict({"x": [1, 2, 3], "y": [4, 5, 6], "z": [7, 9, 9]})
# Filter using a SQL expression string
result = df.where("z = 9 AND y > 5")
result.collect()
# Output:
# x: [3]
# y: [6]
# z: [9]