Implementation:Eventual Inc Daft DataFrame Where

From Leeroopedia


Knowledge Sources
Domains Data_Engineering, Data_Transformation
Last Updated 2026-02-08 00:00 GMT

Overview

Concrete tool, provided by the Daft library, for filtering DataFrame rows based on a boolean predicate expression.

Description

The where method on Daft's DataFrame class filters rows by evaluating a predicate expression and retaining only rows where the predicate is True. It accepts either a Daft Expression object or a SQL expression string (parsed via sql_expr). Rows where the predicate evaluates to False or Null are discarded. The filter method is an alias that delegates to where.

Usage

Use df.where(), or equivalently df.filter(), when you need to keep only the rows of a DataFrame that satisfy a condition. Both are methods on DataFrame instances and require no additional imports beyond Daft itself.

Code Reference

Source Location

  • Repository: Daft
  • File: daft/dataframe/dataframe.py
  • Lines: L2357-2405

Signature

def where(self, predicate: Expression | str) -> DataFrame

Import

import daft

# where is a method on DataFrame, so no separate import is needed
df = daft.from_pydict({"x": [1, 5, 10]})  # illustrative data
df.where(daft.col("x") > 5)
df.where("x > 5")  # SQL expression string

I/O Contract

Inputs

Name | Type | Required | Description
predicate | Expression or str | Yes | Boolean predicate expression. Can be a Daft Expression or a SQL expression string.

Outputs

Name | Type | Description
return | DataFrame | A new DataFrame containing only rows where the predicate evaluated to True

Usage Examples

Basic Usage

import daft

df = daft.from_pydict({"x": [1, 2, 3], "y": [4, 6, 6], "z": [7, 8, 9]})

# Filter with Expression predicate
result = df.where((df["x"] > 1) & (df["y"] > 1))
result.collect()
# Output:
# x: [2, 3]
# y: [6, 6]
# z: [8, 9]

SQL Expression String

import daft

df = daft.from_pydict({"x": [1, 2, 3], "y": [4, 5, 6], "z": [7, 9, 9]})

# Filter using a SQL expression string
result = df.where("z = 9 AND y > 5")
result.collect()
# Output:
# x: [3]
# y: [6]
# z: [9]

Related Pages

Implements Principle

Requires Environment
