Principle:Eventual Inc Daft SQL Query Execution
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, SQL |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Technique for executing SQL queries against DataFrames and catalog tables in Daft.
Description
SQL query execution parses a SQL string with Daft's Rust-based SQL engine, resolves table references (against DataFrames in the caller's scope or against catalogs), and produces a lazy DataFrame backed by a query plan. It supports standard SQL syntax, including SELECT, JOIN, GROUP BY, and window functions. By default, Python DataFrames in the caller's global and local scope are registered automatically as table references, so SQL and DataFrame code can be mixed freely. Additional CTE bindings can be passed explicitly, and they override the automatically registered ones.
Usage
Use SQL query execution when you prefer SQL syntax for data querying, need to integrate with SQL-based workflows, or want to join multiple DataFrames using SQL JOIN syntax.
Theoretical Basis
SQL parsing and planning run through a Rust-native SQL engine that produces a logical plan for Daft's optimizer:
daft.sql(query_string):
  1. Collect CTE bindings:
     a. Scan caller's global/local Python variables for DataFrames
     b. Add explicit **bindings (override globals)
  2. Parse SQL string via Rust sql_exec engine
  3. Resolve table references against:
     a. CTE bindings (Python DataFrames)
     b. Session catalogs and namespaces
  4. Return lazy DataFrame (logical plan, not yet executed)
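Because the result is a lazy DataFrame, Daft's query optimizer can rewrite the plan before any data is processed. Execution starts only when results are requested, for example with collect() or show().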
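The binding collection and table resolution steps listed above can be illustrated in plain Python, without Daft. This is a hedged sketch, not Daft's implementation: `Table`, `collect_bindings`, `resolve`, and `demo` are hypothetical names, the catalog is modeled as a plain dict, and it assumes that the order in step 3 (bindings, then catalogs) is also the lookup precedence, which the description does not state outright.

```python
import inspect


class Table:
    """Hypothetical stand-in for a DataFrame; not Daft's class."""

    def __init__(self, label):
        self.label = label


def collect_bindings(register_globals=True, **bindings):
    """Gather name -> Table bindings: caller's scope first, explicit ones on top."""
    found = {}
    if register_globals:
        # Inspect the frame of whoever called this function.
        caller = inspect.currentframe().f_back
        scope = {**caller.f_globals, **caller.f_locals}
        found = {name: value for name, value in scope.items()
                 if isinstance(value, Table)}
    # Explicit bindings override anything discovered in the caller's scope.
    found.update(bindings)
    return found


def resolve(name, bindings, catalog):
    """Look a table name up in the bindings, then fall back to the catalog."""
    if name in bindings:
        return bindings[name]
    if name in catalog:
        return catalog[name]
    raise LookupError(f"table not found: {name}")


def demo():
    events = Table("local events")          # discovered by scanning the scope
    override = Table("explicit events")     # passed explicitly as `events`
    catalog = {"users": Table("catalog users")}
    implicit = collect_bindings()
    explicit = collect_bindings(events=override)
    return (
        resolve("events", implicit, catalog).label,
        resolve("events", explicit, catalog).label,
        resolve("users", implicit, catalog).label,
    )
```

Calling `demo()` shows the three cases: a table found in the caller's scope, the same name replaced by an explicit binding, and a name that only the catalog can supply.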