Implementation:Eventual Inc Daft Daft Sql
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, SQL |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A concrete tool, provided by the Daft library, for executing SQL queries against DataFrames and returning the results as a DataFrame.
Description
The daft.sql function runs a SQL query and returns the result as a DataFrame. It builds the CTE (Common Table Expression) bindings in two steps: first from DataFrame variables in the caller's global and local scope (when register_globals=True), then from the explicit **bindings keyword arguments. Because the explicit bindings are added last, they cannot be shadowed by a scope variable of the same name. The SQL string is parsed and executed by the Rust sql_exec function using the current session and planning configuration. If the result is None (a non-data statement), an empty DataFrame is returned for backwards compatibility.
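The binding order described above can be modeled in plain Python. This is a minimal sketch, not Daft's actual implementation: FakeFrame and collect_ctes are hypothetical names invented for illustration, and a plain dict stands in for the caller's scope.

```python
class FakeFrame:
    """Hypothetical stand-in for a DataFrame, used only for this sketch."""

    def __init__(self, label):
        self.label = label


def collect_ctes(scope_vars, register_globals=True, **bindings):
    """Model the two-step binding order (an assumption-based sketch).

    scope_vars plays the role of the caller's global/local variables.
    """
    ctes = {}
    # Step 1: pick up every frame-like variable from the caller's scope.
    if register_globals:
        for name, value in scope_vars.items():
            if isinstance(value, FakeFrame):
                ctes[name] = value
    # Step 2: explicit bindings are applied last, so they win on a name clash.
    ctes.update(bindings)
    return ctes


scope = {"df1": FakeFrame("scope df1"), "n": 42}
override = FakeFrame("explicit df1")

resolved = collect_ctes(scope, df1=override, extra=FakeFrame("extra"))
only_explicit = collect_ctes(scope, register_globals=False, extra=override)
```

In this model, resolved["df1"] is the explicitly bound frame rather than the scope variable, the non-frame variable n is ignored, and with register_globals=False only the explicit bindings remain.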
Usage
Import the library with import daft and call daft.sql("SELECT ..."). Use it for SQL-based querying when DataFrame variables in scope should be detected automatically as tables.
Code Reference
Source Location
- Repository: Daft
- File: daft/sql/sql.py
- Lines: L77-180
Signature
def sql(
    sql: str,
    register_globals: bool = True,
    **bindings: DataFrame,
) -> DataFrame
Import
import daft
df = daft.sql("SELECT * FROM my_table")
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| sql | str | Yes | SQL query string to execute. |
| register_globals | bool | No | Whether to auto-register global DataFrames as table references. Defaults to True. |
| **bindings | DataFrame | No | Additional DataFrame CTE bindings. Keys are table names, values are DataFrames. |
Outputs
| Name | Type | Description |
|---|---|---|
| return | DataFrame | A DataFrame representing the SQL query result. Returns an empty DataFrame for non-data statements. |
Usage Examples
Basic Usage
import daft
df1 = daft.from_pydict({"a": [1, 2, 3], "b": ["foo", "bar", "baz"]})
df2 = daft.from_pydict({"a": [1, 2, 3], "c": ["daft", None, None]})
# Daft automatically detects df1 and df2 in the caller's Python scope (register_globals=True by default)
result = daft.sql("SELECT * FROM df1 JOIN df2 ON df1.a = df2.a")
result.show()
Explicit CTE Bindings
import daft
df = daft.from_pydict({"a": [1, 2, 3], "b": ["foo", "bar", "baz"]})
# Register DataFrame with a custom name
daft.sql("SELECT a FROM my_df", my_df=df).show()