Implementation:Eventual Inc Daft DataFrame Select
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Data_Transformation |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool, provided by the Daft library, for projecting specific columns or computed expressions from a DataFrame.
Description
The select method on Daft's DataFrame class creates a new DataFrame containing only the specified columns or expressions, similar to a SQL SELECT clause. It accepts column names as strings, Expression objects, or keyword argument projections that create named computed columns. The resulting DataFrame's schema contains only the selected columns in the order specified.
Usage
Use df.select() when you need to narrow a DataFrame to specific columns or compute new columns while dropping all others. This is a method on DataFrame instances and requires no additional imports beyond Daft itself.
Code Reference
Source Location
- Repository: Daft
- File: daft/dataframe/dataframe.py
- Lines: L2009-2041
Signature
def select(self, *columns: ColumnInputType, **projections: Expression) -> DataFrame
Import
import daft
# Method on DataFrame - no separate import needed
df.select("col1", "col2")
df.select(daft.col("x"), daft.col("y") + 1)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| *columns | ColumnInputType | Yes | One or more columns to select, specified as strings or Expression objects |
| **projections | Expression | No | Named keyword projections that create computed columns with the keyword as the column name |
Outputs
| Name | Type | Description |
|---|---|---|
| return | DataFrame | A new DataFrame containing only the selected columns and projections |
Usage Examples
Basic Usage
import daft
df = daft.from_pydict({"x": [1, 2, 3], "y": [4, 5, 6], "z": [7, 8, 9]})
# Select specific columns
result = df.select("x", "y")
result.show()
# Output (schematic; show() prints a formatted table):
# x: [1, 2, 3]
# y: [4, 5, 6]
With Computed Expressions
import daft
df = daft.from_pydict({"x": [1, 2, 3], "y": [4, 5, 6], "z": [7, 8, 9]})
# Select with expressions and computed columns
result = df.select("x", daft.col("y"), daft.col("z") + 1)
result.show()
# Output (schematic; show() prints a formatted table):
# x: [1, 2, 3]
# y: [4, 5, 6]
# z: [8, 9, 10]