Implementation:Pola rs Polars SQLContext Execute
Appearance
Overview
Concrete API usage for executing SQL queries against registered tables in a Polars SQLContext. This implementation covers the execute() method, supported SQL syntax, per-query eager overrides, and examples of SELECT, WHERE, GROUP BY, ORDER BY, JOIN, SQL functions, and table functions.
Metadata
| Field | Value |
|---|---|
| Namespace | Pola_rs_Polars |
| Workflow | SQL_Query_Interface |
| Implementation_ID | Pola_rs_Polars_SQLContext_Execute |
| Type | Implementation |
| Category | Data Access / Query Interface |
| Stage | Query Execution |
| last_updated | 2026-02-09 10:00 GMT |
| Source_Repository | https://github.com/pola-rs/polars |
| Source_File | docs/source/src/python/user-guide/sql/select.py:L23-106 |
| Documentation | https://docs.pola.rs |
API Signatures
Execute Query
SQLContext.execute(query, eager=None) -> LazyFrame or DataFrame
Key Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | (required) | SQL query string to parse, compile, and execute against registered tables. |
| eager | bool or None | None | Override the context-level eager setting for this specific query. When None, uses the context default. When True, returns a DataFrame. When False, returns a LazyFrame. |
Supported SQL Constructs
| Construct | Description | Example Fragment |
|---|---|---|
| SELECT | Column selection, expressions, aliases, wildcard | SELECT city, population AS pop |
| WHERE | Row filtering with boolean predicates | WHERE population > 1000000 |
| GROUP BY | Aggregation by grouping columns | GROUP BY country |
| ORDER BY | Result ordering | ORDER BY population DESC |
| LIMIT | Restrict result set size | LIMIT 5 |
| LEFT JOIN | Left outer join between tables | FROM a LEFT JOIN b ON a.key = b.key |
| INNER JOIN | Inner join between tables | FROM a INNER JOIN b ON a.key = b.key |
| SQL Functions | AVG, SUM, COUNT, MIN, MAX, STARTS_WITH, etc. | AVG(population), STARTS_WITH(country, 'U') |
| Table Functions | Inline file access in FROM clause | FROM read_csv('data.csv') |
I/O Contract
| Direction | Type | Description |
|---|---|---|
| Input | str | SQL query string |
| Input | bool (optional) | Per-query eager override |
| Output | LazyFrame | When eager is False or not set (default behavior) |
| Output | DataFrame | When eager is True (immediate materialization) |
Implementation Examples
Basic SELECT with LIMIT
import polars as pl
df = pl.DataFrame({
"city": ["NYC", "LA", "Chicago", "Houston", "Phoenix"],
"country": ["USA", "USA", "USA", "USA", "USA"],
"population": [8336817, 3979576, 2693976, 2320268, 1680992],
})
ctx = pl.SQLContext(population=df, eager=True)
# Basic SELECT with LIMIT
result = ctx.execute("SELECT * FROM population LIMIT 5")
GROUP BY with Aggregate Functions
import polars as pl
ctx = pl.SQLContext(population=df, eager=True)
# GROUP BY with AVG aggregate
result = ctx.execute(
"SELECT country, AVG(population) as avg_pop FROM population GROUP BY country"
)
JOIN Between Tables
import polars as pl
income_df = pl.DataFrame({
"city": ["NYC", "LA", "Chicago"],
"median_income": [63998, 62142, 58247],
})
ctx = pl.SQLContext(population=df, eager=True)
ctx.register_many(income=income_df)
# LEFT JOIN between two registered tables
result = ctx.execute("""
SELECT income.*, population.population
FROM population LEFT JOIN income ON population.city = income.city
""")
SQL String Functions
import polars as pl
ctx = pl.SQLContext(population=df, eager=True)
# Using STARTS_WITH function in WHERE clause
result = ctx.execute(
"SELECT city, population FROM population WHERE STARTS_WITH(country, 'U')"
)
Table Functions (Inline File Access)
import polars as pl
ctx = pl.SQLContext(eager=True)
# Read a CSV file directly in the SQL query
result = ctx.execute("SELECT * FROM read_csv('data.csv')")
Per-Query Eager Override
import polars as pl
# Context defaults to lazy (eager=False)
ctx = pl.SQLContext(population=df)
# Override to get immediate DataFrame result for this query
result_df = ctx.execute("SELECT * FROM population LIMIT 3", eager=True)
# Default lazy behavior returns a LazyFrame
result_lf = ctx.execute("SELECT * FROM population LIMIT 3")
Behavioral Notes
- The query parameter accepts a single SQL statement. Multiple statements separated by semicolons are not supported in a single execute() call.
- Column aliases (using AS) in the SELECT clause are reflected in the output frame's column names.
- Table aliases can be used in JOIN queries to disambiguate column references.
- When eager=None (the default for the parameter), the context-level eager setting takes effect. Explicitly passing eager=True or eager=False overrides the context setting for that specific call.
- Table functions like read_csv() create a scan plan inline, benefiting from the same optimizations as pre-registered file scans.
- SQL keywords are case-insensitive (SELECT, select, and SeLeCt are all valid).
Relationships
See Also
- Implementation:Pola_rs_Polars_SQLContext_Constructor — Creating the SQLContext
- Implementation:Pola_rs_Polars_SQLContext_Register — Registering tables for query resolution
- Implementation:Pola_rs_Polars_SQLContext_CTE_and_DDL — CTEs and DDL operations
- Implementation:Pola_rs_Polars_SQL_Collect_and_Output — Collecting and writing query results
Page Connections
Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment