Principle: Pola-rs Polars Dynamic Temporal Aggregation
| Knowledge Sources | |
|---|---|
| Domains | Data Engineering, Time Series |
| Last Updated | 2026-02-09 10:00 GMT |
Overview
Grouping time series data into temporal windows (yearly, monthly, hourly, etc.) for time-based aggregation with configurable window boundaries.
Description
Dynamic temporal aggregation partitions a time series into temporal windows and computes aggregate statistics within each window. Unlike standard group_by, which groups rows by exact key values, group_by_dynamic creates groups based on time intervals: each group spans a temporal range defined by the window parameters.
The core parameters that control window behavior are:
- every -- The step size between consecutive window start points. This defines the granularity of the output (e.g., "1y" for yearly, "1mo" for monthly, "1h" for hourly).
- period -- The duration of each window. Defaults to every, producing non-overlapping (tumbling) windows. When period > every, windows overlap (sliding windows); when period < every, gaps appear between windows.
- closed -- Controls which boundary is inclusive: "left" (default), "right", "both", or "none".
- group_by -- Optional additional grouping columns for panel data (multiple entities sharing the same temporal index).
- include_boundaries -- When True, adds _lower_boundary and _upper_boundary columns to the output for debugging and verification.
This operation requires the temporal index column to be sorted in ascending order (within each group, when group_by columns are supplied). The sorted order enables the engine to process windows in a single forward pass, maintaining a pair of pointers (window start, window end) that advance monotonically through the data.
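The forward-pass idea can be illustrated with a small pure-Python sketch. It is not Polars' actual implementation; the function name window_slices and the use of integer timestamps are assumptions made for illustration. Two indices, lo and hi, only ever move forward, which is what the sorted input makes possible.

```python
def window_slices(times, every, period, start):
    """Yield (window_start, lo, hi) such that times[lo:hi] are exactly the
    timestamps in the left-closed window [window_start, window_start + period).

    `times` must be sorted ascending; lo and hi never move backwards.
    """
    if not times:
        return
    n = len(times)
    lo = hi = 0
    w = start
    while w <= times[-1]:
        # Advance the start pointer past everything before the window.
        while lo < n and times[lo] < w:
            lo += 1
        # With gaps (period < every) the end pointer may lag behind lo.
        hi = max(hi, lo)
        # Advance the end pointer to the first timestamp at or past the window end.
        while hi < n and times[hi] < w + period:
            hi += 1
        yield w, lo, hi
        w += every


times = [0, 1, 3, 4, 7]
tumbling = list(window_slices(times, every=2, period=2, start=0))
sliding = list(window_slices(times, every=2, period=4, start=0))
print(tumbling)
print(sliding)
```

Because each pointer visits every element at most once, the total work is linear in the number of rows plus the number of windows, regardless of how much the windows overlap.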
Usage
Use this principle whenever you need to:
- Compute yearly, monthly, weekly, daily, or hourly summaries of time series data.
- Create overlapping (sliding) temporal windows for smoothed aggregations.
- Aggregate panel data (multiple entities) across shared temporal windows.
- Produce regularly-spaced output from irregularly-sampled input data.
Theoretical Basis
Dynamic grouping creates time-based partitions using a sliding window approach. This is closely related to the concepts of tumbling windows and sliding windows in time series databases and stream processing systems.
A tumbling window is the default configuration where period == every:
For every = "1mo", period = "1mo", closed = "left":
Window 1: [Jan 1, Feb 1)
Window 2: [Feb 1, Mar 1)
Window 3: [Mar 1, Apr 1)
...
Each row belongs to exactly one window.
A sliding window occurs when period > every:
For every = "1mo", period = "3mo", closed = "left":
Window 1: [Jan 1, Apr 1)
Window 2: [Feb 1, May 1)
Window 3: [Mar 1, Jun 1)
...
Rows may belong to multiple overlapping windows.
The closed parameter controls boundary inclusion:
| closed value | Left boundary | Right boundary | Interval notation |
|---|---|---|---|
| "left" (default) | Inclusive | Exclusive | [start, end) |
| "right" | Exclusive | Inclusive | (start, end] |
| "both" | Inclusive | Inclusive | [start, end] |
| "none" | Exclusive | Exclusive | (start, end) |
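The four modes can be restated as a small membership predicate. This is a plain-Python illustration of the interval semantics in the table, not a Polars API; the function name in_window is hypothetical.

```python
def in_window(t, start, end, closed="left"):
    """Return True if t falls inside the window under the given boundary rule."""
    if closed == "left":
        return start <= t < end   # [start, end)
    if closed == "right":
        return start < t <= end   # (start, end]
    if closed == "both":
        return start <= t <= end  # [start, end]
    if closed == "none":
        return start < t < end    # (start, end)
    raise ValueError(f"unknown closed value: {closed!r}")


# A timestamp that sits exactly on a boundary is where the modes differ.
for mode in ("left", "right", "both", "none"):
    print(mode, in_window(10, 10, 20, mode), in_window(20, 10, 20, mode))
```

With tumbling windows, "left" and "right" each assign a boundary timestamp to exactly one window, whereas "both" counts it in two adjacent windows and "none" drops it from both.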
The every and period parameters accept Polars duration strings with the following units:
| Suffix | Unit | Example |
|---|---|---|
| y | Year | "1y" |
| mo | Month | "3mo" |
| w | Week | "1w" |
| d | Day | "7d" |
| h | Hour | "1h" |
| m | Minute | "30m" |
| s | Second | "10s" |
For additional grouping columns (panel data), group_by_dynamic first partitions the data by the grouping columns, then applies the temporal windowing within each partition independently. This enables time-based aggregation of multi-entity datasets (e.g., multiple stock tickers, sensor IDs).