Principle:TobikoData Sqlmesh Interval Configuration
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Incremental_Processing |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
Configure time-based interval processing parameters including scheduling cadence, batch sizes, and lookback windows for incremental data transformations.
Description
Interval configuration defines how time is partitioned and scheduled for incremental data processing. This includes specifying the cron schedule that determines processing frequency, the batch size that controls how many intervals are processed in a single execution, and lookback windows that allow models to access historical data for context-dependent calculations.
Proper interval configuration ensures that data processing aligns with business requirements, data arrival patterns, and computational resources. The configuration controls both the logical partitioning of data (how time intervals are defined) and the physical execution strategy (how many intervals to process concurrently or sequentially).
The system uses cron expressions to define regular processing schedules while supporting on-demand execution for ad-hoc analysis or backfill operations. Batch size and batch concurrency together tune the trade-off between throughput and resource utilization: larger batches reduce per-execution overhead, while higher concurrency runs more batches at once.
Usage
Use interval configuration when setting up incremental models to match data arrival patterns and processing requirements. Configure daily batches for end-of-day aggregations, hourly intervals for near-real-time dashboards, or custom cron schedules for business-specific timing requirements.
Apply lookback windows when calculations require historical context, such as computing week-over-week growth rates, rolling averages, or detecting anomalies based on historical patterns. Set batch sizes based on the data volume per interval and the compute resources available to a single execution.
Adjust batch concurrency to control how many intervals can be processed simultaneously, balancing throughput against resource constraints and database connection limits.
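The concurrency limit described above can be illustrated with a minimal Python sketch. It is not how SQLMesh schedules work internally; `run_batches` and `fake_process` are hypothetical names, and a thread pool simply stands in for whatever executor runs the batches. The sketch assumes the batches are independent of one another.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_batches(batches, process, batch_concurrency=1):
    """Run process(batch) for every batch, at most batch_concurrency at a time."""
    with ThreadPoolExecutor(max_workers=batch_concurrency) as pool:
        # pool.map returns results in the same order as the input batches.
        return list(pool.map(process, batches))

# Bookkeeping used only to observe the peak number of batches in flight.
_lock = threading.Lock()
_active = 0
peak = 0

def fake_process(batch):
    """Hypothetical stand-in for the real per-batch work."""
    global _active, peak
    with _lock:
        _active += 1
        peak = max(peak, _active)
    time.sleep(0.01)  # simulate work
    with _lock:
        _active -= 1
    return batch

results = run_batches(list(range(8)), fake_process, batch_concurrency=2)
print(results, "peak in flight:", peak)
```

Raising `batch_concurrency` increases throughput only until the warehouse or its connection pool becomes the bottleneck, which is the balance the paragraph above refers to.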
Theoretical Basis
Interval configuration operates on several interconnected concepts:
    CONFIGURATION:
        cron_schedule: periodic interval generation pattern
        batch_size: number of intervals per execution unit
        batch_concurrency: parallel execution limit
        lookback: number of prior intervals accessible

    INTERVAL GENERATION:
        base_intervals = generate_from_cron(cron_schedule, start, end)
        IF batch_size specified THEN
            execution_batches = group_consecutive(base_intervals, batch_size)
        ELSE
            execution_batches = base_intervals

    EXECUTION PLANNING:
        available_intervals = filter_missing(execution_batches, completion_state)
        scheduled_batches = prioritize_topologically(available_intervals)
        FOR each batch in scheduled_batches:
            IF dependencies_met(batch) THEN
                IF lookback > 0 THEN
                    extended_range = expand_start(batch, lookback * interval_duration)
                    input_data = read_data(extended_range)
                    output_interval = batch.original_range
                ELSE
                    input_data = read_data(batch)
                    output_interval = batch
                execute_with_concurrency_limit(batch, batch_concurrency)
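The `filter_missing` step in the planning pseudocode can be sketched as follows. This is an illustration under assumptions, not the tool's implementation: intervals are modeled as calendar days, `completed` is a hypothetical set standing in for the completion state, and `missing_runs` is an illustrative helper that also groups the remaining days into contiguous runs.

```python
from datetime import date, timedelta

def missing_runs(all_days, completed):
    """Return runs of consecutive days that are not in `completed`."""
    runs = []
    for day in all_days:
        if day in completed:
            continue
        # Extend the current run if this day directly follows its last day.
        if runs and day - runs[-1][-1] == timedelta(days=1):
            runs[-1].append(day)
        else:
            runs.append([day])
    return runs

all_days = [date(2024, 1, 1) + timedelta(days=n) for n in range(7)]
completed = {date(2024, 1, 3), date(2024, 1, 4)}  # hypothetical state
runs = missing_runs(all_days, completed)
for run in runs:
    print(run[0], "to", run[-1])
```

Only the gaps are scheduled, so a rerun after a partial failure picks up where the previous execution stopped.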
The cron schedule defines the "natural" interval boundaries based on business requirements. For example, a daily cron at midnight creates one half-open interval per day, [00:00, 24:00), while an hourly cron creates 24 distinct intervals per day.
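To make the interval counts concrete, here is a small Python sketch. It assumes a fixed step (`timedelta`) as a stand-in for a cron expression, since parsing real cron syntax would need a third-party library; `generate_intervals` is a hypothetical helper.

```python
from datetime import datetime, timedelta

def generate_intervals(start, end, step):
    """Return half-open (lo, hi) intervals of width `step` covering [start, end)."""
    intervals = []
    lo = start
    while lo + step <= end:
        intervals.append((lo, lo + step))
        lo += step
    return intervals

day_start = datetime(2024, 1, 8)
day_end = day_start + timedelta(days=1)

daily = generate_intervals(day_start, day_end, timedelta(days=1))
hourly = generate_intervals(day_start, day_end, timedelta(hours=1))
print(len(daily), len(hourly))
```

Each interval's end equals the next interval's start, so no timestamp belongs to two intervals.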
Batch sizing is a performance optimization that does not change interval boundaries but affects execution granularity. A batch_size of 7 with daily intervals processes one week at a time, reducing overhead while maintaining daily partition granularity.
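The effect of `batch_size` can be sketched in a few lines of Python. The helper below mirrors the `group_consecutive` name from the pseudocode but is only an illustration; it assumes the input list is already sorted and contiguous, so simple chunking is enough.

```python
from datetime import date, timedelta

def group_consecutive(intervals, batch_size=None):
    """Chunk a sorted, contiguous list of intervals into batches of batch_size."""
    if batch_size is None:
        # No batch size: each interval is its own execution unit.
        return [[interval] for interval in intervals]
    return [
        intervals[i:i + batch_size]
        for i in range(0, len(intervals), batch_size)
    ]

days = [date(2024, 1, 1) + timedelta(days=n) for n in range(14)]
batches = group_consecutive(days, batch_size=7)
for batch in batches:
    print(batch[0], "to", batch[-1], f"({len(batch)} daily intervals)")
```

The daily intervals themselves are unchanged; only the number of executions drops from 14 to 2.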
Lookback extends the read window without changing the write window, enabling calculations that depend on historical data:
Example: 7-day moving average with lookback=6
    Processing interval: 2024-01-08
    Read window: 2024-01-02 to 2024-01-08 (7 days total)
    Write window: 2024-01-08 only
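The window arithmetic in this example can be checked with a short Python sketch. `windows_for_day` is a hypothetical helper that assumes daily intervals and treats both windows as inclusive lists of days.

```python
from datetime import date, timedelta

def windows_for_day(day, lookback):
    """Return (read_days, write_days) for one daily interval with a lookback."""
    read_start = day - timedelta(days=lookback)
    # The read window covers the lookback days plus the processed day itself.
    read_days = [read_start + timedelta(days=n) for n in range(lookback + 1)]
    write_days = [day]  # the write window is never extended
    return read_days, write_days

read_days, write_days = windows_for_day(date(2024, 1, 8), lookback=6)
print("read: ", read_days[0], "to", read_days[-1], f"({len(read_days)} days)")
print("write:", write_days[0])
```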