Principle:Apache Airflow DagRun Creation
| Knowledge Sources | |
|---|---|
| Domains | Scheduling, Workflow_Orchestration |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
The process by which the scheduler evaluates DAG timetables and creates DagRun instances to trigger task execution.
Description
DagRun creation is the scheduler's core decision process. The SchedulerJobRunner repeatedly evaluates each active DAG's timetable to determine whether a new run is due. A DagRun represents a single invocation of a DAG, identified by a logical date and carrying an associated data interval. When creating runs, the scheduler honors max_active_runs and the DAG's catchup setting; pool availability is checked later, when the individual task instances of a run are queued for execution. DagRuns can also be created manually (triggered) or as backfills.
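To make the relationship between logical date, data interval, and creation time concrete, here is a minimal sketch in plain Python. The `DataInterval` and `RunInfo` classes and the `daily_run_info` helper are hypothetical stand-ins written for this illustration, not Airflow's own classes, and the sketch assumes an interval-based daily schedule in which a run is created only after its interval has closed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DataInterval:
    """Half-open window [start, end) of data that one run covers."""
    start: datetime
    end: datetime


@dataclass(frozen=True)
class RunInfo:
    """Hypothetical stand-in for what a timetable hands to the scheduler."""
    logical_date: datetime        # identifies the run; here, the interval start
    data_interval: DataInterval   # the data window the run is responsible for
    run_after: datetime           # earliest wall-clock time the run may be created


def daily_run_info(day_start: datetime) -> RunInfo:
    """Build the run covering the 24 hours that begin at day_start."""
    interval = DataInterval(start=day_start, end=day_start + timedelta(days=1))
    # Assumption: interval-based scheduling, so the run becomes due
    # only once its data interval has ended.
    return RunInfo(
        logical_date=interval.start,
        data_interval=interval,
        run_after=interval.end,
    )


info = daily_run_info(datetime(2024, 1, 1, tzinfo=timezone.utc))
print(info.logical_date.isoformat())  # 2024-01-01T00:00:00+00:00
print(info.run_after.isoformat())     # 2024-01-02T00:00:00+00:00
```

Under this assumption the run labelled 1 January is not created until 2 January, which is a common source of confusion when debugging "late" runs.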
Usage
This principle governs all automated scheduling in Airflow. Understanding DagRun creation is essential for debugging scheduling issues, managing backfills, and tuning scheduler performance.
Theoretical Basis
Scheduling Decision:
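# Pseudo-code for the scheduler's DagRun-creation loop
def scheduler_loop():
    for dag in active_dags:
        # Skip DAGs that already have the maximum number of active runs
        if dag.active_runs >= dag.max_active_runs:
            continue
        # A DAG that has never run has no previous data interval
        last_interval = dag.last_run.data_interval if dag.last_run else None
        info = dag.timetable.next_dagrun_info(
            last_automated_data_interval=last_interval,
            restriction=TimeRestriction(dag.start_date, dag.end_date, dag.catchup),
        )
        # info is None when the timetable has no further run to schedule
        if info and now() >= info.run_after:
            create_dagrun(dag, info)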
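The loop above can be exercised with a toy model. The following is a runnable sketch, not Airflow code: `ToyDag`, `next_logical_date`, and `scheduler_pass` are hypothetical names, the timetable is hard-coded to daily intervals, and catchup is modelled simply as "start from the oldest unscheduled interval" versus "jump to the most recently closed interval".

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional

ONE_DAY = timedelta(days=1)


@dataclass
class ToyDag:
    """Hypothetical, heavily simplified DAG record (not Airflow's DAG class)."""
    dag_id: str
    start_date: datetime
    catchup: bool
    max_active_runs: int = 1
    runs: List[datetime] = field(default_factory=list)  # logical dates created so far
    active_runs: int = 0


def next_logical_date(dag: ToyDag, now: datetime) -> Optional[datetime]:
    """Toy daily timetable: start of the next interval that is due, or None."""
    candidate = dag.runs[-1] + ONE_DAY if dag.runs else dag.start_date
    if not dag.catchup:
        # Without catchup, skip ahead to the most recent interval that has closed.
        while candidate + 2 * ONE_DAY <= now:
            candidate += ONE_DAY
    # A run is due only after its interval [candidate, candidate + 1 day) has ended.
    return candidate if candidate + ONE_DAY <= now else None


def scheduler_pass(dags: List[ToyDag], now: datetime) -> int:
    """One pass over all DAGs; returns how many runs were created."""
    created = 0
    for dag in dags:
        if dag.active_runs >= dag.max_active_runs:
            continue  # respect the per-DAG concurrency limit
        logical_date = next_logical_date(dag, now)
        if logical_date is not None:
            dag.runs.append(logical_date)
            dag.active_runs += 1
            created += 1
    return created


utc = timezone.utc
now = datetime(2024, 1, 5, tzinfo=utc)
with_catchup = ToyDag("a", datetime(2024, 1, 1, tzinfo=utc), catchup=True)
no_catchup = ToyDag("b", datetime(2024, 1, 1, tzinfo=utc), catchup=False)
scheduler_pass([with_catchup, no_catchup], now)
print(with_catchup.runs[0].date())  # 2024-01-01
print(no_catchup.runs[0].date())    # 2024-01-04
```

A second pass at the same `now` creates nothing, because each toy DAG already has one active run and `max_active_runs` defaults to 1 in this sketch.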
DagRun States: queued → running → success/failed
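The state progression can be sketched as a small transition table. This is a simplified model that follows only the states listed above; `RunState`, `TRANSITIONS`, and `advance` are names invented for this illustration and are not imported from Airflow, which may permit additional transitions.

```python
from enum import Enum


class RunState(str, Enum):
    """Standalone copy of the four states named above."""
    QUEUED = "queued"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"


# Allowed moves in this simplified model; terminal states have no successors.
TRANSITIONS = {
    RunState.QUEUED: {RunState.RUNNING},
    RunState.RUNNING: {RunState.SUCCESS, RunState.FAILED},
    RunState.SUCCESS: set(),
    RunState.FAILED: set(),
}


def advance(current: RunState, target: RunState) -> RunState:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = advance(RunState.QUEUED, RunState.RUNNING)
state = advance(state, RunState.SUCCESS)
print(state.value)  # success
```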
Run Types:
- scheduled: Created by the scheduler based on timetable
- manual: Triggered by user via API/UI
- backfill: Created for historical date ranges
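As an illustration of how the run type can be used to tell runs apart, the sketch below encodes the three types as an enum and derives a run identifier from type and logical date. `RunType` and `make_run_id` are hypothetical names; the `<type>__<ISO timestamp>` format is modelled on the run IDs commonly seen in Airflow 2.x and should be treated as an assumption, since the exact format can differ between versions.

```python
from datetime import datetime, timezone
from enum import Enum


class RunType(str, Enum):
    """The three run types listed above."""
    SCHEDULED = "scheduled"
    MANUAL = "manual"
    BACKFILL = "backfill"


def make_run_id(run_type: RunType, logical_date: datetime) -> str:
    """Assumed convention: run type, two underscores, ISO-8601 logical date."""
    return f"{run_type.value}__{logical_date.isoformat()}"


run_id = make_run_id(RunType.SCHEDULED, datetime(2024, 1, 1, tzinfo=timezone.utc))
print(run_id)  # scheduled__2024-01-01T00:00:00+00:00
```

Keeping the type in the identifier makes it easy to filter scheduler-created runs from manual or backfill runs when inspecting run history.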