Workflow: Apache Airflow Scheduler Operation and Task Execution
| Knowledge Sources | |
|---|---|
| Domains | Workflow_Orchestration, Task_Scheduling, Distributed_Systems |
| Last Updated | 2026-02-08 19:00 GMT |
Overview
End-to-end process for how the Apache Airflow scheduler discovers DAGs, schedules DAG runs, dispatches tasks to executors, and manages the complete task execution lifecycle through the Task Execution Interface.
Description
This workflow documents the internal operation of the Airflow scheduling and execution pipeline. It covers how the DagFileProcessorManager discovers and parses DAG files, how the SchedulerJob creates DagRun instances based on timetables, how tasks transition through states (scheduled, queued, running, success/failed), how executors (Local, Celery, Kubernetes) dispatch work, and how the Task Execution Interface (TEI) in Airflow 3.x decouples task runtime from the scheduler. It also covers the triggerer component for handling async/deferrable operators.
Usage
Understand this workflow when you need to operate, debug, or extend the Airflow scheduling infrastructure. This is relevant for platform engineers configuring Airflow for production, developers debugging task execution issues, or contributors extending the scheduler, executor, or triggerer components.
Execution Steps
Step 1: DAG File Discovery and Parsing
The DagFileProcessorManager continuously scans the configured dags_folder to discover Python files containing DAG definitions. Each file is processed by a DagFileProcessorProcess worker, which imports the module, extracts DAG objects via the DagBag, and stores serialized DAG structures in the metadata database. The manager coordinates parallel processing, respects .airflowignore exclusions, and tracks file modification times to re-parse changed files.
Key considerations:
- The min_file_process_interval configuration controls how frequently files are re-parsed
- DAG serialization decouples the webserver from needing direct filesystem access
- DagBag validates DAG integrity (no duplicate task IDs, valid cron expressions, no cycles)
- DAG versioning tracks structural changes across parses
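The discovery loop described above can be sketched with the standard library alone. This is a toy analogue, not the actual DagFileProcessorManager: the function names are invented, and real `.airflowignore` matching supports regex as well as glob syntax, but the core idea of combining ignore patterns, modification times, and a minimum re-parse interval is the same.

```python
import time
from fnmatch import fnmatch
from pathlib import Path


def load_ignore_patterns(dags_folder: Path) -> list[str]:
    """Read glob patterns from .airflowignore, skipping blanks and comments."""
    ignore_file = dags_folder / ".airflowignore"
    if not ignore_file.exists():
        return []
    lines = ignore_file.read_text().splitlines()
    return [ln.strip() for ln in lines if ln.strip() and not ln.startswith("#")]


def files_to_parse(dags_folder: Path, last_parsed: dict[Path, float],
                   min_interval: float = 30.0) -> list[Path]:
    """Return .py files that are new, modified, or stale past min_interval.

    last_parsed maps each file to the wall-clock time it was last parsed,
    mirroring how the manager tracks per-file parse state.
    """
    patterns = load_ignore_patterns(dags_folder)
    now = time.time()
    candidates = []
    for path in sorted(dags_folder.rglob("*.py")):
        rel = str(path.relative_to(dags_folder))
        if any(fnmatch(rel, pat) for pat in patterns):
            continue  # excluded by .airflowignore
        parsed_at = last_parsed.get(path)
        if (parsed_at is None
                or path.stat().st_mtime > parsed_at
                or now - parsed_at >= min_interval):
            candidates.append(path)
    return candidates
```

Each returned path would then be handed to a worker process that imports the module and serializes any DAG objects it finds.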
Step 2: DagRun Creation and Scheduling
The SchedulerJob examines each active DAG's timetable to determine when the next DagRun should be created. For time-based schedules (cron, interval), it calculates data intervals and creates DagRun records with the appropriate logical_date (the field formerly called execution_date). For asset-triggered DAGs, the scheduler watches for upstream asset events and creates runs when trigger conditions are satisfied. Each DagRun transitions through states: queued, running, success, or failed.
Key considerations:
- Multiple timetable implementations support diverse scheduling patterns (cron, delta, continuous, event-based, workday-aware)
- max_active_runs_per_dag limits concurrent runs of the same DAG
- Backfill operations create DagRuns for historical date ranges, either forward or reverse
- Asset evaluation uses boolean logic (AND/OR) to combine trigger conditions from multiple upstream assets
Step 3: Task Instance State Management
For each DagRun, the scheduler creates TaskInstance records and manages their state transitions. Tasks start in the none state, move to scheduled when all upstream dependencies are met, then to queued when sent to an executor. The executor reports back running when execution begins, and finally success or failed upon completion. The scheduler handles retries, upstream failures, and dependency resolution including trigger rules (all_success, one_success, all_done, etc.).
Key considerations:
- Task state transitions are persisted in the metadata database for durability
- Pool slots limit concurrent execution of tasks sharing constrained resources
- Priority weight determines task ordering within the executor queue
- Mapped (dynamic) tasks are expanded at scheduling time based on upstream XCom outputs
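The state machine and trigger-rule evaluation above can be modeled compactly. This is a deliberately reduced sketch (the real scheduler tracks more states, such as deferred, upstream_failed, and skipped, and more trigger rules), but it shows the shape of both checks:

```python
from enum import Enum


class TIState(str, Enum):
    NONE = "none"
    SCHEDULED = "scheduled"
    QUEUED = "queued"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"
    UP_FOR_RETRY = "up_for_retry"


# Allowed forward transitions in this simplified model.
ALLOWED = {
    TIState.NONE: {TIState.SCHEDULED},
    TIState.SCHEDULED: {TIState.QUEUED},
    TIState.QUEUED: {TIState.RUNNING},
    TIState.RUNNING: {TIState.SUCCESS, TIState.FAILED, TIState.UP_FOR_RETRY},
    TIState.UP_FOR_RETRY: {TIState.SCHEDULED},  # retry re-enters scheduling
}


def can_transition(current: TIState, target: TIState) -> bool:
    return target in ALLOWED.get(current, set())


def trigger_rule_met(rule: str, upstream: list[TIState]) -> bool:
    """Decide whether a task may be scheduled given its upstream states."""
    terminal = {TIState.SUCCESS, TIState.FAILED}
    if rule == "all_success":
        return all(s == TIState.SUCCESS for s in upstream)
    if rule == "one_success":
        return any(s == TIState.SUCCESS for s in upstream)
    if rule == "all_done":
        return all(s in terminal for s in upstream)
    raise ValueError(f"unknown trigger rule: {rule}")
```

Persisting each transition to the metadata database is what makes the pipeline durable across scheduler restarts.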
Step 4: Executor Dispatch
The scheduler dispatches queued tasks to the configured executor. The LocalExecutor runs tasks as subprocesses on the scheduler machine. The CeleryExecutor distributes tasks to worker nodes via a message broker. The KubernetesExecutor launches each task as an individual Kubernetes pod. The ExecutorLoader dynamically loads the appropriate executor class based on configuration.
Key considerations:
- The executor interface (BaseExecutor) provides a standard API for task lifecycle management
- Worker health is monitored through heartbeat mechanisms
- The LocalExecutor manages a pool of worker processes with configurable parallelism
- Multiple executor types can be configured simultaneously in Airflow 3.x
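The executor contract is essentially "accept queued work, run it with bounded parallelism, report terminal states." The toy class below imitates a LocalExecutor with a thread pool and subprocesses; it is not the BaseExecutor API (the class and method names here are invented), but it illustrates the queue/sync lifecycle the scheduler drives.

```python
import subprocess
from concurrent.futures import Future, ThreadPoolExecutor


class MiniLocalExecutor:
    """Toy local executor: each queued command runs as a subprocess,
    with concurrency bounded by a fixed parallelism."""

    def __init__(self, parallelism: int = 4):
        self._pool = ThreadPoolExecutor(max_workers=parallelism)
        self._futures: dict[str, Future] = {}

    def queue_command(self, key: str, argv: list[str]) -> None:
        """Submit a task command; key identifies the task instance."""
        self._futures[key] = self._pool.submit(
            lambda: subprocess.run(argv, capture_output=True).returncode)

    def sync(self) -> dict[str, str]:
        """Collect terminal states for finished tasks, as the scheduler
        does each loop iteration via the executor's event buffer."""
        results = {}
        for key, fut in list(self._futures.items()):
            if fut.done():
                results[key] = "success" if fut.result() == 0 else "failed"
                del self._futures[key]
        return results

    def end(self) -> None:
        self._pool.shutdown(wait=True)
```

Celery and Kubernetes executors implement the same contract but dispatch over a broker or the Kubernetes API instead of local subprocesses.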
Step 5: Task Execution via Task Execution Interface
In Airflow 3.x, tasks execute through the Task Execution Interface (TEI), which decouples the task runtime from the scheduler. The Task SDK provides a lightweight environment for task execution, separate from airflow-core. Tasks communicate with the API server rather than directly accessing the metadata database. This architecture enables multi-language task execution (Python Task SDK, Go Task SDK) and improved isolation.
Key considerations:
- The Task SDK is a separate package (apache-airflow-task-sdk) with minimal dependencies
- Tasks communicate via the TEI protocol, not direct database access
- The Go SDK implements the TEI for running tasks natively in Go
- Serialization/deserialization of task parameters is handled by the SDK layer
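The key architectural shift is that a task runtime reports state by calling the API server rather than writing rows itself. The sketch below only builds a request payload (no network call) to show the shape of that decoupling; the endpoint path and field names are hypothetical, not the actual TEI wire format.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class TIStateUpdate:
    """State report a task runtime might send to the API server instead of
    touching the metadata database directly (fields are illustrative)."""
    dag_id: str
    task_id: str
    run_id: str
    state: str
    timestamp: str


def build_state_update(dag_id: str, task_id: str, run_id: str,
                       state: str) -> tuple[str, bytes]:
    """Return a (hypothetical endpoint path, JSON body) pair."""
    update = TIStateUpdate(
        dag_id=dag_id, task_id=task_id, run_id=run_id, state=state,
        timestamp=datetime.now(timezone.utc).isoformat())
    path = f"/execution/task-instances/{dag_id}/{run_id}/{task_id}/state"
    return path, json.dumps(asdict(update)).encode()
```

Because the contract is an HTTP API plus a serialization scheme rather than a Python ORM, a Go task runtime can implement it just as well as the Python Task SDK.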
Step 6: Trigger and Deferrable Operator Handling
The TriggererJob manages asynchronous trigger instances for deferrable operators. When a task defers, it creates a Trigger record and the task enters the deferred state. The triggerer runs trigger classes as async coroutines, monitoring for trigger events. When a trigger fires, the triggerer creates a TriggerEvent, and the scheduler resumes the deferred task from where it left off.
Key considerations:
- Deferrable operators free up executor slots while waiting for external conditions
- Triggers run as async Python coroutines within the triggerer process
- The triggerer supports health monitoring and graceful shutdown
- Multiple triggerer instances can run for high availability
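The defer/resume cycle hinges on triggers being async generators that eventually yield an event. The sketch below, a stdlib-only analogue of a datetime trigger plus a minimal triggerer loop (class and function names are illustrative), shows why one process can await thousands of conditions concurrently:

```python
import asyncio
from datetime import datetime, timezone


class DateTimeTrigger:
    """Toy trigger: fires once a target moment has passed, analogous to a
    deferrable sensor waiting on a timestamp."""

    def __init__(self, moment: datetime):
        self.moment = moment

    async def run(self):
        # Real triggers await I/O here (sockets, API polls) without
        # occupying an executor slot.
        while datetime.now(timezone.utc) < self.moment:
            await asyncio.sleep(0.01)
        yield {"payload": self.moment.isoformat()}  # the trigger event


async def triggerer_loop(triggers):
    """Run all triggers concurrently; collect the first event from each.
    The real triggerer would hand each event back to the scheduler, which
    resumes the deferred task."""
    async def first_event(trig):
        async for event in trig.run():
            return event
    return await asyncio.gather(*(first_event(t) for t in triggers))
```

While a task is deferred its executor slot is released; only the tiny coroutine lives on inside the triggerer until the event fires.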
Step 7: Callback Execution and Event Notification
Upon task and DAG state transitions, Airflow executes configured callbacks and notifies registered listeners. Callbacks (on_success_callback, on_failure_callback, on_retry_callback) run in the context of the worker. The listener framework (pluggy-based) provides hooks for component lifecycle events, task instance state changes, DAG run transitions, and asset events, enabling observability integrations and custom automation.
Key considerations:
- Listeners implement hook specifications defined in the shared listeners package
- The ListenerManager coordinates notification delivery to all registered listeners
- Slow listeners, and listeners that raise exceptions, are handled gracefully so they cannot block the scheduling loop
- OpenTelemetry traces span the full execution lifecycle from scheduling to completion
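The failure-isolation property of the listener framework can be shown in a few lines. This is a simplified stand-in for the pluggy-based ListenerManager (the class and hook names below are illustrative): each registered listener is invoked in turn, and an exception in one listener is logged and swallowed rather than propagated into the scheduling loop.

```python
import logging


class MiniListenerManager:
    """Notify registered listeners of task-instance state changes,
    isolating failures so one bad listener cannot break the others."""

    def __init__(self):
        self._listeners = []
        self._log = logging.getLogger("listener-manager")

    def register(self, listener) -> None:
        self._listeners.append(listener)

    def on_task_instance_state_change(self, ti_key: str, state: str) -> None:
        for listener in self._listeners:
            hook = getattr(listener, "on_task_instance_state_change", None)
            if hook is None:
                continue  # listener does not implement this hook
            try:
                hook(ti_key, state)
            except Exception:
                # Log and continue: a throwing listener must not block
                # delivery to the remaining listeners.
                self._log.exception("listener %r failed; continuing", listener)
```

An observability integration would register such a listener to forward state changes to a metrics or tracing backend.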