Principle:Astronomer Astronomer cosmos Async Operator Execution
| Knowledge Sources | |
|---|---|
| Domains | Async_Execution, Deferrable_Operators |
| Last Updated | 2026-02-07 17:00 GMT |
Overview
Async Operator Execution defers long-running data transformations to an external service so that orchestrator worker slots are freed while the operation completes asynchronously.
Description
What it is: Async Operator Execution is a principle in which dbt model executions are translated into native, deferrable cloud-provider operations (such as BigQuery jobs) rather than being run as blocking subprocess calls on an Airflow worker. The operator compiles the dbt model into SQL, submits that SQL to the target data platform's asynchronous job API, and then defers -- releasing the worker slot entirely. When the remote job completes, an Airflow trigger fires and the operator resumes briefly to record results.
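The submit, defer, and resume cycle described above hinges on a trigger that polls the remote job without occupying a worker. The following is a minimal standard-library sketch of such a polling loop, not Cosmos or Airflow code: `FakeJobApi`, `poll_job`, `wait_for_job`, and the shape of the event dictionary are hypothetical stand-ins, and a real trigger would query the warehouse's job API instead.

```python
import asyncio


class FakeJobApi:
    """Hypothetical stand-in for a warehouse job API; reports DONE after N polls."""

    def __init__(self, polls_until_done: int):
        self.remaining = polls_until_done
        self.calls = 0

    def status(self, job_id: str) -> str:
        self.calls += 1
        if self.remaining > 0:
            self.remaining -= 1
            return "RUNNING"
        return "DONE"


async def poll_job(api: FakeJobApi, job_id: str, poll_interval: float):
    """Async generator that yields a single event once the job has finished."""
    while True:
        if api.status(job_id) == "DONE":
            yield {"status": "success", "job_id": job_id}
            return
        # Sleeping hands control back to the event loop, so one process can
        # watch many jobs instead of dedicating a worker slot to each.
        await asyncio.sleep(poll_interval)


async def wait_for_job(api: FakeJobApi, job_id: str, poll_interval: float = 0.01) -> dict:
    """Return the first (and only) event produced by the polling loop."""
    async for event in poll_job(api, job_id, poll_interval):
        return event
    raise RuntimeError("polling loop ended without an event")


api = FakeJobApi(polls_until_done=3)
event = asyncio.run(wait_for_job(api, "job_123"))
print(event, api.calls)
```

In this toy model the worker is only needed before the loop starts (to submit) and after the event arrives (to record results); everything in between runs on the event loop.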
What problem it solves: In the default local execution model, each running dbt model occupies an Airflow worker slot (or Celery worker process) for the full duration of the query -- which can range from seconds to hours for large transformations. This creates a bottleneck: the number of concurrent dbt models is bounded by the worker pool size, even though the workers are merely waiting on I/O from the data warehouse. Async execution decouples submission from completion, allowing a small pool of workers to manage a much larger number of concurrent transformations.
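A back-of-the-envelope calculation makes the bottleneck concrete. The sketch below rests on simplifying assumptions that are not from the Cosmos documentation: all models are independent and take the same time, the warehouse runs every submitted query concurrently, and trigger latency and the resume callback are ignored. The function names and the numbers are illustrative only.

```python
import math


def blocking_wall_time(models: int, slots: int, query_s: int) -> int:
    """Each model holds a slot for the whole query, so models run in waves."""
    return math.ceil(models / slots) * query_s


def deferred_wall_time(models: int, slots: int, query_s: int, submit_s: int) -> int:
    """Each model holds a slot only while submitting; queries overlap in the warehouse."""
    waves = math.ceil(models / slots)
    # The last wave finishes submitting after waves * submit_s, then its query runs.
    return waves * submit_s + query_s


# 40 models, 8 worker slots, 10-minute queries, 5 seconds to submit a job.
print(blocking_wall_time(40, 8, 600))     # 3000
print(deferred_wall_time(40, 8, 600, 5))  # 625
```

Under these assumptions the same eight slots finish the batch in roughly one query duration rather than five, because the slots are only busy during submission.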
Where it fits: This principle is activated by setting execution_mode = ExecutionMode.AIRFLOW_ASYNC. It currently targets BigQuery via DbtRunAirflowAsyncBigqueryOperator, with the architecture designed to extend to other database backends. It relies on Airflow's deferrable operator framework (Airflow 2.2+) and the corresponding cloud provider packages (e.g., apache-airflow-providers-google). The principle complements -- rather than replaces -- synchronous execution modes; it is specifically beneficial for I/O-bound, long-running model materializations.
Usage
Use Async Operator Execution when:
- dbt models target a cloud data warehouse that exposes an asynchronous job submission API (currently BigQuery).
- Typical model run times are long enough that holding a worker slot is wasteful (minutes to hours).
- The Airflow deployment uses a constrained worker pool (Celery or Kubernetes executor) and you need higher concurrency without adding infrastructure.
- You want rendered SQL to be visible in the Airflow UI as a template field for debugging and auditing.
Avoid this principle when:
- The target database does not have a supported async provider operator.
- Models are fast enough that deferral overhead (trigger polling, XCom round-trips for SQL) exceeds the worker-slot time saved.
- You require dbt features that depend on the dbt runtime context at execution time (e.g., incremental logic that cannot be fully compiled ahead of execution).
Theoretical Basis
The async execution principle is built on three interlocking mechanisms:
1. Factory pattern for profile-based class resolution. Because each database backend requires a different Airflow provider operator for deferrable execution, Cosmos uses a factory (DbtRunAirflowAsyncFactoryOperator) that inspects the user's ProfileConfig at DAG parse time to determine the target database type (e.g., bigquery). It then dynamically loads the corresponding concrete async operator class from cosmos.operators._asynchronous.{profile_type} using _create_async_operator_class. The factory rewrites its own base classes at runtime (__bases__ reassignment) so that the resulting operator instance is a proper subclass of the resolved provider operator. This avoids static coupling to any single cloud provider while preserving Airflow's operator type introspection.
2. Compile-then-submit separation. The async workflow splits the traditional dbt "compile and execute" step into two distinct phases. A setup task (the producer) runs dbt locally to compile all models into raw SQL. The compiled SQL is then passed to the downstream async operator tasks either via XCom (compressed with zlib and base64-encoded to minimise metadata database bloat) or via a remote object store path (remote_target_path). Each async operator task (the consumer) retrieves its SQL, wraps it in the provider's job configuration (e.g., BigQuery's configuration.query), and submits it through the provider operator with deferrable=True, so that completion is tracked by Airflow's trigger-based polling rather than by a blocked worker.
3. Deferrable operator lifecycle. Once the provider operator has submitted the job, it raises a TaskDeferred exception. Airflow then marks the task as deferred, releases the worker slot, and hands the trigger to the triggerer process, which polls the cloud API at a configured interval. When the job completes, the trigger fires and Airflow schedules a brief execute_complete callback on a worker. This callback records the job ID, stores rendered template fields (compiled SQL, project, dataset) for UI visibility, and optionally registers dataset events for Airflow's data-aware scheduling.
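The three mechanisms above can be sketched together in a toy model. This is a standard-library illustration under stated assumptions, not the Cosmos implementation: `BaseOperator`, `ToyBigQueryAsyncOperator`, `AsyncFactoryOperator`, the `ASYNC_OPERATORS` registry, `encode_sql`, `decode_sql`, and this `TaskDeferred` class are all hypothetical stand-ins. In particular, the registry replaces the module lookup by profile type, and the fake job ID replaces a real job submission.

```python
import base64
import zlib


def encode_sql(sql: str) -> str:
    """Producer side: compress, then base64-encode so the payload is plain text."""
    return base64.b64encode(zlib.compress(sql.encode("utf-8"))).decode("ascii")


def decode_sql(payload: str) -> str:
    """Consumer side: reverse the encoding to recover the compiled SQL."""
    return zlib.decompress(base64.b64decode(payload)).decode("utf-8")


class TaskDeferred(Exception):
    """Toy stand-in for Airflow's deferral signal; carries callback name and job id."""

    def __init__(self, method_name: str, job_id: str):
        super().__init__(method_name)
        self.method_name = method_name
        self.job_id = job_id


class BaseOperator:
    """Toy stand-in for an operator base class."""

    def __init__(self, task_id: str):
        self.task_id = task_id


class ToyBigQueryAsyncOperator(BaseOperator):
    """Hypothetical provider operator: submit the job, then defer."""

    def execute(self, encoded_sql: str) -> None:
        self.sql = decode_sql(encoded_sql)
        job_id = f"job_{self.task_id}"  # a real operator would call the job API here
        raise TaskDeferred("execute_complete", job_id)

    def execute_complete(self, event: dict) -> str:
        return event["job_id"]


# Hypothetical registry keyed by profile type (an assumption of this sketch).
ASYNC_OPERATORS = {"bigquery": ToyBigQueryAsyncOperator}


class AsyncFactoryOperator(BaseOperator):
    """Rebinds its own base class to the operator matching the profile type."""

    def __init__(self, task_id: str, profile_type: str):
        if profile_type not in ASYNC_OPERATORS:
            raise ValueError(f"no async operator for profile type {profile_type!r}")
        # Class-wide side effect: every instance of the factory now inherits
        # from the resolved operator class.
        AsyncFactoryOperator.__bases__ = (ASYNC_OPERATORS[profile_type],)
        super().__init__(task_id=task_id)


op = AsyncFactoryOperator(task_id="stg_orders", profile_type="bigquery")
payload = encode_sql("select 1 as id")
try:
    op.execute(payload)
except TaskDeferred as deferred:
    # What a trigger would report once the remote job has finished.
    event = {"job_id": deferred.job_id}
    result = getattr(op, deferred.method_name)(event)
print(isinstance(op, ToyBigQueryAsyncOperator), op.sql, result)
```

Reassigning `__bases__` keeps the caller-facing class fixed while letting the concrete behaviour depend on the profile, at the cost of mutating the class for all instances; a sketch that returned a new subclass per profile would avoid that side effect but would change the operator's type name.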