Principle:Astronomer Astronomer cosmos TaskGroup Rendering
Overview
An orchestration pattern for embedding a dbt project as a task group within a larger Airflow DAG alongside non-dbt tasks. This enables mixed pipelines where dbt transformations coexist with data extraction, validation, notification, and other arbitrary Airflow operators.
Description
Unlike full DAG rendering (where one dbt project equals one Airflow DAG), TaskGroup rendering embeds dbt tasks as a nested group within an existing DAG. This distinction has several important implications:
- Mixed pipelines -- dbt transformations can be preceded by extraction operators (e.g., an S3 sensor or API pull) and followed by non-dbt operations (e.g., data quality checks, Slack notifications, or downstream system triggers).
- Multiple dbt subgraphs -- several DbtTaskGroup instances can coexist within the same DAG, each rendering a different subset of the dbt project (e.g., staging models vs. mart models) or even different dbt projects entirely.
- Shared scheduling -- all tasks, dbt and non-dbt alike, share a single DAG schedule and execution context.
- Visual grouping -- in the Airflow UI, dbt tasks are visually collapsed under the task group, keeping the DAG view clean while allowing drill-down into individual dbt tasks.
The rendering mechanics are identical to DbtDag -- graph loading, node-to-operator mapping, and dependency wiring all work the same way. The only difference is that the generated tasks are scoped to a TaskGroup rather than being the top-level tasks of a standalone DAG.
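The mapping and wiring steps described above can be pictured with a small plain-Python model. This is only a sketch under stated assumptions, not Cosmos' implementation: the Task class, the render_group function, and the three model names are hypothetical, and real rendering produces Airflow operators rather than plain objects.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Stand-in for an Airflow task: an ID plus the IDs of its upstream tasks."""
    task_id: str
    upstream: set[str] = field(default_factory=set)


def render_group(group_id: str, nodes: dict[str, list[str]]) -> dict[str, Task]:
    """Map each dbt node to one group-scoped task, then wire dependencies.

    `nodes` maps a (hypothetical) dbt model name to the models it depends on.
    Every parent must itself be a key of `nodes` in this simplified model.
    """
    # Node-to-task mapping: one task per node, with the group ID as prefix.
    tasks = {name: Task(task_id=f"{group_id}.{name}") for name in nodes}
    # Dependency wiring: mirror each dbt edge as a task-level edge.
    for name, parents in nodes.items():
        for parent in parents:
            tasks[name].upstream.add(tasks[parent].task_id)
    return tasks


# Hypothetical project: two staging models feeding one mart model.
project = {
    "stg_customers": [],
    "stg_orders": [],
    "customers": ["stg_customers", "stg_orders"],
}
rendered = render_group("dbt_task_group", project)
```

The same function with no prefixing would model the full-DAG case; only the scoping of the generated tasks changes, not the graph itself.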
Usage
Use the TaskGroup rendering pattern when:
- dbt is one part of a larger pipeline -- extraction, loading, or post-processing steps surround the dbt transformations.
- Multiple dbt selections are needed within the same DAG -- e.g., running staging models first, then mart models, with a validation step in between.
- Different dbt projects need to be orchestrated together in a single scheduling unit.
- You need to wire dbt outputs to downstream operators using standard Airflow >> syntax.
For simpler cases where the entire pipeline is dbt-only, prefer the DAG rendering pattern (DbtDag) for reduced boilerplate.
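To make the >> wiring concrete, the sketch below models what it means to place a task group between two ordinary tasks: the upstream task is connected to the group's root tasks, and the group's leaf tasks are connected to the downstream task. It is a plain-Python illustration rather than Airflow or Cosmos code; the task IDs (extract, notify, and the dbt_task_group members) and the helper functions are assumptions made for the example.

```python
def roots(edges: dict[str, set[str]]) -> set[str]:
    """Tasks with no upstream task inside the group."""
    return {task for task, parents in edges.items() if not parents}


def leaves(edges: dict[str, set[str]]) -> set[str]:
    """Tasks that no other task inside the group depends on."""
    all_parents = set().union(*edges.values())
    return set(edges) - all_parents


# Hypothetical group: task ID -> upstream task IDs inside the same group.
dbt_group = {
    "dbt_task_group.stg_customers": set(),
    "dbt_task_group.stg_orders": set(),
    "dbt_task_group.customers": {
        "dbt_task_group.stg_customers",
        "dbt_task_group.stg_orders",
    },
}

# Whole-pipeline edges, starting from a copy of the group's internal edges.
pipeline = {task: set(parents) for task, parents in dbt_group.items()}
pipeline["extract"] = set()
pipeline["notify"] = set()

# extract >> dbt_group: every root of the group waits for the extract task.
for task in roots(dbt_group):
    pipeline[task].add("extract")

# dbt_group >> notify: the notify task waits for every leaf of the group.
pipeline["notify"] |= leaves(dbt_group)
```

From the outside the group behaves like a single node, while its internal edges are left untouched.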
Theoretical Basis
Airflow TaskGroups provide visual and logical grouping of tasks without creating separate DAGs. They are a namespace mechanism -- by default, tasks inside a group have their task IDs prefixed with the group ID (e.g., dbt_task_group.model_customers) but otherwise behave identically to top-level tasks.
By rendering dbt subgraphs as TaskGroups, the orchestration achieves:
- Single scheduling unit -- one DAG schedule controls the entire pipeline, avoiding the complexity of cross-DAG dependencies.
- Clear delineation -- dbt work is visually and logically separated from non-dbt work.
- Composability -- TaskGroups can be wired to upstream and downstream tasks using standard Airflow dependency syntax, treating the entire dbt subgraph as a single unit from the outside.
- Reusability -- the same DbtTaskGroup configuration can be embedded in multiple DAGs.
The internal dependency structure within the TaskGroup still mirrors the dbt project's DAG, preserving topological ordering of model execution.
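The ordering guarantee can be checked with Python's standard graphlib module. The sketch below is illustrative only: the model names and their dependencies are hypothetical, and Airflow's scheduler, not graphlib, decides the actual run order. It shows that any order derived from the dbt edges places each model after the models it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical dbt dependencies: model -> models it selects from.
dbt_edges = {
    "stg_customers": set(),
    "stg_orders": set(),
    "customers": {"stg_customers", "stg_orders"},
    "customer_ltv": {"customers"},
}

# static_order() yields each node only after all of its predecessors.
order = list(TopologicalSorter(dbt_edges).static_order())

# Position of each model in the computed order, for checking the invariant.
position = {model: index for index, model in enumerate(order)}
parents_first = all(
    position[parent] < position[model]
    for model, parents in dbt_edges.items()
    for parent in parents
)
```

The two staging models have no edge between them, so their relative order is not fixed; in Airflow such tasks are free to run in parallel.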
Related Pages
Implementation:Astronomer_Astronomer_cosmos_DbtTaskGroup_Init