
Workflow:Astronomer Cosmos TaskGroup dbt integration

From Leeroopedia



Knowledge Sources
Domains Data_Engineering, dbt, Airflow, Orchestration
Last Updated 2026-02-07 17:00 GMT

Overview

End-to-end process for embedding a dbt project as an Airflow TaskGroup within a larger DAG, enabling composition of dbt transformations with non-dbt tasks.

Description

This workflow covers how to use the DbtTaskGroup class to render a dbt project as a collapsible TaskGroup inside an existing Airflow DAG. Unlike DbtDag, which creates a standalone DAG, DbtTaskGroup allows dbt models to coexist with other Airflow operators (e.g., data extraction tasks, notification operators, quality checks) in a single DAG. Multiple DbtTaskGroup instances can be used in the same DAG, each targeting different dbt projects, selections, or configurations. DbtTaskGroup inherits from both Airflow's TaskGroup and Cosmos's DbtToAirflowConverter.

This pattern supports node selection and exclusion via RenderConfig, allowing fine-grained control over which parts of the dbt project appear in each TaskGroup. Combined with ExecutionConfig, users can mix execution modes (local, virtualenv, Kubernetes) across different TaskGroups within the same DAG.

Usage

Execute this workflow when you need to integrate dbt transformations as part of a larger data pipeline that includes non-dbt tasks. Common scenarios include: pre-processing tasks before dbt runs (e.g., data ingestion), post-processing tasks after dbt completes (e.g., sending notifications, triggering downstream systems), or running multiple dbt project subsets in parallel within the same DAG.

Execution Steps

Step 1: Create the parent DAG context

Define a standard Airflow DAG using either the context manager pattern (with DAG(...)) or the @dag decorator. This parent DAG holds the overall schedule, start date, and default arguments. All DbtTaskGroup instances and other Airflow operators will be registered within this DAG context.

Key considerations:

  • The DAG schedule applies to the entire pipeline, not just the dbt portion
  • Default args (retries, retry_delay) can be overridden per TaskGroup
  • The DAG context must be active when DbtTaskGroup is instantiated

Step 2: Configure profile and execution settings

Create ProfileConfig and ExecutionConfig objects. The profile config maps an Airflow connection to a dbt profile (or references a profiles.yml file). The execution config specifies the execution mode and invocation mode. These can be shared across multiple DbtTaskGroup instances or customized per group.

Key considerations:

  • InvocationMode.SUBPROCESS or InvocationMode.DBT_RUNNER can be set on ExecutionConfig
  • Shared configs reduce boilerplate when multiple TaskGroups use the same database connection
  • Each TaskGroup can override settings if needed (e.g., different schemas or targets)

Step 3: Define node selection for each TaskGroup

Use RenderConfig with select and exclude parameters to control which dbt nodes appear in each TaskGroup. Cosmos supports dbt's selection syntax, including path selectors (path:seeds/), tag selectors (tag:nightly), config selectors (config.materialized:incremental), and graph operators (+model_name+). Setting enable_mock_profile=False allows dbt to reuse its partial-parse cache, speeding up DAG parsing.

Key considerations:

  • Selection happens at DAG parse time, not runtime
  • Path selectors are relative to the dbt project root
  • Multiple TaskGroups can select different subsets of the same dbt project
  • enable_mock_profile=False enables dbt partial parsing for faster DAG loads

Step 4: Instantiate DbtTaskGroup instances

Create one or more DbtTaskGroup instances inside the DAG context, each with a unique group_id. Pass the ProjectConfig, ProfileConfig, RenderConfig, ExecutionConfig, and operator_args. Each TaskGroup parses its portion of the dbt graph and generates the corresponding Airflow tasks.

Key considerations:

  • Each TaskGroup must have a unique group_id within the DAG
  • ProjectConfig can use dbt_vars to pass different variables to each group
  • operator_args like install_deps are applied to all tasks within the group
  • TaskGroups appear as collapsible groups in the Airflow UI graph view
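Putting the pieces together, a DbtTaskGroup instantiation might be sketched as below. This is a configuration fragment, not a runnable standalone script: the project path and dbt_vars are illustrative, and profile_config, execution_config, and staging_render are assumed to be defined as in the earlier steps.

```python
from datetime import datetime

from airflow import DAG
from cosmos import DbtTaskGroup, ProjectConfig

with DAG(dag_id="elt_pipeline", schedule="@daily", start_date=datetime(2024, 1, 1)):
    staging = DbtTaskGroup(
        group_id="staging",                        # must be unique within the DAG
        project_config=ProjectConfig(
            "/usr/local/airflow/dbt/jaffle_shop",  # assumed dbt project location
            dbt_vars={"env": "prod"},              # per-group dbt variables
        ),
        profile_config=profile_config,
        render_config=staging_render,
        execution_config=execution_config,
        operator_args={"install_deps": True},      # applied to every task in the group
    )
```

A second DbtTaskGroup with a different group_id, RenderConfig, or dbt_vars can be added in the same context to cover another slice of the project.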

Step 5: Wire TaskGroups with other operators

Define upstream and downstream dependencies between the DbtTaskGroup instances and other Airflow operators using the >> and << operators. This establishes the execution order for the complete pipeline, ensuring that dbt transformations run at the correct point in the workflow.

Key considerations:

  • DbtTaskGroup can be used directly in dependency chains (pre_task >> dbt_group >> post_task)
  • Multiple TaskGroups can run in parallel if placed in a list (pre_task >> [group_a, group_b] >> post_task)
  • Each TaskGroup's internal dependencies are maintained automatically by Cosmos

GitHub URL

Workflow Repository