Principle:Astronomer Astronomer cosmos DAG Rendering
Overview
An orchestration pattern for rendering an entire dbt project as a standalone Airflow DAG with automatic task generation and dependency wiring. This is the "one DAG = one dbt project" integration pattern provided by the astronomer-cosmos library.
Description
The DAG rendering principle converts a dbt project's dependency graph into a native Airflow DAG. The conversion follows a deterministic process:
- Graph discovery -- the dbt project is parsed (via dbt ls, manifest JSON, or another loading strategy) to identify all nodes and their inter-dependencies.
- Node-to-operator mapping -- each dbt node (model, test, seed, snapshot) is mapped to an Airflow operator whose type is determined by the selected execution mode (local, Docker, Kubernetes, virtual environment, etc.).
- Dependency wiring -- the upstream/downstream relationships defined in the dbt DAG are reproduced as Airflow task dependencies, ensuring correct execution ordering.
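The three steps above can be sketched in plain Python. This is only an illustration of the idea, not how the library is implemented: the MANIFEST dictionary is a hypothetical, heavily simplified stand-in for a parsed dbt project, and TaskSpec, COMMAND_FOR_TYPE, and render are names invented for this example.

```python
from dataclasses import dataclass, field

# Step 1 (graph discovery) is assumed to have produced this hypothetical,
# simplified structure: node id -> resource type and parent node ids.
MANIFEST = {
    "seed.shop.raw_orders": {"resource_type": "seed", "depends_on": []},
    "model.shop.stg_orders": {"resource_type": "model", "depends_on": ["seed.shop.raw_orders"]},
    "model.shop.orders": {"resource_type": "model", "depends_on": ["model.shop.stg_orders"]},
    "snapshot.shop.orders_snapshot": {"resource_type": "snapshot", "depends_on": ["model.shop.orders"]},
}

# Illustrative mapping from dbt resource type to the dbt command a task runs.
COMMAND_FOR_TYPE = {"seed": "seed", "model": "run", "snapshot": "snapshot"}


@dataclass
class TaskSpec:
    """Stand-in for an Airflow operator: an id, a dbt command, and upstream ids."""
    task_id: str
    command: str
    upstream: set = field(default_factory=set)


def render(manifest):
    tasks = {}
    task_id_of = {}
    # Step 2 (node-to-operator mapping): one task per dbt node.
    for node_id, node in manifest.items():
        command = COMMAND_FOR_TYPE[node["resource_type"]]
        task_id = f"{node_id.split('.')[-1]}_{command}"
        task_id_of[node_id] = task_id
        tasks[task_id] = TaskSpec(task_id, command)
    # Step 3 (dependency wiring): copy every dbt edge onto the tasks.
    for node_id, node in manifest.items():
        for parent_id in node["depends_on"]:
            tasks[task_id_of[node_id]].upstream.add(task_id_of[parent_id])
    return tasks


tasks = render(MANIFEST)
for spec in tasks.values():
    print(spec.task_id, "<-", sorted(spec.upstream))
```

In a real Airflow DAG the upstream sets would become task dependencies and the command would be carried by an operator chosen according to the execution mode.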
The result is a fully self-contained Airflow DAG where:
- Each dbt model becomes a run task (and optionally a test task).
- Seeds become seed tasks.
- Snapshots become snapshot tasks.
- Dependencies between tasks mirror the ref() and source() relationships in dbt.
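As a rough sketch of the "run task and optionally a test task" expansion, the function below turns each model into a run task and, when requested, a test task that follows it. It assumes that downstream models wait for the upstream model's test task when tests are enabled; that ordering is an assumption made for this example, and expand_models and its test_after_each flag are hypothetical names.

```python
def expand_models(parents, test_after_each=True):
    """Map model name -> upstream model names into task id -> upstream task ids."""

    def last_task(model):
        # The task that downstream models wait for (assumption: the test, if any).
        return f"{model}_test" if test_after_each else f"{model}_run"

    upstream = {}
    for model, deps in parents.items():
        upstream[f"{model}_run"] = {last_task(dep) for dep in deps}
        if test_after_each:
            upstream[f"{model}_test"] = {f"{model}_run"}
    return upstream


# Hypothetical two-model project: orders selects from ref('stg_orders').
with_tests = expand_models({"stg_orders": [], "orders": ["stg_orders"]})
without_tests = expand_models({"stg_orders": [], "orders": ["stg_orders"]}, test_after_each=False)
```

With tests enabled, orders_run depends on stg_orders_test; without them, it depends directly on stg_orders_run.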
This pattern is suitable when the entire dbt project (or a filtered subset) should be represented as a single, dedicated Airflow pipeline with its own schedule, SLA, and monitoring configuration.
Usage
Use the DAG rendering pattern when:
- You want a dedicated Airflow DAG that represents your entire dbt project.
- The dbt project is the sole workload in the pipeline -- no non-dbt tasks are needed.
- You want the simplest integration pattern with minimal boilerplate.
- You need one schedule controlling the entire dbt execution.
This is the recommended starting point for teams adopting Cosmos. For more complex pipelines that mix dbt with non-dbt tasks, consider the TaskGroup rendering pattern instead.
Theoretical Basis
dbt projects define a directed acyclic graph (DAG) of data transformations. Each node represents a SQL model, test, seed, or snapshot, and edges represent data dependencies declared via ref() and source() macros.
This principle performs a structural mapping from the dbt DAG to the Airflow task DAG:
- Nodes in the dbt graph map to operators in the Airflow graph.
- Edges in the dbt graph map to task dependencies (>> relationships) in Airflow.
- The mapping preserves topological ordering -- a task only executes after all of its upstream dependencies have completed.
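The ordering property can be checked on a small example with the standard-library graphlib module. The graph below is hypothetical, and the code only demonstrates what "preserves topological ordering" means; it does not use Airflow or Cosmos.

```python
from graphlib import TopologicalSorter

# Hypothetical dbt graph: node -> set of nodes it depends on.
dbt_graph = {
    "raw_orders": set(),
    "raw_customers": set(),
    "stg_orders": {"raw_orders"},
    "stg_customers": {"raw_customers"},
    "customer_orders": {"stg_orders", "stg_customers"},
}

# Each dbt edge written the way it would be wired in Airflow: parent >> child.
edges = sorted(
    f"{parent} >> {child}"
    for child, parents in dbt_graph.items()
    for parent in parents
)

# One valid execution order; any valid order places parents before children.
order = list(TopologicalSorter(dbt_graph).static_order())
position = {node: index for index, node in enumerate(order)}
ordering_preserved = all(
    position[parent] < position[child]
    for child, parents in dbt_graph.items()
    for parent in parents
)

print("\n".join(edges))
print(order, ordering_preserved)
```

The exact order returned may vary among nodes that do not depend on each other, but every parent always appears before its children.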
By performing this one-to-one structural mapping (with a model optionally expanding into a run task and a test task), the Airflow DAG inherits the dbt project's execution semantics while gaining orchestration capabilities:
- Scheduling -- run the dbt project on a cron or dataset-triggered schedule.
- Retries -- individual model failures can be retried without re-running the entire project.
- Monitoring -- each model is a visible task in the Airflow UI with its own logs and status.
- Parallelism -- independent branches of the dbt DAG execute concurrently up to Airflow's parallelism limits.
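To illustrate the parallelism point, the sketch below groups a hypothetical dbt graph into waves of tasks that could run at the same time under a cap on concurrent tasks. It is a simplified model: a real Airflow scheduler starts each task as soon as its upstream tasks finish and a slot is free, rather than in lockstep waves, and concurrent_batches and max_parallel are names invented for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical dbt graph: node -> set of nodes it depends on.
dbt_graph = {
    "raw_orders": set(),
    "raw_customers": set(),
    "stg_orders": {"raw_orders"},
    "stg_customers": {"raw_customers"},
    "customer_orders": {"stg_orders", "stg_customers"},
}


def concurrent_batches(graph, max_parallel):
    """Group nodes into successive batches of at most max_parallel runnable nodes."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    waiting = []  # runnable nodes that have not been given a slot yet
    while sorter.is_active():
        waiting = sorted(waiting + list(sorter.get_ready()))
        batch, waiting = waiting[:max_parallel], waiting[max_parallel:]
        batches.append(batch)
        sorter.done(*batch)  # finishing a batch unblocks its downstream nodes
    return batches


wide = concurrent_batches(dbt_graph, max_parallel=2)
narrow = concurrent_batches(dbt_graph, max_parallel=1)
print(wide)
print(narrow)
```

With two slots, the two independent branches advance side by side and the project finishes in three waves; with one slot, the same five tasks run strictly one after another.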
Related Pages
Implementation:Astronomer_Astronomer_cosmos_DbtDag_Init