Principle:Astronomer Astronomer cosmos DAG Rendering

From Leeroopedia


Overview

An orchestration pattern for rendering an entire dbt project as a standalone Airflow DAG with automatic task generation and dependency wiring. This is the "one DAG = one dbt project" integration pattern provided by the astronomer-cosmos library.

Description

The DAG rendering principle converts a dbt project's dependency graph into a native Airflow DAG. The conversion follows a deterministic process:

  1. Graph discovery -- the dbt project is parsed (via dbt ls, manifest JSON, or another loading strategy) to identify all nodes and their inter-dependencies.
  2. Node-to-operator mapping -- each dbt node (model, test, seed, snapshot) is mapped to an Airflow operator whose type is determined by the selected execution mode (local, Docker, Kubernetes, virtual environment, etc.).
  3. Dependency wiring -- the upstream/downstream relationships defined in the dbt DAG are reproduced as Airflow task dependencies, ensuring correct execution ordering.
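The three steps above can be sketched in plain Python without Airflow or Cosmos installed. This is an illustration only: `DISCOVERED_NODES` is a hand-written stand-in for the output of graph discovery, and `SketchTask` is a hypothetical placeholder for an Airflow operator, not a class from either library.

```python
from dataclasses import dataclass, field

# Step 1 (graph discovery): hypothetical result of parsing a dbt project.
# Node names, resource types, and dependencies are invented for illustration.
DISCOVERED_NODES = {
    "raw_orders": {"resource_type": "seed", "depends_on": []},
    "stg_orders": {"resource_type": "model", "depends_on": ["raw_orders"]},
    "orders": {"resource_type": "model", "depends_on": ["stg_orders"]},
}

# dbt sub-command used for each resource type.
DBT_COMMANDS = {"model": "run", "seed": "seed", "snapshot": "snapshot", "test": "test"}


@dataclass
class SketchTask:
    """Stand-in for an Airflow operator (not a real Airflow class)."""

    task_id: str
    command: str
    upstream: set = field(default_factory=set)


def render(nodes):
    # Step 2 (node-to-operator mapping): one task per dbt node.
    tasks = {}
    for name, meta in nodes.items():
        subcommand = DBT_COMMANDS[meta["resource_type"]]
        tasks[name] = SketchTask(task_id=name, command=f"dbt {subcommand} --select {name}")
    # Step 3 (dependency wiring): copy each dbt edge onto the tasks.
    for name, meta in nodes.items():
        for parent in meta["depends_on"]:
            tasks[name].upstream.add(parent)
    return tasks


tasks = render(DISCOVERED_NODES)
```

In a real deployment the execution mode would decide which operator class replaces `SketchTask`; the wiring step stays the same regardless.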

The result is a fully self-contained Airflow DAG where:

  • Each dbt model becomes a run task (and optionally a test task).
  • Seeds become seed tasks.
  • Snapshots become snapshot tasks.
  • Dependencies between tasks mirror the ref() and source() relationships in dbt.
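As a rough sketch of how the list above translates into tasks, the function below expands each node into one or two task IDs and returns the resulting edges. The `<name>_run` / `<name>_test` naming and the choice to hang downstream nodes off the test task are assumptions made for illustration; the exact task IDs and test placement depend on the Cosmos version and its configuration.

```python
def expand(nodes, with_tests=True):
    """Return (upstream_task_id, downstream_task_id) edges for a dbt graph."""
    first, last, edges = {}, {}, []
    for name, meta in nodes.items():
        kind = meta["resource_type"]
        if kind == "model":
            first[name] = last[name] = f"{name}_run"
            if with_tests:
                # Optional test task runs right after the model's run task.
                last[name] = f"{name}_test"
                edges.append((first[name], last[name]))
        else:
            # Seeds and snapshots map to a single task each.
            first[name] = last[name] = f"{name}_{kind}"
    for name, meta in nodes.items():
        for parent in meta["depends_on"]:
            # Downstream work starts only after the parent's last task.
            edges.append((last[parent], first[name]))
    return edges


# Hypothetical project: a seed, a model built from it, and a snapshot of the model.
NODES = {
    "raw_customers": {"resource_type": "seed", "depends_on": []},
    "customers": {"resource_type": "model", "depends_on": ["raw_customers"]},
    "customers_history": {"resource_type": "snapshot", "depends_on": ["customers"]},
}

edges_with_tests = expand(NODES)
edges_without_tests = expand(NODES, with_tests=False)
```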

This pattern is suitable when the entire dbt project (or a filtered subset) should be represented as a single, dedicated Airflow pipeline with its own schedule, SLA, and monitoring configuration.

Usage

Use the DAG rendering pattern when:

  • You want a dedicated Airflow DAG that represents your entire dbt project.
  • The dbt project is the sole workload in the pipeline -- no non-dbt tasks are needed.
  • You want the simplest integration pattern with minimal boilerplate.
  • You need one schedule controlling the entire dbt execution.

This is the recommended starting point for teams adopting Cosmos. For more complex pipelines that mix dbt with non-dbt tasks, consider the TaskGroup rendering pattern instead.
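A minimal sketch of the pattern with Cosmos's `DbtDag` might look like the following. The project path, profile name, target name, DAG ID, and schedule are all hypothetical placeholders, and the keyword arguments reflect the Cosmos API as commonly documented; check them against the installed version before relying on them. The import is placed inside the function only so the snippet can be read and loaded without Airflow or Cosmos present.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical locations; replace with the paths used in your deployment.
DBT_PROJECT_PATH = Path("/usr/local/airflow/dbt/my_project")
PROFILES_YML = DBT_PROJECT_PATH / "profiles.yml"


def build_dbt_dag():
    """Build one Airflow DAG that represents the whole dbt project."""
    # Lazy import: requires astronomer-cosmos and Airflow to be installed.
    from cosmos import DbtDag, ProfileConfig, ProjectConfig

    return DbtDag(
        dag_id="my_dbt_project",  # hypothetical DAG ID
        project_config=ProjectConfig(dbt_project_path=DBT_PROJECT_PATH),
        profile_config=ProfileConfig(
            profile_name="my_profile",  # hypothetical profile name
            target_name="dev",  # hypothetical target
            profiles_yml_filepath=PROFILES_YML,
        ),
        # Standard Airflow DAG arguments: one schedule for the whole project.
        schedule="@daily",
        start_date=datetime(2024, 1, 1),
        catchup=False,
    )
```

In an actual DAG file the result would be assigned to a module-level variable (for example `dag = build_dbt_dag()`) so that Airflow's DAG discovery can find it.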

Theoretical Basis

dbt projects define a directed acyclic graph (DAG) of data transformations. Each node represents a SQL model, test, seed, or snapshot, and edges represent data dependencies declared via ref() and source() macros.

This principle performs a structural mapping from the dbt DAG to the Airflow task DAG:

  • Nodes in the dbt graph map to operators in the Airflow graph.
  • Edges in the dbt graph map to task dependencies (>> relationships) in Airflow.
  • The mapping preserves topological ordering -- under Airflow's default trigger rule, a task executes only after all of its upstream dependencies have completed successfully.
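The ordering property can be checked with the standard library's `graphlib`. The sketch below uses an invented four-model graph and verifies that every parent appears before its children in a valid execution order; it illustrates the property itself, not how Cosmos or Airflow compute it.

```python
from graphlib import TopologicalSorter

# Hypothetical dbt graph: each node maps to the set of nodes it depends on.
DEPENDS_ON = {
    "stg_customers": set(),
    "stg_orders": set(),
    "customer_orders": {"stg_customers", "stg_orders"},
    "revenue_report": {"customer_orders"},
}

# One valid execution order that respects every edge.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
position = {node: index for index, node in enumerate(order)}

# True when every parent is scheduled before each of its children.
ordering_preserved = all(
    position[parent] < position[child]
    for child, parents in DEPENDS_ON.items()
    for parent in parents
)
```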

Because this mapping preserves the structure of the dbt graph, the Airflow DAG inherits the dbt project's execution semantics while gaining orchestration capabilities:

  • Scheduling -- run the dbt project on a cron or dataset-triggered schedule.
  • Retries -- individual model failures can be retried without re-running the entire project.
  • Monitoring -- each model is a visible task in the Airflow UI with its own logs and status.
  • Parallelism -- independent branches of the dbt DAG execute concurrently up to Airflow's parallelism limits.
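To make the parallelism point concrete, the sketch below groups an invented graph into "waves" of tasks that could run at the same time under a hypothetical `max_parallel` cap. This is a deliberate simplification: Airflow's scheduler starts tasks individually as slots free up rather than in lockstep waves, and its real limits come from settings such as parallelism and pool sizes.

```python
from graphlib import TopologicalSorter


def execution_waves(depends_on, max_parallel):
    """Group tasks into batches of at most max_parallel runnable tasks."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves, ready = [], []
    while sorter.is_active():
        # Collect tasks whose upstream dependencies have all finished.
        ready.extend(sorted(sorter.get_ready()))
        # Run as many as the cap allows; the rest wait for the next wave.
        wave, ready = ready[:max_parallel], ready[max_parallel:]
        waves.append(wave)
        sorter.done(*wave)
    return waves


# Hypothetical graph: three independent staging models feed one mart.
GRAPH = {
    "stg_customers": set(),
    "stg_orders": set(),
    "stg_payments": set(),
    "customer_orders": {"stg_customers", "stg_orders", "stg_payments"},
}

waves_cap_2 = execution_waves(GRAPH, max_parallel=2)
waves_cap_3 = execution_waves(GRAPH, max_parallel=3)
```

With a cap of 3 the staging models share one wave; with a cap of 2 one of them is pushed to a second wave before the mart can start.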

Related Pages

Implementation:Astronomer_Astronomer_cosmos_DbtDag_Init
