Principle:Astronomer Astronomer cosmos Render Configuration

From Leeroopedia


Metadata

Field | Value
Page Type | Principle
Repository | astronomer-cosmos
Domains | Data_Engineering, Configuration, DAG_Rendering
Related Implementation | Implementation:Astronomer_Astronomer_cosmos_RenderConfig_Init
Knowledge Sources | dbt Node Selection, astronomer-cosmos

Overview

Render Configuration is a configuration principle for controlling how a dbt project graph is parsed, filtered, and rendered into orchestration tasks. It governs the transformation from a dbt project's dependency graph (nodes and edges) into an orchestration DAG's task structure (operators and dependencies).

This principle addresses the decisions that must be made when converting a dbt project into orchestration tasks: which nodes to include or exclude, how tests are grouped relative to their parent models, which parsing method to use for discovering the project graph, and how sources and datasets are represented.

Description

A dbt project defines a directed acyclic graph (DAG) of nodes: models, tests, seeds, snapshots, and sources. When an orchestration system renders this graph into executable tasks, several configuration decisions shape the resulting task topology:

Node Selection and Filtering

Not every node in a dbt project should necessarily become an orchestration task. The render configuration provides mechanisms for:

  • Selection (select): Specifying which nodes to include using dbt's node selection syntax. Supports path selectors (path:models/staging), tag selectors (tag:daily), config selectors (config.materialized:table), and graph operators (+model_name, model_name+).
  • Exclusion (exclude): Specifying which nodes to exclude from the selected set, using the same selector syntax.
  • Named Selectors (selector): Referencing pre-defined selector definitions from selectors.yml.
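The select-then-exclude order can be sketched in a few lines of Python. This is a simplified illustration, not Cosmos or dbt code: the `Node` class, the `matches` and `render_subset` helpers, and the sample project are all hypothetical, and only `tag:` and `path:` selectors are handled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a dbt node; not a Cosmos or dbt class."""
    name: str
    path: str
    tags: frozenset = field(default_factory=frozenset)


def matches(node: Node, selector: str) -> bool:
    """Evaluate one simplified selector: 'tag:<tag>' or 'path:<directory>'."""
    method, _, value = selector.partition(":")
    if method == "tag":
        return value in node.tags
    if method == "path":
        return node.path.startswith(value.rstrip("/") + "/")
    raise ValueError(f"unsupported selector in this sketch: {selector}")


def render_subset(nodes, select=(), exclude=()):
    """Keep nodes matching any select entry, then drop nodes matching any exclude entry."""
    selected = [n for n in nodes if not select or any(matches(n, s) for s in select)]
    return [n for n in selected if not any(matches(n, s) for s in exclude)]


# Hypothetical three-model project.
nodes = [
    Node("stg_orders", "models/staging/stg_orders.sql", frozenset({"daily"})),
    Node("stg_users", "models/staging/stg_users.sql", frozenset({"hourly"})),
    Node("fct_sales", "models/marts/fct_sales.sql", frozenset({"daily"})),
]

subset = render_subset(nodes, select=["path:models/staging"], exclude=["tag:hourly"])
print([n.name for n in subset])  # ['stg_orders']
```

With no `select` entries the sketch keeps every node, so `exclude` alone can be used to trim a full project.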

Test Behavior

dbt tests can be organized relative to their parent models in several ways:

  • After Each: Each test runs immediately after the model it tests. This provides the fastest feedback but creates more tasks.
  • After All: All tests run after all models have completed. This reduces task count but delays test feedback.
  • None: Tests are omitted entirely from the orchestration DAG. Useful when tests are run separately (e.g., in CI).
  • Build: Tests are incorporated into the dbt build command alongside their parent model.

Load Method (Parsing Strategy)

The method used to discover the dbt project graph determines parsing performance and requirements:

  • Automatic: The system selects the best available method based on the environment, for example using a manifest when one is supplied and falling back to invoking dbt otherwise.
  • dbt ls: Invokes dbt ls to list project nodes. Requires dbt to be installed and accessible.
  • Manifest: Reads a pre-compiled manifest.json file. Fast but requires the manifest to be available.
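  • Custom: Uses a parser that reads the project files directly instead of invoking dbt or reading a manifest. It needs no dbt installation at parse time but may not cover every dbt feature, so it suits specialized use cases.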
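One way to picture an "automatic" choice is a fallback chain over what the environment offers. The sketch below is an illustrative policy only: the function name `choose_load_method`, the returned labels, and the fallback order are assumptions made for this example, not the library's implementation.

```python
import shutil
from pathlib import Path
from typing import Optional


def choose_load_method(manifest_path: Optional[Path], dbt_executable: Optional[str]) -> str:
    """Pick a parsing strategy from what is available (assumed fallback order)."""
    if manifest_path is not None and manifest_path.is_file():
        return "manifest"  # a pre-compiled manifest.json is on disk
    if dbt_executable and shutil.which(dbt_executable):
        return "dbt_ls"  # the dbt executable can be found on PATH
    return "custom"  # neither a manifest nor dbt is available


# The result depends on the machine this runs on.
print(choose_load_method(Path("target/manifest.json"), "dbt"))
```

Making the choice explicit in configuration, rather than relying on a fallback like this, keeps DAG parsing behavior the same across environments.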

Source and Dataset Rendering

Controls how dbt sources are represented in the orchestration DAG:

  • Whether to emit Airflow datasets for cross-DAG dependency management.
  • How source nodes are rendered (as upstream dependencies, as standalone tasks, or omitted).
  • Source pruning behavior for removing unnecessary source nodes from the rendered graph.

Usage

Use render configuration when you need fine-grained control over which dbt nodes appear in the orchestration DAG and how they are organized. Common scenarios include:

  • Partial Project Rendering: When a single dbt project serves multiple Airflow DAGs, each DAG renders only a subset of the project using select/exclude filters.
  • Test Optimization: When the test execution strategy must be tuned to the deployment: fast feedback in development (after each) versus reduced overhead in production (after all or none).
  • Performance Tuning: When DAG parsing performance is critical, choosing the appropriate load method (manifest for speed, dbt ls for accuracy) optimizes the render phase.
  • Multi-DAG Coordination: When multiple Airflow DAGs share dbt models, dataset emission lets a downstream DAG be scheduled on the datasets that an upstream DAG updates, giving data-aware cross-DAG triggering.
  • Resource Optimization: When the full dbt project graph creates too many Airflow tasks, selective rendering keeps the DAG manageable.

Theoretical Basis

dbt's node selection syntax provides the filtering language that underpins render configuration. The syntax supports several selector types:

Selector | Syntax | Example | Description
Path | path: | path:models/staging | Select nodes by filesystem path
Tag | tag: | tag:daily | Select nodes by tag
Config | config. | config.materialized:table | Select nodes by config value
Source | source: | source:raw.orders | Select source nodes
Graph (upstream) | + (prefix) | +model_name | Include the node and its upstream dependencies
Graph (downstream) | + (suffix) | model_name+ | Include the node and its downstream dependents
Graph (n-depth) | n+ (prefix) | 2+model_name | Include the node and up to n levels of ancestors
Union | space | tag:daily tag:hourly | Nodes matching ANY selector
Intersection | comma | tag:daily,path:models/staging | Nodes matching ALL selectors
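The graph operators amount to a bounded traversal of the dependency graph. As a minimal sketch, assuming a hypothetical `parents` mapping from each node to its direct upstream nodes, `+model_name` and `2+model_name` could be evaluated like this (`select_upstream` is an illustrative helper, not a dbt or Cosmos function):

```python
def select_upstream(parents, node, depth=None):
    """Return `node` plus its ancestors, up to `depth` levels (None means unlimited)."""
    found = set()
    frontier = {node}
    level = 0
    while frontier and (depth is None or level < depth):
        # Step one level up; skip nodes that were already collected.
        frontier = {p for n in frontier for p in parents.get(n, ())} - found
        found |= frontier
        level += 1
    return {node} | found


# Hypothetical project: raw_orders -> stg_orders -> int_sales -> fct_sales
parents = {
    "fct_sales": ["int_sales"],
    "int_sales": ["stg_orders", "stg_users"],
    "stg_orders": ["raw_orders"],
    "stg_users": [],
}

print(sorted(select_upstream(parents, "fct_sales")))     # like +fct_sales
print(sorted(select_upstream(parents, "fct_sales", 2)))  # like 2+fct_sales
```

In this example the depth-limited call leaves out `raw_orders`, which sits three levels above `fct_sales`.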

Test behavior determines the task topology of the rendered DAG. The choice between "after each" and "after all" represents a fundamental tradeoff:

  • After each creates a topology where test tasks are interleaved with model tasks, providing fine-grained failure isolation. If a test fails, downstream models are blocked immediately.
  • After all creates a topology where model tasks form one layer and test tasks form a second layer, reducing total task count but delaying failure detection.
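The two topologies can be compared by generating task edges for a small project. This is a sketch under simplifying assumptions: one test task per model, a single combined test task named `all_tests` for the "after all" case, and hypothetical names throughout. It does not reproduce how any particular library builds its tasks.

```python
def build_edges(model_deps, tests, behavior):
    """Return a set of (upstream, downstream) task edges.

    model_deps maps each model to its parent models;
    tests maps a model to the name of its test task.
    """
    edges = set()
    if behavior == "after_each":
        for model, test in tests.items():
            edges.add((model, test))  # each test follows its own model
        for model, parents in model_deps.items():
            for parent in parents:
                # A child waits for the parent's test when the parent has one.
                edges.add((tests.get(parent, parent), model))
    elif behavior == "after_all":
        for model, parents in model_deps.items():
            for parent in parents:
                edges.add((parent, model))  # models depend only on models
        all_parents = {p for ps in model_deps.values() for p in ps}
        for leaf in set(model_deps) - all_parents:
            edges.add((leaf, "all_tests"))  # one test layer after the last models
    else:
        raise ValueError(f"unknown behavior: {behavior}")
    return edges


model_deps = {"stg_orders": [], "fct_sales": ["stg_orders"]}
tests = {"stg_orders": "stg_orders_test", "fct_sales": "fct_sales_test"}

print(sorted(build_edges(model_deps, tests, "after_each")))
print(sorted(build_edges(model_deps, tests, "after_all")))
```

In the "after each" result, `fct_sales` sits downstream of `stg_orders_test`, so a failing test on `stg_orders` blocks it. In the "after all" result, `fct_sales` runs as soon as `stg_orders` finishes and testing happens once at the end.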

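The load method determines the parsing strategy, which affects both performance and requirements. Manifest-based parsing only reads and deserializes an existing manifest.json, so its cost scales with the size of that file and no dbt process is started. dbt ls-based parsing invokes dbt's own parser, so its cost grows with project size and complexity and includes the overhead of starting dbt. The tradeoff is between speed and freshness: a manifest may be stale if it was not regenerated after the last project change, while dbt ls reflects the project state at parse time.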
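To make the manifest path concrete, the sketch below extracts nodes and edges from a manifest using only the standard library. The embedded JSON is a trimmed, hypothetical example that keeps just the `nodes`, `resource_type`, and `depends_on.nodes` fields of a dbt manifest; a real manifest.json carries many more fields, and `load_graph` is an illustrative helper.

```python
import json

# Trimmed, hypothetical manifest content for illustration only.
MANIFEST_JSON = """
{
  "nodes": {
    "model.shop.stg_orders": {
      "resource_type": "model",
      "depends_on": {"nodes": ["source.shop.raw.orders"]}
    },
    "model.shop.fct_sales": {
      "resource_type": "model",
      "depends_on": {"nodes": ["model.shop.stg_orders"]}
    },
    "test.shop.not_null_fct_sales_id": {
      "resource_type": "test",
      "depends_on": {"nodes": ["model.shop.fct_sales"]}
    }
  }
}
"""


def load_graph(manifest_text):
    """Map each node's unique id to its resource type and direct upstream ids."""
    manifest = json.loads(manifest_text)
    graph = {}
    for unique_id, node in manifest.get("nodes", {}).items():
        graph[unique_id] = {
            "resource_type": node.get("resource_type"),
            "depends_on": list(node.get("depends_on", {}).get("nodes", [])),
        }
    return graph


graph = load_graph(MANIFEST_JSON)
models = sorted(uid for uid, n in graph.items() if n["resource_type"] == "model")
print(models)  # ['model.shop.fct_sales', 'model.shop.stg_orders']
```

No dbt process is involved at any point, which is why this path is fast. It also shows why the result is only as fresh as the file being read.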
