Workflow:Dagster io Dagster Dbt Integration
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, SQL_Transformations, dbt |
| Last Updated | 2026-02-10 12:00 GMT |
Overview
End-to-end process for integrating a dbt project with Dagster using the component-based approach, covering data ingestion, dbt asset creation, incremental models, testing, and project management.
Description
This workflow demonstrates how to orchestrate a dbt project within Dagster using the modern component system (DbtProjectComponent). It covers ingesting upstream data into a database, configuring dbt as a Dagster component via YAML, automatically converting dbt models into Dagster assets with full lineage, implementing incremental materialization strategies, mapping dbt tests to Dagster asset checks, and managing the combined project structure. The component approach replaces manual Python-based dbt integration with declarative configuration.
Usage
Execute this workflow when you have an existing dbt project that you want to orchestrate with Dagster, or when building a new data transformation layer that benefits from dbt's SQL modeling approach combined with Dagster's asset orchestration. This is the recommended pattern for teams already using dbt who want to add scheduling, observability, and upstream/downstream asset integration.
Execution Steps
Step 1: Data Ingestion
Create Dagster assets that load raw data into the database tables that serve as sources for dbt models. These upstream assets represent the extract layer and produce the tables that dbt's source definitions reference. Each source table maps to a separate Dagster asset.
Key considerations:
- Upstream ingestion assets must produce the tables referenced in dbt's sources.yml
- Dagster establishes lineage between ingestion assets and dbt source nodes when the ingestion asset keys match the asset keys derived for the dbt sources
- The ingestion layer can use any data loading approach (API calls, file imports, database copies)
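As an illustration of the ingestion step, the following is a minimal sketch that loads raw rows into a source table. It uses Python's built-in sqlite3 as a stand-in for the project's real database, and the table name `raw_customers`, the CSV contents, and the function name `load_raw_customers` are hypothetical. In a Dagster project, a function body like this would typically sit inside an asset definition, one per source table.

```python
import csv
import io
import sqlite3

# Hypothetical raw data; in practice this could come from an API call,
# a file import, or a database copy.
RAW_CUSTOMERS_CSV = """id,name
1,Ada
2,Grace
3,Edsger
"""


def load_raw_customers(conn: sqlite3.Connection, csv_text: str) -> int:
    """Replace the raw_customers table with the CSV rows; return the row count."""
    rows = [(int(r["id"]), r["name"]) for r in csv.DictReader(io.StringIO(csv_text))]
    conn.execute("DROP TABLE IF EXISTS raw_customers")
    conn.execute("CREATE TABLE raw_customers (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO raw_customers VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


# sqlite3 in-memory database used only as a stand-in for the real warehouse.
conn = sqlite3.connect(":memory:")
loaded = load_raw_customers(conn, RAW_CUSTOMERS_CSV)
```

The table written here is the kind of object a dbt source definition would then point at.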
Step 2: dbt Component Configuration
Configure the dbt project as a Dagster component using DbtProjectComponent and YAML configuration. The component scaffolding command generates the necessary configuration files. The YAML-based approach automatically discovers dbt models, sources, and tests and converts them into Dagster assets.
Key considerations:
- Use the dg scaffold defs command to generate the dbt component boilerplate
- The defs.yaml file declares the dbt project path and any configuration overrides
- The component system handles dbt profile and target configuration
- dbt-duckdb adapter enables local development without external database dependencies
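Discovery of models, sources, and tests works from dbt's compiled manifest (manifest.json), in which models and tests appear under `nodes` and sources under `sources`. The sketch below counts resources by type in a hand-written, heavily trimmed manifest; the project name `shop` and all node names are hypothetical, and the component's actual translation logic is not reproduced here.

```python
from collections import Counter

# Hypothetical, heavily trimmed stand-in for dbt's target/manifest.json.
manifest = {
    "nodes": {
        "model.shop.stg_customers": {"resource_type": "model"},
        "model.shop.customers": {"resource_type": "model"},
        "test.shop.unique_customers_id": {"resource_type": "test"},
    },
    "sources": {
        "source.shop.raw.raw_customers": {"resource_type": "source"},
    },
}


def count_resources(manifest: dict) -> Counter:
    """Count manifest entries by resource type (model, test, source, ...)."""
    counts = Counter(node["resource_type"] for node in manifest["nodes"].values())
    counts["source"] = len(manifest["sources"])
    return counts


counts = count_resources(manifest)
```

Running a count like this against a real manifest is a quick way to sanity-check that the number of assets and checks shown in Dagster is in the expected range.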
Step 3: dbt Asset Definition
Verify that dbt models are correctly represented as Dagster assets with proper dependency edges. Each dbt model (staging, intermediate, mart) becomes a Dagster asset. Dependencies between models (via ref() calls in SQL) translate to asset dependency edges. Upstream Dagster ingestion assets connect to dbt source nodes.
Key considerations:
- dbt model dependencies (ref) automatically create Dagster asset edges
- dbt source definitions connect to upstream Dagster ingestion assets
- Asset metadata includes the SQL query, materialization type, and dbt tags
- The Dagster UI displays the complete lineage graph across ingestion and transformation layers
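To make the mapping from ref() and source() calls to dependency edges concrete, here is a rough sketch that extracts upstream names from model SQL with regular expressions. This is an assumption-laden simplification: dbt resolves these calls by rendering Jinja rather than by pattern matching, and the model names and SQL below are hypothetical.

```python
import re

# Hypothetical model SQL keyed by model name.
MODELS = {
    "stg_customers": "select id, name from {{ source('raw', 'raw_customers') }}",
    "stg_orders": "select id, customer_id from {{ source('raw', 'raw_orders') }}",
    "customers": """
        select c.id, count(o.id) as order_count
        from {{ ref('stg_customers') }} c
        left join {{ ref('stg_orders') }} o on o.customer_id = c.id
        group by c.id
    """,
}

REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")
SOURCE_PATTERN = re.compile(
    r"\{\{\s*source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}"
)


def upstream_of(sql: str) -> list[str]:
    """Return the sorted names of models and sources referenced by the SQL."""
    refs = REF_PATTERN.findall(sql)
    sources = [f"{src}.{table}" for src, table in SOURCE_PATTERN.findall(sql)]
    return sorted(refs + sources)


# Each key depends on every name in its list: these are the asset edges.
edges = {name: upstream_of(sql) for name, sql in MODELS.items()}
```

In this example the mart model `customers` depends on both staging models, and each staging model depends on one source table, which is where an upstream ingestion asset would attach.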
Step 4: Incremental Model Support
Configure dbt incremental models to process only new or changed data since the last run. Incremental models use dbt's is_incremental() macro to conditionally filter data. In Dagster, these models can be paired with partition definitions for time-based incremental processing.
Key considerations:
- Incremental models reduce processing time for large datasets
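- The is_incremental() macro returns true only when the target table already exists, the model is materialized as incremental, and the run is not a full refresh, so the model's filter applies only on incremental runs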
- Dagster's materialization metadata tracks when each model was last refreshed
- Full refresh can be triggered from the Dagster UI when needed
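The incremental behavior can be emulated outside dbt to show the control flow. The sketch below uses sqlite3 as a stand-in database; the tables `raw_events` and `events`, the column `loaded_at`, and the function names are hypothetical. The `incremental` flag mirrors the conditions under which is_incremental() is true: the target table exists and no full refresh was requested.

```python
import sqlite3


def run_events_model(conn: sqlite3.Connection, full_refresh: bool = False) -> str:
    """Build the events table from raw_events; return which mode was used."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'events'"
    ).fetchone() is not None
    incremental = exists and not full_refresh

    if incremental:
        # Only append rows newer than what the target already holds.
        conn.execute(
            "INSERT INTO events SELECT id, loaded_at FROM raw_events "
            "WHERE loaded_at > (SELECT COALESCE(MAX(loaded_at), '') FROM events)"
        )
        return "incremental"

    # First run or full refresh: rebuild the table from scratch.
    conn.execute("DROP TABLE IF EXISTS events")
    conn.execute("CREATE TABLE events AS SELECT id, loaded_at FROM raw_events")
    return "full"


def row_count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_events (id INTEGER, loaded_at TEXT)")
conn.executemany(
    "INSERT INTO raw_events VALUES (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-02")],
)

runs = [(run_events_model(conn), row_count(conn))]           # first build
conn.execute("INSERT INTO raw_events VALUES (3, '2024-01-03')")
runs.append((run_events_model(conn), row_count(conn)))       # picks up only row 3
runs.append((run_events_model(conn, full_refresh=True), row_count(conn)))
```

The second run appends only the new row instead of reprocessing the whole table, which is the saving that matters on large datasets.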
Step 5: dbt Test Integration
Map dbt tests to Dagster asset checks for unified data quality monitoring. dbt generic tests (unique, not_null, accepted_values, relationships; formerly called schema tests) and custom tests are automatically converted to Dagster asset checks. Test results are visible alongside asset materialization status in the Dagster UI.
Key considerations:
- dbt tests automatically become Dagster asset checks through the component integration
- Both generic (schema) tests and custom (singular) data tests are supported
- Test failures appear as failed asset checks in the Dagster UI
- Tests run as part of the dbt build process during asset materialization
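A dbt test is a query that selects failing rows, and the test passes when no rows come back. The sketch below emulates the not_null and unique tests against a sqlite3 stand-in database and collects pass/fail results in the shape an asset check would report. The table, columns, check names, and helper functions are hypothetical, and the SQL is a simplified approximation of what dbt generates.

```python
import sqlite3


def not_null_failures(conn: sqlite3.Connection, table: str, column: str) -> int:
    """Count rows where the column is NULL."""
    return conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    ).fetchone()[0]


def unique_failures(conn: sqlite3.Connection, table: str, column: str) -> int:
    """Count distinct non-null values that appear more than once."""
    return conn.execute(
        f"SELECT COUNT(*) FROM (SELECT {column} FROM {table} "
        f"WHERE {column} IS NOT NULL GROUP BY {column} HAVING COUNT(*) > 1)"
    ).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (2, "c@example.com")],
)

# True means the check passed (zero failing rows).
check_results = {
    "not_null_customers_id": not_null_failures(conn, "customers", "id") == 0,
    "not_null_customers_email": not_null_failures(conn, "customers", "email") == 0,
    "unique_customers_id": unique_failures(conn, "customers", "id") == 0,
}
```

Here the duplicate id and the missing email each produce a failed check, while the not_null check on id passes.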
Step 6: Project Management
Organize and maintain the combined Dagster-dbt project structure for team collaboration. This includes managing dbt project files alongside Dagster definitions, coordinating environment configurations, and establishing development workflows that support both dbt-only and Dagster-orchestrated execution modes.
Key considerations:
- The dbt project lives within the Dagster project directory structure
- Environment variables configure database connections for different environments
- Developers can run dbt commands directly or through Dagster materialization
- Version control should track both Dagster definitions and dbt project files together
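Environment-driven configuration can be sketched as a small lookup that picks a database location per target. The variable names `DBT_TARGET` and `DUCKDB_DATABASE`, the default paths, and the function `database_path` are all assumptions made for illustration; in dbt itself, the equivalent lookup is usually done with the env_var() function in profiles.yml.

```python
import os
from typing import Mapping, Optional

# Hypothetical per-target defaults for a local DuckDB file.
DEFAULT_DATABASES = {
    "dev": "data/dev.duckdb",
    "prod": "data/prod.duckdb",
}


def database_path(environ: Optional[Mapping[str, str]] = None) -> str:
    """Resolve the database path from environment variables.

    DBT_TARGET selects the environment (default "dev");
    DUCKDB_DATABASE, if set, overrides the per-target default.
    """
    env = os.environ if environ is None else environ
    target = env.get("DBT_TARGET", "dev")
    if target not in DEFAULT_DATABASES:
        raise ValueError(f"Unknown target: {target!r}")
    return env.get("DUCKDB_DATABASE", DEFAULT_DATABASES[target])


dev_path = database_path({})
prod_path = database_path({"DBT_TARGET": "prod"})
override_path = database_path({"DBT_TARGET": "prod", "DUCKDB_DATABASE": "/tmp/ci.duckdb"})
```

Keeping the lookup in one place means dbt run directly and dbt run through Dagster resolve the same connection for a given environment.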