Workflow:PrefectHQ Prefect Dbt Model Orchestration
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Analytics, dbt |
| Last Updated | 2026-02-09 22:00 GMT |
Overview
End-to-end process for orchestrating a complete dbt project lifecycle (deps, seed, run, test) with Prefect, using the prefect-dbt integration for enhanced logging, failure handling, and automatic event emission.
Description
This workflow wraps the standard dbt Core CLI lifecycle in Prefect tasks to add automatic retries, structured logging, and full observability. It downloads a dbt project, creates the necessary database connection profile, and then runs the dbt commands in sequence: install dependencies, load seed data, execute model transformations, and run data tests. The prefect-dbt integration (PrefectDbtRunner) provides native dbt execution with enhanced log-level mapping and automatic Prefect event emission for each dbt node status change.
Key outputs:
- Materialised dbt models in the target database (e.g., DuckDB, Snowflake)
- dbt test results confirming data quality
- Full execution trace of every dbt node in the Prefect UI
Scope:
- From a dbt project source (local or remote) through the complete dbt lifecycle
- Handles project setup, connection profile creation, and all standard dbt commands
Usage
Use this workflow when dbt transformations run as part of a scheduled data pipeline and you need automatic retries, per-node visibility in the Prefect UI, and event-driven monitoring of dbt node execution. It is suitable for both local development with DuckDB and production deployments against cloud data warehouses.
Execution Steps
Step 1: Download and Cache dbt Project
Obtain the dbt project source files. This may involve downloading a ZIP archive from a remote repository, cloning a Git repository, or referencing a local directory. The project is cached locally to speed up subsequent runs.
Key considerations:
- Downloading a ZIP archive keeps execution self-contained and removes the need for a Git client
- Caches the project directory to avoid redundant downloads
- Task retries handle transient network failures during download
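A minimal sketch of the download-and-cache step, using only the Python standard library. The function name `ensure_dbt_project`, the `cache_dir` layout, and the rule of treating any extracted `dbt_project.yml` as a cache hit are assumptions made for illustration, not part of the original workflow.

```python
import urllib.request
import zipfile
from pathlib import Path


def _find_project_root(cache_dir: Path):
    """Return the shallowest directory under cache_dir that holds dbt_project.yml."""
    if not cache_dir.exists():
        return None
    found = list(cache_dir.rglob("dbt_project.yml"))
    if not found:
        return None
    # Installed packages may ship their own dbt_project.yml deeper in the tree,
    # so prefer the file closest to the cache root.
    return min(found, key=lambda p: len(p.parts)).parent


def ensure_dbt_project(archive_url: str, cache_dir: Path) -> Path:
    """Download and extract a ZIP archive only when no cached project exists."""
    cache_dir = Path(cache_dir)
    cached = _find_project_root(cache_dir)
    if cached is not None:
        return cached  # cache hit: skip the download entirely

    cache_dir.mkdir(parents=True, exist_ok=True)
    archive_path = cache_dir / "project.zip"
    # urlopen raises URLError on network problems; a task-level retry can absorb that.
    with urllib.request.urlopen(archive_url) as response:
        archive_path.write_bytes(response.read())
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(cache_dir)
    archive_path.unlink()

    project_root = _find_project_root(cache_dir)
    if project_root is None:
        raise FileNotFoundError(f"no dbt_project.yml found in {archive_url}")
    return project_root
```

Because the cache check runs before any network call, a second run against the same `cache_dir` returns immediately even if the remote archive is unavailable.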
Step 2: Create Database Connection Profile
Generate the profiles.yml configuration file that dbt needs to connect to the target database. This step writes the connection parameters (database type, path, thread count) to the project directory.
Key considerations:
- Keeps the workflow self-contained by generating profiles in-place
- Supports different targets (DuckDB for local, Snowflake for production)
- Overwrites any existing profiles.yml so every run starts from a known, correctly formatted profile
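The profile step can be sketched as a small file writer. This example targets DuckDB and uses a plain string template to avoid a YAML dependency. The helper name `write_duckdb_profile` and its default values are hypothetical. The profile name passed in has to match the `profile:` key in the project's `dbt_project.yml`.

```python
from pathlib import Path

# Layout of a dbt profile with a single DuckDB output named "dev".
PROFILE_TEMPLATE = """\
{profile_name}:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: "{database_path}"
      threads: {threads}
"""


def write_duckdb_profile(
    project_dir: Path,
    profile_name: str,
    database_path: str = "dev.duckdb",  # assumed default, relative to where dbt runs
    threads: int = 4,                   # assumed default
) -> Path:
    """Write (or overwrite) profiles.yml inside the project directory."""
    profiles_path = Path(project_dir) / "profiles.yml"
    content = PROFILE_TEMPLATE.format(
        profile_name=profile_name,
        database_path=Path(database_path).as_posix(),
        threads=threads,
    )
    # write_text replaces any existing file, so repeated runs stay consistent.
    profiles_path.write_text(content, encoding="utf-8")
    return profiles_path
```

Writing the file into the project directory means dbt can be pointed at it with `--profiles-dir` set to that same directory. A Snowflake target would need a different set of keys and is not shown here.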
Step 3: Install dbt Dependencies
Run dbt deps to download any package dependencies declared in the project. This ensures all macros and models from external packages are available before execution.
Key considerations:
- Safe to run even when no external packages are declared
- Retries handle transient network failures during package download
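To make the retry behaviour concrete, the sketch below shows the semantics in plain Python: call a function, and on failure wait and try again a bounded number of times. In the actual workflow this is handled by Prefect's task-level `retries` and `retry_delay_seconds` options, so `with_retries` and `dbt_deps` here are hypothetical helpers for illustration only.

```python
import subprocess
import time
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(fn: Callable[[], T], retries: int = 2, delay_seconds: float = 5.0) -> T:
    """Call fn once, then up to `retries` more times if it raises."""
    attempt = 0
    while True:
        try:
            return fn()
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            time.sleep(delay_seconds)


def dbt_deps(project_dir: Path) -> None:
    """Run `dbt deps` through the dbt CLI; raises CalledProcessError on a non-zero exit."""
    subprocess.run(["dbt", "deps", "--project-dir", str(project_dir)], check=True)


# Example usage (requires the dbt CLI on PATH):
# with_retries(lambda: dbt_deps(Path("my_project")), retries=2, delay_seconds=5)
```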
Step 4: Load Seed Data
Run dbt seed to load CSV seed files into the target database as tables. Seeds provide static reference data that models can join against.
Key considerations:
- Safe to run even when no seed files exist
- Creates or replaces tables in the target schema
Step 5: Execute dbt Models
Run dbt run to execute all model transformations defined in the project. This materialises views or tables in the target database according to the model SQL and configuration.
Key considerations:
- Each dbt node execution emits a Prefect event for monitoring
- Failed models trigger task retries before the step is marked as failed
- The prefect-dbt runner maps dbt log levels onto Prefect log levels, so dbt warnings and errors keep their severity in Prefect logs
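The command sequence used by this workflow (deps, seed, run, test) can be sketched as a loop over an injected invoker. `run_lifecycle` and `make_prefect_dbt_invoker` are hypothetical names. The second function assumes the `PrefectDbtRunner` and `PrefectDbtSettings` classes from the `prefect_dbt` package, with `project_dir` and `profiles_dir` settings and an `invoke` method that takes a list of dbt arguments. Check these against the installed prefect-dbt version before relying on them.

```python
from pathlib import Path
from typing import Callable, Sequence

# Standard dbt lifecycle, in the order this workflow runs it.
LIFECYCLE: Sequence[Sequence[str]] = (("deps",), ("seed",), ("run",), ("test",))


def run_lifecycle(
    invoke: Callable[[list], object],
    commands: Sequence[Sequence[str]] = LIFECYCLE,
) -> list:
    """Invoke each dbt command in order; an exception stops the remaining commands."""
    completed = []
    for args in commands:
        invoke(list(args))
        completed.append(" ".join(args))
    return completed


def make_prefect_dbt_invoker(project_dir: Path) -> Callable[[list], object]:
    """Build an invoker backed by prefect-dbt (assumed API, see the note above)."""
    # Imported lazily so the ordering logic can be exercised without prefect-dbt installed.
    from prefect_dbt import PrefectDbtRunner, PrefectDbtSettings

    settings = PrefectDbtSettings(
        project_dir=Path(project_dir),
        profiles_dir=Path(project_dir),  # assumes profiles.yml lives in the project directory
    )
    return PrefectDbtRunner(settings=settings).invoke


# Example usage inside a Prefect flow (requires prefect and prefect-dbt):
# run_lifecycle(make_prefect_dbt_invoker(Path("my_project")))
```

Passing the invoker in as a parameter keeps the ordering and fail-fast behaviour separate from the execution backend, so the same loop can be checked with a stub before it is pointed at a real database.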
Step 6: Run dbt Tests
Run dbt test to execute all data quality tests declared in the project. Tests validate schema constraints, referential integrity, and custom assertions.
Key considerations:
- Test failures are surfaced clearly in Prefect logs
- Provides confidence that materialised models meet quality expectations
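Beyond the log output, test outcomes can also be inspected after the fact. dbt writes a `run_results.json` artifact (by default under the project's `target/` directory) whose `results` entries carry a `unique_id` and a `status`. The sketch below lists failing entries. The helper name `failed_nodes` is hypothetical, and treating `fail` and `error` as the failing statuses is an assumption.

```python
import json
from pathlib import Path

# Statuses treated as failures in this sketch; "warn" and "skipped" are ignored.
FAILING_STATUSES = {"fail", "error"}


def failed_nodes(run_results_path: Path) -> list:
    """Return (unique_id, status) pairs for results that failed or errored."""
    data = json.loads(Path(run_results_path).read_text(encoding="utf-8"))
    return [
        (result["unique_id"], result["status"])
        for result in data.get("results", [])
        if result.get("status") in FAILING_STATUSES
    ]


# Example usage after `dbt test` has run:
# for unique_id, status in failed_nodes(Path("my_project/target/run_results.json")):
#     print(f"{status}: {unique_id}")
```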