Implementation:DataTalksClub Data engineering zoomcamp Dbt Project Yml Config

From Leeroopedia


Page Metadata
Knowledge Sources: repo: DataTalksClub/data-engineering-zoomcamp; dbt docs: dbt_project.yml reference
Domains: Analytics Engineering, dbt Configuration, Project Setup
Last Updated: 2026-02-09 14:00 GMT

Overview

Concrete configuration pattern for defining a dbt analytics transformation project using the dbt_project.yml file, establishing project identity, directory layout, materialization defaults, and development variables.

Description

The dbt_project.yml file in the taxi_rides_ny project serves as the single source of truth for how dbt discovers, compiles, and materializes the entire transformation layer. It declares:

  • Project identity: The project is named taxi_rides_ny at version 1.0.0, requiring dbt-core versions >=1.7.0 and <2.0.0.
  • Profile binding: The profile key binds this project to a connection profile (also named taxi_rides_ny) defined in the user's profiles.yml.
  • Directory paths: Models, seeds, macros, analyses, tests, and snapshots each have dedicated directories.
  • Materialization hierarchy: Staging models default to view, while intermediate and marts models default to table.
  • Project variables: The dev_start_date and dev_end_date variables define a default date range that staging models use to limit the rows they process when the target is dev.
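
To make the profile binding concrete, here is a minimal sketch of a matching profiles.yml. It is not taken from the repository: the DuckDB adapter choice, the database file path, the schema names, and the thread count are all assumptions for illustration. The only parts dictated by the project file are the top-level key, which must equal the profile value, and a target named dev, which the staging filter checks through target.name.

```yaml
# profiles.yml (hypothetical sketch, values are assumptions, not repository content)
taxi_rides_ny:                      # must match `profile: 'taxi_rides_ny'` in dbt_project.yml
  target: dev                       # default target when --target is not passed
  outputs:
    dev:
      type: duckdb                  # assumes the dbt-duckdb adapter is installed
      path: taxi_rides_ny.duckdb    # assumed local database file
      schema: dev                   # assumed schema name
      threads: 4
    prod:
      type: duckdb
      path: taxi_rides_ny.duckdb
      schema: prod
      threads: 4
```

With a profile shaped like this, running dbt without --target would use dev, so the date filter in the staging models would apply by default.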

Usage

This configuration pattern is used when:

  • Initializing a new dbt project for NYC taxi trip analytics.
  • Ensuring all team members run the same dbt version range.
  • Applying layer-specific materialization without per-model config blocks.
  • Providing default variable values that staging models reference for dev environment filtering.
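
As a sketch of how layer-specific defaults compose, the models block can be nested further so that a subfolder overrides its parent layer. The reporting subfolder below is hypothetical and does not exist in the source project; it only illustrates that dbt applies the most specific matching path.

```yaml
# dbt_project.yml excerpt (the `reporting` subfolder is a hypothetical addition)
models:
  taxi_rides_ny:
    staging:
      +materialized: view
    intermediate:
      +materialized: table
    marts:
      +materialized: table          # default for everything under models/marts/
      reporting:                    # hypothetical folder: models/marts/reporting/
        +materialized: view         # overrides the marts default for this subfolder only
```

Models elsewhere under marts would still be built as tables; only those in the assumed reporting folder would become views.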

Code Reference

Source Location

04-analytics-engineering/taxi_rides_ny/dbt_project.yml (Lines 1-37)

Signature

name: 'taxi_rides_ny'
version: '1.0.0'

# Require a specific dbt version for reproducibility
require-dbt-version: [">=1.7.0", "<2.0.0"]

# This setting configures which "profile" dbt uses for this project.
profile: 'taxi_rides_ny'

# These configurations specify where dbt should look for different types of files.
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets:
  - "target"
  - "dbt_packages"

# Project-level variables
vars:
  # Date range for dev environment sampling
  dev_start_date: '2019-01-01'
  dev_end_date: '2019-02-01'

# Configuring models
models:
  taxi_rides_ny:
    staging:
      +materialized: view
    intermediate:
      +materialized: table
    marts:
      +materialized: table

Import

No import is needed. dbt automatically reads this file from the project root. External dependencies are managed separately through packages.yml:

# packages.yml
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.3.0", "<2.0.0"]
  - package: dbt-labs/codegen
    version: [">=0.14.0", "<1.0.0"]

Install dependencies with:

dbt deps

I/O Contract

Inputs

| Input | Type | Description |
|---|---|---|
| profiles.yml | YAML config file | Connection profile named taxi_rides_ny with adapter-specific credentials (BigQuery or DuckDB) |
| packages.yml | YAML config file | External package declarations (dbt_utils, codegen) |
| models/ directory | SQL/YAML files | Model definitions discovered by dbt based on model-paths |
| seeds/ directory | CSV files | Seed data files (payment_type_lookup.csv, taxi_zone_lookup.csv) |
| macros/ directory | SQL/Jinja files | Reusable macro definitions (safe_cast, get_trip_duration_minutes, get_vendor_data) |

Outputs

| Output | Type | Description |
|---|---|---|
| Compiled project graph | DAG | Directed acyclic graph of all models, seeds, and tests with resolved materializations |
| Staging layer (views) | Database views | Models in models/staging/ materialized as views |
| Intermediate layer (tables) | Database tables | Models in models/intermediate/ materialized as tables |
| Marts layer (tables) | Database tables | Models in models/marts/ materialized as tables |
| Variable defaults | Runtime values | dev_start_date='2019-01-01', dev_end_date='2019-02-01' available via var() |

Usage Examples

Referencing project variables in a staging model

select * from renamed

-- Sample records for dev environment using deterministic date filter
{% if target.name == 'dev' %}
where pickup_datetime >= '{{ var("dev_start_date") }}' and pickup_datetime < '{{ var("dev_end_date") }}'
{% endif %}

Overriding variables from the CLI

# Override the dev date range at runtime
dbt run --vars '{"dev_start_date": "2020-01-01", "dev_end_date": "2020-07-01"}'

Overriding materialization at the model level

-- In a specific model file (e.g., fct_trips.sql):
{{
  config(
    materialized='incremental',
    unique_key='trip_id',
    incremental_strategy='merge',
    on_schema_change='append_new_columns'
  )
}}
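
The same model-level override can also be expressed in a properties file instead of an in-file config() block. The sketch below assumes a file named models/marts/schema.yml; that file name and location are hypothetical, and the settings simply mirror the config() call above.

```yaml
# models/marts/schema.yml (hypothetical file name and location)
version: 2

models:
  - name: fct_trips
    config:
      materialized: incremental
      unique_key: trip_id
      incremental_strategy: merge
      on_schema_change: append_new_columns
```

If the same setting appears in more than one place, dbt resolves it from most to least specific: the config() block in the model file wins over the properties file, which in turn wins over dbt_project.yml.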
