Principle:Dagster io Dagster Component Based Definitions

From Leeroopedia


Title: Component Based Definitions
Category: Principle
Domains: Data_Engineering, Project_Structure
Repository: Dagster_io_Dagster

Overview

Strategy for organizing Dagster projects using folder-based component discovery with a unified entry point that auto-loads assets, resources, and components from a directory structure.

Description

Component-based definitions replace manual collection of assets, resources, and schedules with automatic discovery from a project directory tree. A factory function decorated with @definitions serves as the single entry point; inside it, load_from_defs_folder() scans the defs/ directory for Python modules and YAML-configured components. This enables a convention-over-configuration approach in which adding a new Python file or component YAML under defs/ registers it with Dagster, with no further edits to the entry point.

The system operates through several mechanisms:

  • Entry point decorator: The @dg.definitions decorator marks a function as the entry point for loading Dagster definitions. It wraps the function in a LazyDefinitions object that defers execution until Dagster's loading mechanisms invoke it.
  • Folder scanning: load_from_defs_folder() recursively scans the defs/ directory, discovering Python modules containing @asset, @op, and other decorated functions.
  • YAML component loading: Subdirectories containing defs.yaml files are loaded as components (e.g., DbtProjectComponent) with their YAML-specified configuration.
  • Definition merging: Definitions.merge() combines auto-discovered definitions with explicitly provided resources and other configuration.
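
The folder-scanning and merging mechanisms above can be illustrated with a toy model that uses only the Python standard library. This is a sketch of the general idea, not Dagster's implementation: `asset`, `discover`, and `merge` are hypothetical stand-ins for `@asset`, `load_from_defs_folder()`, and `Definitions.merge()`, and the dictionary layout is an assumption made for illustration.

```python
import importlib.util
from pathlib import Path


def asset(fn):
    """Hypothetical stand-in for an asset decorator: tags a function for discovery."""
    fn._is_asset = True
    return fn


def discover(defs_dir: Path) -> dict:
    """Recursively import every .py file under defs_dir and collect tagged functions."""
    found = {"assets": {}, "resources": {}}
    for path in sorted(defs_dir.rglob("*.py")):
        # Load each file as a standalone module so no package setup is required.
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        # Toy shortcut: inject the decorator so the scanned files need no import.
        module.asset = asset
        spec.loader.exec_module(module)
        for name, obj in vars(module).items():
            if getattr(obj, "_is_asset", False):
                found["assets"][name] = obj
    return found


def merge(*parts: dict) -> dict:
    """Combine definition sets; rejecting duplicate keys is a choice of this sketch."""
    merged: dict = {}
    for part in parts:
        for kind, items in part.items():
            bucket = merged.setdefault(kind, {})
            for key, value in items.items():
                if key in bucket:
                    raise ValueError(f"duplicate key in {kind}: {key}")
                bucket[key] = value
    return merged
```

In this model, dropping a new file containing an `@asset`-tagged function anywhere under the scanned directory is enough for `discover()` to pick it up, and `merge(discover(root), {"resources": {...}})` adds explicitly provided resources on top of what was discovered.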

Usage

Use this approach when a project grows beyond a single file and needs an organized structure. The component system is the recommended approach for new Dagster projects, replacing manual Definitions() construction.

Benefits:

  • Reduced boilerplate: No need to manually import and list every asset, schedule, or sensor.
  • Consistent structure: All projects follow the same directory layout, making it easier for teams to navigate unfamiliar projects.
  • Declarative components: YAML-configured components (like DbtProjectComponent) provide integration without Python code.
  • Incremental adoption: The Definitions.merge() pattern allows mixing auto-discovered definitions with manually constructed ones.
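
The "declarative components" benefit rests on resolving a type name from configuration to a Python class. The following is a minimal sketch of that idea using the standard library only. It assumes the YAML has already been parsed into a dictionary with `type` and `attributes` keys; the registry, the `register` decorator, `load_component`, and `SqlTableComponent` are all hypothetical names for illustration and are not Dagster APIs.

```python
from dataclasses import dataclass

# Hypothetical registry mapping a `type` string to a component class.
COMPONENT_TYPES: dict = {}


def register(type_name: str):
    """Class decorator a plugin library could use to expose a component type."""
    def wrap(cls):
        COMPONENT_TYPES[type_name] = cls
        return cls
    return wrap


@register("example_lib.SqlTableComponent")
@dataclass
class SqlTableComponent:
    table: str
    schema: str = "public"

    def build_defs(self) -> dict:
        # A real component would return framework definitions; here, one asset key.
        return {"assets": [f"{self.schema}.{self.table}"]}


def load_component(config: dict):
    """Instantiate the class named by config['type'] with config['attributes']."""
    try:
        cls = COMPONENT_TYPES[config["type"]]
    except KeyError:
        raise ValueError(f"unknown component type: {config.get('type')!r}") from None
    return cls(**config.get("attributes", {}))
```

With this shape, a user supplies only configuration such as `{"type": "example_lib.SqlTableComponent", "attributes": {"table": "orders"}}`, and the class behind the type name decides which definitions to produce.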

Theoretical Basis

Component-based definitions combine convention over configuration with a discovery mechanism akin to the service locator pattern. Instead of explicit registration (Definitions(assets=[a, b, c])), the framework locates definitions by scanning a directory tree, and YAML-configured components (like DbtProjectComponent) are resolved from their declared configuration rather than wired up in Python. The project layout itself therefore acts as the registry, which is what reduces boilerplate and keeps project structure consistent.

Key design principles:

  • Lazy loading: The @definitions decorator wraps the factory function in a LazyDefinitions object, deferring execution until Dagster explicitly requests definitions. This prevents side effects during module import.
  • Composability: Definitions.merge() allows combining auto-discovered definitions with explicit ones, supporting incremental adoption and hybrid approaches.
  • Context propagation: The optional ComponentLoadContext parameter enables environment-specific logic (e.g., different resource configurations for dev vs. production) without changing the project structure.
  • Plugin extensibility: Third-party libraries (like dagster-dbt) can register component types that are discoverable via YAML type fields, extending the framework without modifying core code.
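
The lazy-loading and context-propagation principles above can be sketched in a few lines of standard-library Python. This is an illustrative model only: `LazyDefs`, `LoadContext`, the `environment` field, and the connection strings are hypothetical and do not reproduce Dagster's LazyDefinitions or ComponentLoadContext.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class LoadContext:
    """Hypothetical stand-in for a load context; carries the target environment."""
    environment: str = "dev"


class LazyDefs:
    """Holds a factory and runs it only when definitions are requested."""

    def __init__(self, factory: Callable[[LoadContext], dict]):
        self._factory = factory
        self.calls = 0  # exposed so the deferral is observable

    def load(self, context: LoadContext) -> dict:
        self.calls += 1
        return self._factory(context)


def definitions(factory: Callable[[LoadContext], dict]) -> LazyDefs:
    """Decorator: wrap the factory instead of calling it at import time."""
    return LazyDefs(factory)


@definitions
def defs(context: LoadContext) -> dict:
    # Environment-specific resource choice without changing the project layout.
    if context.environment == "prod":
        db_url = "postgresql://prod-host/db"
    else:
        db_url = "sqlite:///dev.db"
    return {"resources": {"db": db_url}}
```

Importing the module that contains `defs` runs none of the factory body; the work happens only when a loader calls `defs.load(LoadContext(...))`, and the same factory yields different resources depending on the context it receives.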
