Principle:Dagster io Dagster Component Based Definitions
| Attribute | Value |
|---|---|
| Title | Component Based Definitions |
| Category | Principle |
| Domains | Data_Engineering, Project_Structure |
| Repository | Dagster_io_Dagster |
Overview
Strategy for organizing Dagster projects using folder-based component discovery with a unified entry point that auto-loads assets, resources, and components from a directory structure.
Description
Component-based definitions replace manual collection of assets, resources, and schedules with automatic discovery from a project directory tree. A factory function decorated with @definitions calls load_from_defs_folder(), which scans the defs/ directory for Python modules and YAML-configured components. This enables a convention-over-configuration approach in which adding a new Python file or component YAML automatically registers it with Dagster.
The system operates through several mechanisms:
- Entry point decorator: The `@dg.definitions` decorator marks a function as the entry point for loading Dagster definitions. It wraps the function in a `LazyDefinitions` object that defers execution until Dagster's loading mechanisms invoke it.
- Folder scanning: `load_from_defs_folder()` recursively scans the `defs/` directory, discovering Python modules containing `@asset`, `@op`, and other decorated functions.
- YAML component loading: Subdirectories containing `defs.yaml` files are loaded as components (e.g., `DbtProjectComponent`) with their YAML-specified configuration.
- Definition merging: `Definitions.merge()` combines auto-discovered definitions with explicitly provided resources and other configuration.
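The folder-scanning idea can be illustrated with a small standard-library sketch. This is not Dagster's implementation: the `asset` marker, `load_module`, `discover_assets`, and the demo file names are all hypothetical, and the decorator is injected into each module's namespace only so the toy files need no imports.

```python
import importlib.util
import tempfile
from pathlib import Path


def asset(fn):
    """Toy marker decorator (hypothetical): tags a function for discovery."""
    fn.is_asset = True
    return fn


def load_module(path: Path):
    """Import a Python file by path, without it being on sys.path."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    module.asset = asset  # inject the toy decorator before the file runs
    spec.loader.exec_module(module)
    return module


def discover_assets(defs_root: Path) -> dict:
    """Recursively scan defs_root and collect every function tagged as an asset."""
    found = {}
    for py_file in sorted(defs_root.rglob("*.py")):
        module = load_module(py_file)
        for name, obj in vars(module).items():
            if callable(obj) and getattr(obj, "is_asset", False):
                found[name] = obj
    return found


def build_demo_project(root: Path) -> Path:
    """Write a tiny defs/ tree with two tagged functions and one untagged helper."""
    defs_dir = root / "defs"
    (defs_dir / "sales").mkdir(parents=True)
    (defs_dir / "orders.py").write_text(
        "@asset\ndef raw_orders():\n    return [1, 2, 3]\n"
    )
    (defs_dir / "sales" / "revenue.py").write_text(
        "@asset\ndef revenue():\n    return 42\n\n\ndef helper():\n    return None\n"
    )
    return defs_dir


with tempfile.TemporaryDirectory() as tmp:
    discovered = discover_assets(build_demo_project(Path(tmp)))
    asset_names = sorted(discovered)
```

Dropping a new file into the toy `defs/` tree changes what `discover_assets` returns without touching any registration code, which is the convention-over-configuration behavior described above.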
Usage
Use this approach when a project grows beyond a single file and needs an organized structure. The component system is the recommended approach for new Dagster projects, replacing manual Definitions() construction.
Benefits:
- Reduced boilerplate: No need to manually import and list every asset, schedule, or sensor.
- Consistent structure: All projects follow the same directory layout, making it easier for teams to navigate unfamiliar projects.
- Declarative components: YAML-configured components (like `DbtProjectComponent`) provide integration without Python code.
- Incremental adoption: The `Definitions.merge()` pattern allows mixing auto-discovered definitions with manually constructed ones.
Theoretical Basis
Component-based definitions implement the service locator pattern with convention-over-configuration. Instead of explicit registration (Definitions(assets=[a, b, c])), the framework discovers definitions by scanning a directory tree. YAML-configured components (like DbtProjectComponent) provide declarative integration without Python code. This reduces boilerplate and enforces consistent project structure.
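As a rough illustration of the lookup side of this pattern, the sketch below resolves a `type` string from parsed YAML to a registered class. It is a toy model, not how Dagster discovers component types: `COMPONENT_TYPES`, `register_component`, `SqlTableComponent`, `load_component`, and the `attributes` key are assumptions made for the example.

```python
# Hypothetical registry mapping a YAML `type` string to a component class.
COMPONENT_TYPES = {}


def register_component(type_name):
    """Class decorator that records a component class under type_name."""
    def decorator(cls):
        COMPONENT_TYPES[type_name] = cls
        return cls
    return decorator


@register_component("toy_lib.SqlTableComponent")
class SqlTableComponent:
    """Toy component (hypothetical) that turns one table name into one asset."""

    def __init__(self, table: str):
        self.table = table

    def build_defs(self) -> dict:
        return {"assets": [self.table]}


def load_component(config: dict):
    """Instantiate the class named by config['type'] with its attributes.

    `config` stands in for an already-parsed YAML file; the `attributes`
    key is an assumption of this sketch.
    """
    try:
        cls = COMPONENT_TYPES[config["type"]]
    except KeyError:
        raise LookupError(f"unknown component type: {config['type']!r}") from None
    return cls(**config.get("attributes", {}))


parsed_yaml = {"type": "toy_lib.SqlTableComponent", "attributes": {"table": "orders"}}
component = load_component(parsed_yaml)
built = component.build_defs()
```

The caller never imports `SqlTableComponent` directly; it only names the type, and the registry locates the implementation.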
Key design principles:
- Lazy loading: The `@definitions` decorator wraps the factory function in a `LazyDefinitions` object, deferring execution until Dagster explicitly requests definitions. This prevents side effects during module import.
- Composability: `Definitions.merge()` allows combining auto-discovered definitions with explicit ones, supporting incremental adoption and hybrid approaches.
- Context propagation: The optional `ComponentLoadContext` parameter enables environment-specific logic (e.g., different resource configurations for dev vs. production) without changing the project structure.
- Plugin extensibility: Third-party libraries (like dagster-dbt) can register component types that are discoverable via YAML `type` fields, extending the framework without modifying core code.
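The first three principles can be sketched together in plain Python. Everything here is a stand-in rather than Dagster's API: `Defs`, `LazyDefs`, the `definitions` decorator, the dict-based context, and the connection strings are hypothetical, and the duplicate-key check in `merge` is a choice of this sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Defs:
    """Toy stand-in for a definitions container."""
    assets: dict = field(default_factory=dict)
    resources: dict = field(default_factory=dict)

    @staticmethod
    def merge(*parts: "Defs") -> "Defs":
        """Combine several containers, rejecting duplicate keys."""
        assets, resources = {}, {}
        for part in parts:
            for target, source in ((assets, part.assets), (resources, part.resources)):
                clash = target.keys() & source.keys()
                if clash:
                    raise ValueError(f"duplicate keys: {sorted(clash)}")
                target.update(source)
        return Defs(assets, resources)


class LazyDefs:
    """Holds a factory and calls it only when load() is invoked."""

    def __init__(self, factory):
        self._factory = factory

    def load(self, context: dict) -> Defs:
        return self._factory(context)


def definitions(factory):
    """Toy entry-point decorator: wraps the factory instead of running it."""
    return LazyDefs(factory)


calls = []  # records each time the factory actually runs


@definitions
def defs(context):
    calls.append(context["env"])
    discovered = Defs(assets={"raw_orders": "asset object"})
    # Environment-specific resource choice driven by the load context.
    db_url = "sqlite://" if context["env"] == "dev" else "postgres://prod"
    explicit = Defs(resources={"db": db_url})
    return Defs.merge(discovered, explicit)


calls_before_load = list(calls)  # decorating alone did not run the factory
loaded = defs.load({"env": "dev"})
```

Because the factory body runs only inside `load()`, importing the module has no side effects, and the same project layout yields different resources depending on the context passed in.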