Principle:Dagster io Dagster Project Scaffolding
| Property | Value |
|---|---|
| Type | Principle |
| Category | Data_Engineering, Developer_Tools |
| Repository | Dagster_io_Dagster |
| Related Implementation | Implementation:Dagster_io_Dagster_Create_Dagster_CLI |
Overview
Pattern for bootstrapping new Dagster projects with a standard directory structure, configuration, and development tooling through CLI-driven code generation.
Description
Project scaffolding automates the creation of new Dagster projects with a standardized structure. The create-dagster CLI generates a complete project skeleton including:
- definitions.py -- The entry point for Dagster definitions
- defs/ directory -- For assets, components, and auto-discovered definitions
- pyproject.toml -- Python project configuration with Dagster-specific settings
- Virtual environment setup -- Isolated Python environment for the project
The dg CLI provides additional scaffolding for components (dg scaffold defs) and development server startup (dg dev). Together, these tools ensure every new Dagster project starts with a consistent, well-organized structure.
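To make the generated layout concrete, here is a minimal Python sketch that writes a comparable skeleton to disk. It is not how create-dagster is implemented: the helper name scaffold_skeleton, the src/<project_name>/ nesting, and the placeholder file contents are all assumptions for illustration, and the sketch does not create a virtual environment.

```python
from pathlib import Path


def scaffold_skeleton(root: Path, project_name: str) -> list[Path]:
    """Write a bare project skeleton under root and return the created files.

    Hypothetical helper: the layout mirrors the files named above, but the
    exact nesting and file contents are assumptions, not create-dagster output.
    """
    project_dir = root / project_name
    package_dir = project_dir / "src" / project_name
    files = {
        # Project configuration (placeholder content only).
        project_dir / "pyproject.toml": f'[project]\nname = "{project_name}"\n',
        package_dir / "__init__.py": "",
        # Entry point for Dagster definitions (placeholder content only).
        package_dir / "definitions.py": "# entry point for Dagster definitions\n",
        # Directory that holds auto-discovered definitions.
        package_dir / "defs" / "__init__.py": "",
    }
    for path, content in files.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)
    return sorted(files)
```

Calling scaffold_skeleton(Path("."), "my_project") would produce a my_project/ directory containing pyproject.toml and a src/my_project/ package with definitions.py and defs/. In practice the create-dagster CLI does this work, so a helper like this is only useful for understanding the shape of the result.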
Usage
Use at the start of any new Dagster project. The scaffolding ensures:
- Consistent project structure across teams and organizations
- All necessary configuration for development, testing, and deployment
- Auto-discovery of definitions through the defs/ directory convention
- Ready-to-run development environment with virtual environment and dependencies
This pattern is appropriate for greenfield Dagster projects, team onboarding, and establishing organizational standards for data pipeline projects.
Theoretical Basis
Project scaffolding implements the template method pattern at the project level. A standard project structure (directory layout, entry point, configuration) is instantiated from a template, with customization points (project name, components) filled in by the user.
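The idea of a fixed template with user-supplied customization points can be sketched with Python's standard string.Template. The template text, the constant PYPROJECT_TEMPLATE, and the function render_pyproject are hypothetical; the actual pyproject.toml that create-dagster generates contains different, Dagster-specific settings.

```python
from string import Template

# Hypothetical template: the fixed structure lives here, and $project_name
# is the single customization point filled in by the user.
PYPROJECT_TEMPLATE = Template(
    "[project]\n"
    'name = "$project_name"\n'
    'dependencies = ["dagster"]\n'
)


def render_pyproject(project_name: str) -> str:
    """Instantiate the template for one project."""
    # substitute() raises KeyError if a placeholder is left unfilled,
    # so an incomplete customization fails loudly instead of silently.
    return PYPROJECT_TEMPLATE.substitute(project_name=project_name)


print(render_pyproject("my_project"))
```

Everything outside the placeholder is identical across projects, which is what gives scaffolded projects their consistency.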
Key design principles at work:
- Convention over configuration -- The scaffolded structure establishes conventions (e.g., defs/ for auto-discovered definitions, definitions.py as the entry point) that reduce the number of decisions developers must make at project start.
- Separation of concerns -- The generated structure separates source code (src/), configuration (pyproject.toml), and environment (.venv/) into distinct locations.
- Progressive disclosure -- The minimal scaffolded project works immediately (dg dev starts a server), while additional complexity (components, custom resources) can be added incrementally via dg scaffold.
- Auto-discovery -- The defs/ directory convention enables Dagster to automatically find and load definitions without explicit registration, following the principle of least surprise.
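Directory-based auto-discovery in general can be sketched with the standard library alone. The sketch below is not Dagster's loader: the function discover_definitions and the is_definition marker attribute are hypothetical stand-ins for whatever rules Dagster applies to decide what counts as a definition.

```python
import importlib.util
from pathlib import Path


def discover_definitions(defs_dir: Path, marker: str = "is_definition") -> dict[str, object]:
    """Load every .py file under defs_dir and collect marked objects.

    Hypothetical sketch: an object counts as a definition if it carries a
    truthy attribute named by `marker`. Dagster's real loader has its own rules.
    """
    found: dict[str, object] = {}
    for path in sorted(defs_dir.rglob("*.py")):
        if path.name == "__init__.py":
            continue
        # Build a unique module name from the path relative to defs_dir.
        relative = path.relative_to(defs_dir).with_suffix("")
        module_name = "_discovered_" + "_".join(relative.parts)
        spec = importlib.util.spec_from_file_location(module_name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Keep only module-level objects that carry the marker.
        for name, obj in vars(module).items():
            if getattr(obj, marker, False):
                found[f"{path.stem}.{name}"] = obj
    return found
```

With a loader of this shape, adding a new file under the directory is enough for its marked objects to be picked up; no central registration list has to be edited.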