Principle:Dagster io Dagster Declarative Automation

From Leeroopedia


Knowledge Sources
Domains Data_Engineering, Scheduling
Last Updated 2026-02-10 00:00 GMT

Overview

Strategy for automating asset materialization based on declarative conditions rather than imperative scheduling logic.

Description

Declarative automation allows assets to specify the conditions under which they should be materialized, rather than requiring explicit schedules or imperative triggers. Instead of writing scheduling code that says "run this job at 2 AM every Monday," the developer declares "after each Monday cron tick, bring this asset up to date once its dependencies have been updated."

Conditions include:

  • Cron-based (on_cron) -- materialize after all dependencies have updated since the last cron tick.
  • Eager (eager) -- materialize whenever any dependency updates.
  • On missing (on_missing) -- materialize missing partitions, restricted to partitions that appear after the condition is applied.
  • Custom composites -- combine atomic conditions with boolean operators (&, |, ~).
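The difference between the cron-based and eager conditions can be made concrete with a toy model. The sketch below is plain Python, not Dagster code: the function names `eager_ready` and `on_cron_ready`, the timestamp dictionaries, and the asset names are hypothetical, and the rules are a simplification of the descriptions in the list above.

```python
from datetime import datetime
from typing import Dict, Optional


def eager_ready(dep_updates: Dict[str, datetime],
                last_materialized: Optional[datetime]) -> bool:
    """Toy 'eager' rule: any dependency updated after our last materialization."""
    if last_materialized is None:
        return True  # assumption: a never-materialized asset counts as out of date
    return any(t > last_materialized for t in dep_updates.values())


def on_cron_ready(dep_updates: Dict[str, datetime],
                  last_tick: datetime,
                  last_materialized: Optional[datetime]) -> bool:
    """Toy 'on_cron' rule: every dependency has updated since the last cron
    tick, and the asset has not already been materialized since that tick."""
    handled = last_materialized is not None and last_materialized >= last_tick
    return not handled and all(t >= last_tick for t in dep_updates.values())


# Hypothetical timeline: the cron tick is Monday 2024-01-01 00:00.
last_tick = datetime(2024, 1, 1, 0, 0)
last_materialized = datetime(2023, 12, 31, 12, 0)
deps = {
    "orders": datetime(2024, 1, 1, 1, 0),       # updated after the tick
    "customers": datetime(2023, 12, 31, 23, 0),  # not yet updated since the tick
}

eager_now = eager_ready(deps, last_materialized)              # one dep is newer
cron_now = on_cron_ready(deps, last_tick, last_materialized)  # still waiting

deps["customers"] = datetime(2024, 1, 1, 2, 0)                # second dep catches up
cron_later = on_cron_ready(deps, last_tick, last_materialized)
```

In this timeline the eager rule fires as soon as `orders` updates, while the cron-based rule waits until `customers` has also updated after the Monday tick.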

The Dagster daemon evaluates these conditions at regular intervals and triggers materializations when they are satisfied. This removes the need to wire schedules to jobs by hand and keeps assets fresh according to their declared policies.

Usage

Use when assets should be automatically materialized based on upstream changes or time-based triggers. Declarative automation is preferred over imperative schedules when the goal is "keep this asset up-to-date" rather than "run this job at X time."

It is especially valuable in complex dependency graphs where multiple assets have different freshness requirements. Rather than orchestrating a single monolithic job with carefully ordered steps, each asset declares its own automation condition and the system resolves the correct execution order.

Theoretical Basis

Declarative automation inverts the scheduling model from imperative (tell the system when to run) to declarative (tell the system the desired state, let it figure out when to run). This is analogous to declarative infrastructure (Terraform, Kubernetes) applied to data pipelines.

Conditions form a boolean algebra where complex automation policies are composed from atomic predicates:

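from dagster import AutomationCondition

# Example cron string (an assumption for illustration): every Monday at 00:00
schedule = "0 0 * * 1"

# Atomic predicates
deps_updated = AutomationCondition.any_deps_updated()
cron_tick = AutomationCondition.cron_tick_passed(schedule)
not_in_progress = ~AutomationCondition.in_progress()

# Composite condition: true only when all three predicates hold
policy = deps_updated & cron_tick & not_in_progress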
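To show why composition with `&`, `|`, and `~` behaves like a boolean algebra, here is a minimal, self-contained sketch of predicate objects with overloaded operators. It is not Dagster's implementation: the `Condition` class, the `fact` helper, and the dictionary-based state snapshot are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, bool]  # hypothetical snapshot of facts about one asset


@dataclass(frozen=True)
class Condition:
    """A predicate over a state snapshot that supports &, | and ~."""
    check: Callable[[State], bool]

    def __and__(self, other: "Condition") -> "Condition":
        return Condition(lambda s: self.check(s) and other.check(s))

    def __or__(self, other: "Condition") -> "Condition":
        return Condition(lambda s: self.check(s) or other.check(s))

    def __invert__(self) -> "Condition":
        return Condition(lambda s: not self.check(s))


def fact(name: str) -> Condition:
    """Atomic predicate: looks up one named boolean in the snapshot."""
    return Condition(lambda s: s[name])


# Same shape as the composite above, built from toy atomic predicates.
policy = fact("deps_updated") & fact("cron_tick_passed") & ~fact("in_progress")

ready = policy.check(
    {"deps_updated": True, "cron_tick_passed": True, "in_progress": False}
)
blocked = policy.check(
    {"deps_updated": True, "cron_tick_passed": True, "in_progress": True}
)
```

Because each operator returns another `Condition`, composites can be nested to any depth, and the usual identities (for example De Morgan's laws) hold for every snapshot.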

The evaluation loop runs at regular intervals within the Dagster daemon:

  1. For each asset with an automation condition, evaluate the condition against current state.
  2. If the condition evaluates to True for any subset of partitions, emit materialization requests for those partitions.
  3. Record the evaluation result to prevent duplicate triggers.
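The three steps above can be sketched as a single evaluation pass. This is a toy model in plain Python under stated assumptions, not the daemon's actual code: `evaluate_tick`, the `requested` set used as the evaluation record, and the `stale` field in the state are all hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

Request = Tuple[str, str]                # (asset name, partition key)
Condition = Callable[[str, dict], bool]  # (partition key, state) -> bool


def evaluate_tick(conditions: Dict[str, Condition],
                  partitions: Dict[str, List[str]],
                  state: dict,
                  requested: Set[Request]) -> List[Request]:
    """Run one evaluation pass and return the new materialization requests."""
    new_requests: List[Request] = []
    for asset, condition in conditions.items():   # step 1: each asset
        for partition in partitions[asset]:
            key = (asset, partition)
            if key in requested:                   # step 3: skip duplicates
                continue
            if condition(partition, state):        # step 2: condition holds
                new_requests.append(key)
                requested.add(key)                 # step 3: record the result
    return new_requests


# Hypothetical setup: one asset whose condition is "this partition is stale".
state = {"stale": {"2024-01-02"}}
conditions = {"report": lambda partition, s: partition in s["stale"]}
partitions = {"report": ["2024-01-01", "2024-01-02"]}
requested: Set[Request] = set()

first = evaluate_tick(conditions, partitions, state, requested)
second = evaluate_tick(conditions, partitions, state, requested)
```

The first pass requests only the stale partition, and the second pass requests nothing because the earlier result was recorded. A fuller model would also clear the record once a materialization completes so that later upstream changes can trigger the asset again; that is omitted here.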

This model achieves eventual consistency for the data graph: given that upstream assets are materialized and conditions are met, downstream assets will eventually be brought up-to-date without manual intervention.
