Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Principle:Dagster io Dagster Sensor Driven Pipelines

From Leeroopedia


Knowledge Sources
Domains Data_Engineering, Event_Driven
Last Updated 2026-02-10 00:00 GMT

Overview

Strategy for triggering pipeline execution based on external events detected through periodic polling sensors.

Description

Sensors are periodic evaluation functions that poll external systems for changes and trigger pipeline runs in response. They maintain state across evaluations via cursors (e.g., etags, timestamps, sequence numbers) to detect only new events. Sensors can create RunRequests for jobs, add dynamic partitions, and report skip reasons when no action is needed. They integrate with the Dagster daemon for continuous background evaluation.

A sensor function is invoked at regular intervals (controlled by minimum_interval_seconds). On each invocation, it receives a context object containing the cursor from the previous evaluation. The function checks an external system for changes since the cursor, and returns one of:

  • SensorResult -- containing run requests, dynamic partition additions, and an updated cursor.
  • SkipReason -- when no action is needed, providing a human-readable explanation visible in the Dagster UI.

This model enables event-driven architectures within Dagster without requiring external webhook infrastructure.

Usage

Use when pipeline execution should be driven by external events: new files appearing in cloud storage, RSS feed updates, database row insertions, API state changes, or webhook-like triggers. Sensors provide the event-detection mechanism for event-driven architectures.

Sensors are also the primary mechanism for populating dynamic partition sets. When a sensor discovers a new entity, it simultaneously registers the entity as a partition key and requests a run for that partition.

Theoretical Basis

Sensors implement the polling-based observer pattern. The sensor function is the observer, and the external system is the subject. Rather than the subject pushing notifications (as in webhooks), the observer periodically pulls state and computes the diff.

The cursor mechanism is central to correctness. It provides:

  • Exactly-once event detection -- each event is detected in exactly one sensor evaluation, because the cursor advances past processed events.
  • Crash recovery -- the cursor is persisted by the Dagster instance. If the daemon restarts, the sensor resumes from its last committed cursor.
  • Idempotency -- given the same cursor, the sensor produces the same result (assuming the external system is append-only).

The trade-off compared to push-based systems (webhooks) is latency vs. simplicity:

Property Polling (Sensors) Push (Webhooks)
Latency Bounded by polling interval Near real-time
Infrastructure No external setup needed Requires webhook endpoint
Reliability Built-in retry via daemon Requires delivery guarantees
State management Cursor in Dagster storage External state required

In pseudocode, the sensor evaluation loop is:

while daemon_is_running:
    for sensor in registered_sensors:
        if time_since_last_eval(sensor) >= sensor.minimum_interval_seconds:
            cursor = load_cursor(sensor)
            result = sensor.evaluate(cursor)
            if result.has_run_requests:
                submit_runs(result.run_requests)
                register_partitions(result.dynamic_partitions_requests)
            save_cursor(sensor, result.cursor)
    sleep(tick_interval)

Related Pages

Implemented By

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment