Principle:Dagster io Dagster Dynamic Partitioning

From Leeroopedia


Knowledge Sources
Domains Data_Engineering, Event_Driven
Last Updated 2026-02-10 00:00 GMT

Overview

Strategy for creating partition sets at runtime based on data discovery rather than predefined time windows or static lists.

Description

Dynamic partitioning allows the set of partitions to grow at runtime as new data is discovered. Unlike time-based or static partitions, which are predetermined, dynamic partitions are created programmatically -- often by sensors -- when new entities appear (new users, new RSS feed entries, new API endpoints). This is essential for event-driven architectures where the universe of data is not known in advance.

A dynamic partition set starts empty. Partition keys are added explicitly through requests, typically issued from within a sensor evaluation. Once a partition key is registered, it becomes available for materialization by any asset that references the dynamic partition definition. Keys can also be removed when they are no longer relevant.
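
The lifecycle described above (start empty, add keys explicitly, remove keys that are no longer relevant) can be modelled in a few lines of plain Python. This is only a sketch: the `PartitionRegistry` class and its method names are hypothetical stand-ins chosen for illustration, not Dagster's API.

```python
class PartitionRegistry:
    """Hypothetical in-memory stand-in for a dynamic partition set."""

    def __init__(self) -> None:
        # The set starts empty; a list keeps registration order.
        self._keys: list[str] = []

    def add(self, keys: list[str]) -> list[str]:
        """Register keys, skipping any already present. Returns the keys added."""
        added = []
        for key in keys:
            if key not in self._keys:
                self._keys.append(key)
                added.append(key)
        return added

    def remove(self, key: str) -> None:
        """Drop a key that is no longer relevant; unknown keys are ignored."""
        if key in self._keys:
            self._keys.remove(key)

    def keys(self) -> list[str]:
        """Return a copy of the currently registered keys."""
        return list(self._keys)


registry = PartitionRegistry()
registry.add(["user_1", "user_2"])
registry.add(["user_2", "user_3"])  # user_2 is already registered and is skipped
registry.remove("user_1")
print(registry.keys())  # ['user_2', 'user_3']
```

Skipping keys that already exist is a design choice of this sketch; it keeps repeated add requests harmless.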

This approach decouples the definition of what can be processed from the definition of how it is processed. The asset logic remains the same regardless of how many partitions exist; only the registry of partition keys changes over time.
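
A small illustration of that decoupling, assuming a hypothetical `summarize` function as the processing logic and a plain list as the key registry: the function is written once and never changes, while the list of keys it is applied to grows.

```python
def summarize(partition_key: str) -> str:
    """The 'how': identical logic for every partition (hypothetical example)."""
    return f"summary for {partition_key}"


# The 'what': a registry of keys that grows at runtime.
registered_keys = ["feed_a"]
first_pass = [summarize(key) for key in registered_keys]

# A new key is registered later; summarize itself is untouched.
registered_keys.append("feed_b")
second_pass = [summarize(key) for key in registered_keys]
```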

Usage

Use when the partition space is not known at definition time and grows as new data arrives. Common scenarios include:

  • Event-driven pipelines -- new podcast episodes discovered via RSS, new social media users, new file uploads.
  • Entity-based partitioning -- each customer, tenant, or project becomes its own partition.
  • API-driven discovery -- a sensor polls an external API and registers new items as partitions.

Dynamic partitioning is not appropriate when the partition space is fully known in advance (use static or time-based partitions instead) or when partition keys change frequently (dynamic partitions are best for append-only registries).

Theoretical Basis

Dynamic partitioning extends the partition model from a closed set to an open set. In the static model, the partition universe P is fixed at definition time:

P = {p1, p2, ..., pN}  # fixed at definition time

In the dynamic model, P is mutable runtime state rather than a fixed definition: it starts empty and grows as new keys are registered through explicit add requests:

P(t=0) = {}
P(t=1) = P(t=0) | {new_keys_from_sensor_eval_1}
P(t=2) = P(t=1) | {new_keys_from_sensor_eval_2}
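
The same progression can be traced with Python sets, where `|` is set union. The episode keys below are made-up sample values. Because union never discards elements, each state contains the previous one, and a key reported by two evaluations appears only once.

```python
# P(t=0): the dynamic partition set starts empty.
P0 = frozenset()

# Keys reported by two successive sensor evaluations (sample values).
eval_1 = {"ep_1", "ep_2"}
eval_2 = {"ep_2", "ep_3"}  # ep_2 is reported again

P1 = P0 | eval_1
P2 = P1 | eval_2

# With add requests only, the set never shrinks.
assert P0 <= P1 <= P2

print(sorted(P2))  # ['ep_1', 'ep_2', 'ep_3']
```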

This resembles the observer pattern: data discovery by a sensor triggers partition creation and subsequent materialization. The sensor acts as the observer, polling an external system at regular intervals. When it detects new entities, it issues two coordinated actions:

  1. Register new partition keys -- adds the keys to the dynamic partition set.
  2. Request runs for those keys -- creates RunRequests that trigger materialization of assets for the new partitions.
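
The two coordinated actions can be sketched as a single evaluation function that returns both outputs together. The `EvaluationResult` dataclass and the `evaluate` function are hypothetical names used for illustration; in Dagster the corresponding result is typically a `SensorResult` that carries both dynamic partition requests and run requests.

```python
from dataclasses import dataclass, field


@dataclass
class EvaluationResult:
    """Hypothetical container for what one sensor evaluation asks for."""

    keys_to_register: list[str] = field(default_factory=list)
    # One entry per requested run, identified here by its partition key.
    run_requests: list[str] = field(default_factory=list)


def evaluate(discovered: list[str], registered: set[str]) -> EvaluationResult:
    """Emit an add request and matching run requests for unseen keys only."""
    new_keys = [key for key in discovered if key not in registered]
    return EvaluationResult(keys_to_register=new_keys, run_requests=list(new_keys))


registered = {"tenant_a"}
result = evaluate(["tenant_a", "tenant_b", "tenant_c"], registered)

# Action 1: register the new keys. Action 2: the run requests target the same keys.
registered.update(result.keys_to_register)
```

Returning both lists from one evaluation keeps them consistent: every requested run refers to a key that is registered in the same step.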

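The sensor's cursor, a marker persisted between evaluations that records how far discovery has progressed, makes discovery idempotent: each sensor evaluation picks up only entities that appeared since the last evaluation, which prevents duplicate partition registrations and duplicate run requests.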
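
A minimal sketch of cursor-based discovery, assuming each upstream entry carries a monotonically increasing integer id (an assumption of this example, not something the source specifies). The `discover` function is hypothetical; it returns the new keys together with the advanced cursor that the caller stores for the next evaluation.

```python
def discover(entries: list[tuple[int, str]], cursor: int) -> tuple[list[str], int]:
    """Return keys whose id is greater than the cursor, plus the new cursor."""
    fresh = [(entry_id, key) for entry_id, key in entries if entry_id > cursor]
    # If nothing is new, the cursor stays where it was.
    new_cursor = max((entry_id for entry_id, _ in fresh), default=cursor)
    return [key for _, key in fresh], new_cursor


feed = [(1, "ep_1"), (2, "ep_2")]

keys_1, cursor = discover(feed, cursor=0)  # first evaluation sees both entries
keys_2, cursor = discover(feed, cursor)    # unchanged feed yields nothing new

feed.append((3, "ep_3"))
keys_3, cursor = discover(feed, cursor)    # only the entry added since last time
```

Re-running an evaluation against an unchanged source yields no keys, which is what makes repeated polling safe.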

Related Pages

Implemented By
