Principle:ArroyoSystems Arroyo Checkpoint Coordination

Checkpoint Coordination

Principle: Coordinating distributed checkpoint completion across all operators and subtasks. The controller tracks which subtasks have completed their checkpoint and determines when the global checkpoint is done.

Theoretical Basis

In a distributed checkpoint protocol, individual operators complete their local checkpoints at different times. A coordinator is needed to determine when the global checkpoint -- spanning all operators and their parallel subtasks -- is fully complete. Only after global completion is the checkpoint considered durable and usable for recovery.

Per-Operator Tracking

Each operator in the dataflow graph may have multiple subtasks (parallel instances). The coordinator maintains a counter for each operator, tracking how many subtasks have reported completion. An operator's checkpoint is complete when all of its subtasks have reported.

Concept	Description
Operator	A logical node in the dataflow graph (e.g., a map, filter, or join)
Subtask	A parallel instance of an operator, processing a partition of the data
Subtask count	The expected number of subtasks per operator (determined by parallelism)
Completion counter	Tracks how many subtasks have reported for each operator
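
The per-operator bookkeeping described above can be sketched in Rust as follows. This is a minimal illustration, not Arroyo's actual code: the `OperatorProgress` type and its methods are hypothetical names. It also assumes that reports are tracked by subtask index rather than by a bare counter, so that a duplicate report from the same subtask is not counted twice.

```rust
use std::collections::HashSet;

/// Hypothetical per-operator tracker (illustrative only, not an Arroyo type).
#[derive(Debug)]
struct OperatorProgress {
    /// Expected number of subtasks, i.e. the operator's parallelism.
    expected_subtasks: usize,
    /// Indices of the subtasks that have reported completion so far.
    reported: HashSet<usize>,
}

impl OperatorProgress {
    fn new(expected_subtasks: usize) -> Self {
        Self { expected_subtasks, reported: HashSet::new() }
    }

    /// Records a completion report. Returns true only if the report is new
    /// and the index is within the expected range.
    fn report(&mut self, subtask_index: usize) -> bool {
        if subtask_index >= self.expected_subtasks {
            return false;
        }
        self.reported.insert(subtask_index)
    }

    /// The completion counter: how many distinct subtasks have reported.
    fn completed(&self) -> usize {
        self.reported.len()
    }

    /// The operator's checkpoint is complete once every subtask has reported.
    fn is_complete(&self) -> bool {
        self.reported.len() == self.expected_subtasks
    }
}
```

Tracking a set instead of a plain integer costs a little memory but makes the completion check idempotent, which may matter if a report can be delivered more than once.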

Global Completion Detection

The global checkpoint is complete when all subtasks of every operator have reported. Completion is aggregated in two steps across three levels:

Subtask level: Each subtask independently completes its local checkpoint and reports to the coordinator.
Operator level: When all subtasks of an operator have reported, the operator's checkpoint is complete.
Global level: When all operators are complete, the global checkpoint is done.
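
The aggregation from subtask reports to operator completion to global completion might look like the sketch below. `CheckpointCoordinator`, `subtask_finished`, and `is_done` are hypothetical names chosen for illustration. The sketch uses plain counters and assumes that every operator has a parallelism of at least 1 and that each subtask reports exactly once per epoch.

```rust
use std::collections::HashMap;

/// Hypothetical coordinator for a single epoch (illustrative only).
struct CheckpointCoordinator {
    /// Operator id -> (subtasks reported so far, subtasks expected).
    operators: HashMap<String, (usize, usize)>,
    /// Number of operators whose subtasks have all reported.
    operators_done: usize,
}

impl CheckpointCoordinator {
    /// `parallelism` lists each operator id with its expected subtask count.
    fn new(parallelism: &[(&str, usize)]) -> Self {
        let operators = parallelism
            .iter()
            .map(|(id, n)| (id.to_string(), (0, *n)))
            .collect();
        Self { operators, operators_done: 0 }
    }

    /// Subtask level: count one report for `operator_id`.
    /// Operator level: when the count reaches the expected value, mark the operator done.
    /// Returns true once the global checkpoint is complete.
    fn subtask_finished(&mut self, operator_id: &str) -> bool {
        if let Some((reported, expected)) = self.operators.get_mut(operator_id) {
            if *reported < *expected {
                *reported += 1;
                if *reported == *expected {
                    self.operators_done += 1;
                }
            }
        }
        self.is_done()
    }

    /// Global level: every operator is complete.
    fn is_done(&self) -> bool {
        self.operators_done == self.operators.len()
    }
}
```

Keeping a running count of finished operators means the global check is a single comparison instead of a scan over all operators on every report.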

Metadata Aggregation

As subtasks report completion, they provide per-subtask metadata (file references, byte counts, watermarks). The coordinator aggregates this metadata into a global checkpoint record that captures the complete state of the system at that epoch.
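
As a rough illustration of this aggregation, the sketch below folds per-subtask metadata into a per-operator summary inside a global record. All type and field names (`SubtaskMetadata`, `OperatorSummary`, `GlobalCheckpointRecord`) are hypothetical. Combining watermarks by taking the minimum across subtasks is an assumption of this sketch, not something the text above specifies.

```rust
use std::collections::BTreeMap;

/// Hypothetical metadata reported by one subtask (illustrative only).
#[derive(Debug, Clone)]
struct SubtaskMetadata {
    operator_id: String,
    subtask_index: u32,
    files: Vec<String>,
    bytes: u64,
    watermark: Option<u64>,
}

/// Aggregated view of one operator's checkpoint.
#[derive(Debug, Default)]
struct OperatorSummary {
    files: Vec<String>,
    bytes: u64,
    /// Assumption: the operator watermark is the minimum over its subtasks.
    min_watermark: Option<u64>,
}

/// Hypothetical global record for one epoch.
#[derive(Debug, Default)]
struct GlobalCheckpointRecord {
    epoch: u32,
    operators: BTreeMap<String, OperatorSummary>,
}

impl GlobalCheckpointRecord {
    fn new(epoch: u32) -> Self {
        Self { epoch, operators: BTreeMap::new() }
    }

    /// Folds one subtask's metadata into its operator's summary.
    fn add(&mut self, meta: SubtaskMetadata) {
        let summary = self.operators.entry(meta.operator_id).or_default();
        summary.files.extend(meta.files);
        summary.bytes += meta.bytes;
        summary.min_watermark = match (summary.min_watermark, meta.watermark) {
            (Some(a), Some(b)) => Some(a.min(b)),
            (a, b) => a.or(b),
        };
    }

    /// Total bytes written across all operators in this epoch.
    fn total_bytes(&self) -> u64 {
        self.operators.values().map(|s| s.bytes).sum()
    }
}
```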

Epoch Management

The coordinator also manages epoch lifecycle:

Current epoch: The epoch currently being checkpointed.
Minimum epoch (min_epoch): The oldest epoch whose state must be retained. Epochs older than min_epoch can be garbage-collected from object storage.
Epoch advancement: After global checkpoint completion, the current epoch is advanced and min_epoch may be updated.
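
The epoch lifecycle can be sketched as a small state machine. The `EpochState` type is hypothetical, and the retention policy (keep the last `retained_epochs` completed epochs) is an assumption made for illustration. The text above only says that min_epoch may be updated, and a real system may need to keep older epochs whose files are still referenced.

```rust
use std::ops::Range;

/// Hypothetical epoch bookkeeping (illustrative only).
#[derive(Debug)]
struct EpochState {
    /// The epoch currently being checkpointed.
    current_epoch: u32,
    /// The oldest epoch whose state must be retained.
    min_epoch: u32,
    /// Assumed policy: number of completed epochs to keep (at least 1).
    retained_epochs: u32,
}

impl EpochState {
    fn new(retained_epochs: u32) -> Self {
        Self { current_epoch: 1, min_epoch: 1, retained_epochs: retained_epochs.max(1) }
    }

    /// Call after the global checkpoint for `current_epoch` completes.
    /// Advances the current epoch, possibly raises `min_epoch`, and returns
    /// the range of epochs that just became eligible for garbage collection.
    fn advance(&mut self) -> Range<u32> {
        let completed = self.current_epoch;
        self.current_epoch += 1;
        let old_min = self.min_epoch;
        // Keep only the most recent `retained_epochs` completed epochs;
        // min_epoch never moves backwards.
        let new_min = (completed + 1)
            .saturating_sub(self.retained_epochs)
            .max(old_min);
        self.min_epoch = new_min;
        old_min..new_min
    }

    /// State for epochs at or above `min_epoch` must not be deleted.
    fn must_retain(&self, epoch: u32) -> bool {
        epoch >= self.min_epoch
    }
}
```

Returning the newly collectable range from `advance` keeps the deletion decision in one place: the caller can delete exactly those epochs from object storage and nothing else.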

Domains

Stream_Processing: Checkpoint coordination is essential for consistent distributed snapshots in streaming systems.
Fault_Tolerance: Only globally complete checkpoints can be used for recovery.
Distributed_Systems: Coordinating completion across distributed participants is a fundamental distributed systems problem.

Related Implementation

Implementation:ArroyoSystems_Arroyo_Checkpoint_State
