Principle:Apache Beam Pipeline Event Reporting
| Knowledge Sources | |
|---|---|
| Domains | Data_Processing, Distributed_Systems |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
An abstraction that provides a callback-based channel through which the execution engine reports terminal pipeline events (failure, cancellation, completion) to the result handler.
Description
Pipeline Event Reporting is the principle of using a callback interface to decouple the execution engine from the component that tracks and exposes pipeline results. Rather than manipulating result state directly, the execution engine invokes a typed callback for each class of terminal event: checked exceptions, unchecked errors, user cancellation, and successful completion. This separation lets the execution engine remain agnostic to how results are represented and queried, and lets the result handler enforce its own state machine (e.g., rejecting a second terminal transition).
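The decoupling described above can be sketched in a few lines. This is an illustration only, not Beam's actual API: the names `PipelineEventReceiver`, `RecordingReceiver`, and `report_outcome`, and the state strings, are hypothetical. Python has no method overloading, so a single `failed` callback stands in for the separate exception and error variants.

```python
from abc import ABC, abstractmethod


class PipelineEventReceiver(ABC):
    """Hypothetical callback interface that the execution engine reports to."""

    @abstractmethod
    def failed(self, cause: BaseException) -> None:
        """The pipeline terminated because of an exception or error."""

    @abstractmethod
    def cancelled(self) -> None:
        """The pipeline was cancelled by the user."""

    @abstractmethod
    def completed(self) -> None:
        """The pipeline finished successfully."""


class RecordingReceiver(PipelineEventReceiver):
    """Result handler that stores the terminal state for later queries."""

    def __init__(self) -> None:
        self.state = "RUNNING"  # illustrative state names
        self.cause = None

    def failed(self, cause: BaseException) -> None:
        self.state, self.cause = "FAILED", cause

    def cancelled(self) -> None:
        self.state = "CANCELLED"

    def completed(self) -> None:
        self.state = "DONE"


def report_outcome(run, receiver: PipelineEventReceiver) -> None:
    """Toy engine: it knows only the callback interface, not the result type."""
    try:
        run()
    except Exception as exc:
        receiver.failed(exc)
    else:
        receiver.completed()
```

Because `report_outcome` depends only on the interface, a different result handler (for example one that logs or forwards events) could be substituted without changing the engine.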
Usage
Apply this principle when designing a runner where the execution engine and result handler are separate components that need to communicate terminal state transitions. It is the appropriate pattern when multiple terminal event types must be distinguished (failure vs. cancellation vs. completion) and when the result handler needs to implement its own concurrency or state management independently of the executor.
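As a rough sketch of that situation, the following assumes a toy executor running on a worker thread and a result handler that manages its own blocking independently of it. `BlockingResult`, `execute`, `wait_until_finish`, and the state strings are hypothetical names chosen for illustration, not the API of any particular runner.

```python
import threading


class BlockingResult:
    """Hypothetical result handler; callers block until a terminal event."""

    def __init__(self):
        self._done = threading.Event()
        self.state = "RUNNING"
        self.cause = None

    # Callbacks invoked by the executor.
    def failed(self, cause):
        self.cause = cause
        self._finish("FAILED")

    def cancelled(self):
        self._finish("CANCELLED")

    def completed(self):
        self._finish("DONE")

    def _finish(self, state):
        # Record the state first, then wake any waiting callers.
        self.state = state
        self._done.set()

    def wait_until_finish(self, timeout=None):
        self._done.wait(timeout)
        return self.state


def execute(work_items, cancel_requested, receiver):
    """Toy executor that reports one of three distinct outcomes."""
    try:
        for item in work_items:
            if cancel_requested.is_set():
                receiver.cancelled()
                return
            item()
    except Exception as exc:
        receiver.failed(exc)
        return
    receiver.completed()


if __name__ == "__main__":
    cancel = threading.Event()
    result = BlockingResult()
    worker = threading.Thread(
        target=execute, args=([lambda: None], cancel, result)
    )
    worker.start()
    print(result.wait_until_finish(timeout=5))
    worker.join()
```

The executor never touches `_done` or `state`; how waiting and wake-up work is entirely the result handler's concern.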
Theoretical Basis
The event reporting model defines a set of mutually exclusive terminal events:
    # Abstract event reporting model
    from enum import Enum, auto

    class TerminalEventType(Enum):
        FAILED_EXCEPTION = auto()  # Recoverable failure (checked)
        FAILED_ERROR = auto()      # Unrecoverable failure (unchecked)
        CANCELLED = auto()         # User-initiated cancellation
        COMPLETED = auto()         # Successful termination

    # Contract: exactly one terminal event is delivered per pipeline run
    def on_terminal_event(receiver, event):
        # Dispatch the event to the matching receiver callback.
        if event.type == TerminalEventType.FAILED_EXCEPTION:
            receiver.failed(event.exception)
        elif event.type == TerminalEventType.FAILED_ERROR:
            receiver.failed(event.error)
        elif event.type == TerminalEventType.CANCELLED:
            receiver.cancelled()
        elif event.type == TerminalEventType.COMPLETED:
            receiver.completed()
The key design constraint is that exactly one terminal event is delivered per pipeline execution, ensuring the result handler transitions to a final state exactly once.
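One way a result handler might enforce this constraint on its own side is a check-and-set guarded by a lock. The sketch below is an assumption about how such a guard could look: `TerminalStateGuard` and its state strings are hypothetical, and raising on a duplicate event is only one possible policy (silently ignoring it would be another).

```python
import threading


class TerminalStateGuard:
    """Hypothetical result handler that accepts exactly one terminal event."""

    def __init__(self):
        self._lock = threading.Lock()
        self.state = "RUNNING"  # illustrative state names
        self.cause = None

    def _transition(self, new_state, cause=None):
        # Check-and-set under a lock so that concurrent callbacks cannot
        # both observe RUNNING and both win the transition.
        with self._lock:
            if self.state != "RUNNING":
                raise RuntimeError(
                    f"already terminal ({self.state}); "
                    f"cannot move to {new_state}"
                )
            self.state = new_state
            self.cause = cause

    def failed(self, cause):
        self._transition("FAILED", cause)

    def cancelled(self):
        self._transition("CANCELLED")

    def completed(self):
        self._transition("DONE")
```

With this guard, a second callback after `completed()` raises instead of overwriting the recorded outcome, so the first terminal event always wins.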