Principle:ArroyoSystems Arroyo Job Lifecycle
Metadata
| Field | Value |
|---|---|
| Page Type | Principle |
| Knowledge Sources | Repo (ArroyoSystems/arroyo), Doc (Arroyo Documentation) |
| Domains | Stream_Processing, Job_Management |
| Last Updated | 2026-02-08 |
Overview
Job lifecycle management covers the REST API endpoints used to create streaming jobs, track their state, retrieve their errors, inspect their checkpoints, and stream their output in real time.
Description
Job lifecycle management provides the operational interface for monitoring and controlling streaming pipeline execution. While the core state machine governs internal job transitions (Compiling, Scheduling, Running, etc.), the job lifecycle API layer exposes this state to external consumers: the web dashboard, CLI tools, and programmatic clients.
Job Creation and Association
Jobs are created as children of pipelines. Each job is assigned a unique identifier and a checkpoint interval, and job creation is subject to organization-level resource limits (the maximum number of concurrently running jobs). Preview jobs also receive a short time-to-live so that they are cleaned up automatically.
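The creation rules above can be sketched as a small function. This is a minimal illustration and not Arroyo's implementation: the names `create_job`, `Job`, and `JobLimitExceeded`, the identifier format, and the default interval and TTL values are all assumptions made for the example.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


class JobLimitExceeded(Exception):
    """Raised when the organization is already at its running-job limit."""


@dataclass(frozen=True)
class Job:
    job_id: str
    pipeline_id: str              # the parent pipeline
    checkpoint_interval_s: float
    ttl_s: Optional[float]        # None: the job is not cleaned up automatically


def create_job(
    pipeline_id: str,
    running_jobs: int,
    max_running_jobs: int,
    checkpoint_interval_s: float = 10.0,  # illustrative default, not Arroyo's
    preview: bool = False,
    preview_ttl_s: float = 60.0,          # illustrative default, not Arroyo's
) -> Job:
    """Create a job under a pipeline, enforcing the organization limit."""
    if running_jobs >= max_running_jobs:
        raise JobLimitExceeded(
            f"{running_jobs} of {max_running_jobs} allowed jobs already running"
        )
    return Job(
        job_id=f"job_{secrets.token_hex(5)}",  # hypothetical identifier format
        pipeline_id=pipeline_id,
        checkpoint_interval_s=checkpoint_interval_s,
        ttl_s=preview_ttl_s if preview else None,  # only preview jobs expire
    )
```

Checking the limit before allocating an identifier keeps a rejected request free of side effects.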
State Observability
The API exposes the current state of each job along with the actions available in that state. A state-to-action resolution function maps each of the 14 possible job states (Created, Compiling, Scheduling, Running, Rescaling, CheckpointStopping, Recovering, Restarting, Stopping, Stopped, Finishing, Finished, Failed, Failing) to the appropriate UI action (Start, Stop, or Restart), together with an in-progress indicator.
Error and Checkpoint History
Job execution history is captured through error log messages (with pagination support) and checkpoint records (with per-operator, per-subtask detail including byte counts and timing spans). This history enables debugging of failed jobs and performance analysis of checkpoint overhead.
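To make the shape of this history concrete, the sketch below aggregates per-subtask checkpoint records into per-operator totals and pages through a list of error messages. The record fields (`bytes_written`, `start_us`, `finish_us`), the index-based cursor, and the function names are hypothetical and do not describe Arroyo's actual schema or API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class SubtaskCheckpoint:
    operator_id: str
    subtask_index: int
    bytes_written: int
    start_us: int    # when this subtask began checkpointing (microseconds)
    finish_us: int   # when this subtask finished (microseconds)


def summarize_checkpoint(
    records: Iterable[SubtaskCheckpoint],
) -> Dict[str, Tuple[int, int]]:
    """Return {operator_id: (total_bytes, span_us)}.

    span_us runs from the earliest subtask start to the latest subtask
    finish, so it reflects the slowest subtask of each operator.
    """
    acc: Dict[str, Tuple[int, int, int]] = {}
    for r in records:
        if r.operator_id in acc:
            total, start, finish = acc[r.operator_id]
            acc[r.operator_id] = (
                total + r.bytes_written,
                min(start, r.start_us),
                max(finish, r.finish_us),
            )
        else:
            acc[r.operator_id] = (r.bytes_written, r.start_us, r.finish_us)
    return {op: (total, finish - start) for op, (total, start, finish) in acc.items()}


def paginate(
    items: Sequence[str], limit: int, starting_after: Optional[int] = None
) -> Tuple[List[str], bool]:
    """Return one page of items plus a flag saying whether more remain.

    starting_after is the index of the last item already seen.
    """
    start = 0 if starting_after is None else starting_after + 1
    page = list(items[start:start + limit])
    return page, start + limit < len(items)
```

A large span relative to the checkpoint interval, or a byte count that keeps growing between checkpoints, is the kind of signal this summary is meant to surface.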
Real-Time Output Streaming
For preview jobs and pipelines with preview sinks, real-time output data is streamed to clients via Server-Sent Events (SSE). The API server bridges between the controller's gRPC output stream and the HTTP SSE connection, enabling live result inspection in the web console.
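The bridging step can be illustrated by the framing alone: each record taken from an upstream stream becomes one SSE frame. In this sketch a plain Python iterable stands in for the controller's gRPC stream, and the function name `to_sse`, the event name, and the JSON payload encoding are assumptions. Only the `event:` and `data:` field syntax and the blank-line terminator come from the SSE format itself.

```python
import json
from typing import Iterable, Iterator


def to_sse(records: Iterable[dict], event: str = "message") -> Iterator[str]:
    """Turn each upstream record into one Server-Sent Events frame.

    `records` is a stand-in for the upstream output stream. Frames are
    produced lazily, so nothing is buffered beyond the current record.
    """
    for record in records:
        # json.dumps escapes newlines, so the payload fits on one data line.
        payload = json.dumps(record)
        # A blank line terminates each SSE frame.
        yield f"event: {event}\ndata: {payload}\n\n"
```

Because the function is a generator, an HTTP handler can write each frame to the response as soon as the upstream stream yields a record.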
Usage
Job lifecycle management is applied in the following scenarios:
- Dashboard monitoring: The web UI polls job state and metrics to display real-time pipeline status.
- Error investigation: Operators retrieve paginated error logs to diagnose pipeline failures.
- Checkpoint analysis: Performance engineers inspect checkpoint timing and size to tune checkpoint intervals.
- Preview pipelines: Developers stream live output while iterating on SQL queries.
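The dashboard-monitoring scenario amounts to a polling loop over the job state. The sketch below assumes that Stopped, Finished, and Failed are the terminal states, and it takes the state lookup as an injected callable, since the actual endpoint and client are not specified here. `wait_for_terminal` and its parameters are hypothetical names.

```python
import time
from typing import Callable

# Assumed terminal states, inferred from the state names listed in this page.
TERMINAL_STATES = frozenset({"Stopped", "Finished", "Failed"})


def wait_for_terminal(
    fetch_state: Callable[[], str],
    interval_s: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_state() until the job reaches a terminal state.

    Raises TimeoutError if no terminal state is seen within max_polls calls.
    """
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        sleep(interval_s)
    raise TimeoutError(f"job did not reach a terminal state after {max_polls} polls")
```

Injecting `fetch_state` and `sleep` keeps the loop independent of any particular HTTP client and lets it be exercised without real delays.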