Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Principle:ArroyoSystems Arroyo State Serialization

From Leeroopedia


Template:Principle

State Serialization

Principle: Serializing operator state to durable storage during checkpoints. Each operator's state tables (time-keyed maps, global keyed maps, etc.) are flushed to Parquet files and uploaded to object storage.

Theoretical Basis

State serialization for checkpointing is the mechanism by which an operator's in-memory state is durably persisted so that it can be recovered after failure. The design must address several concerns to be both correct and efficient.

Columnar Format (Parquet)

Arroyo uses Apache Parquet as the serialization format for state data. Parquet provides:

  • Columnar storage: Efficient compression and encoding for typed data, reducing storage costs.
  • Schema evolution: Forward compatibility as state schemas change across versions.
  • Efficient reads: Columnar reads enable selective restoration of specific columns or row groups during recovery.
  • Ecosystem compatibility: Parquet is widely supported by analytics tools, enabling external inspection of checkpoint data.

Incremental Checkpointing

Rather than writing the entire state on every checkpoint, the system supports incremental checkpointing:

  • Only changed state since the last checkpoint is written.
  • Each checkpoint references both its own new data files and previous checkpoint data files that remain valid.
  • This dramatically reduces checkpoint I/O for operators with large state and small update rates.

Hierarchical Metadata

Checkpoint data is organized hierarchically:

  • Global checkpoint record: Top-level metadata for the entire checkpoint (epoch, timestamp, completion status).
  • Per-operator metadata: References to data files for each operator and subtask, encoded as OperatorCheckpointMetadata.
  • Data files: Parquet files containing the actual serialized state.

Object Storage Backend

State files are written to an object storage backend (S3, GCS, or local filesystem), providing:

  • Durability: Object storage provides high durability guarantees.
  • Scalability: No shared-nothing coordination required between workers writing state.
  • Abstraction: A unified storage interface (BackingStore) abstracts the specific storage backend.

State Table Types

Operators may maintain multiple types of state tables, each serialized independently:

  • Time-keyed maps: State partitioned by event time windows.
  • Global keyed maps: Key-value state shared across time boundaries.
  • Expiring state: State with TTLs that are naturally garbage-collected.

Domains

  • Stream_Processing: State serialization enables stateful operators to survive failures.
  • Fault_Tolerance: Durable state persistence is the foundation of checkpoint-based recovery.
  • State_Management: Efficient serialization and storage of operator state is a core concern.

Related Implementation

Implementation:ArroyoSystems_Arroyo_Table_Manager_Checkpoint

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment