Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Heuristic:Risingwavelabs Risingwave Memory Cache Eviction Policy

From Leeroopedia




Knowledge Sources
Domains Streaming, Memory_Management, Optimization
Last Updated 2026-02-09 08:00 GMT

Overview

Stream executors (hash join, over-window) use bounded in-memory caches with explicit eviction policies: hash joins evict every 16 rows, over-window caches hold 1024 entries with 128-entry jitter prevention, and batch processing uses 512-row batches.

Description

RisingWave's stateful streaming executors maintain in-memory caches to avoid repeated state store lookups. Each executor type has independently tuned cache parameters based on its access patterns. The hash join executor triggers eviction every 16 rows to prevent unbounded memory growth during long-running joins. The over-window executor uses a range cache of 1024 entries with a jitter prevention buffer of 128 entries (1/8 of cache size) to prevent thrashing when the working set size is near the cache boundary. Over-window operations are batched in groups of 512 rows for efficiency. The emit-on-window-close sort buffer maintains a 2048-entry cache.

Usage

Apply this heuristic when diagnosing memory issues in stateful stream operators, investigating hash join performance, or tuning over-window queries with large partitions. These cache sizes directly affect the trade-off between memory usage and state store I/O.

The Insight (Rule of Thumb)

  • Hash Join Eviction: Evicts stale cache entries every 16 rows received (EVICT_EVERY_N_ROWS = 16).
  • Over-Window Cache: Maintains 1024 entries (MAGIC_CACHE_SIZE = 1024) with 128-entry jitter buffer (MAGIC_JITTER_PREVENTION = 128).
  • Over-Window Batch Size: Processes 512 rows per batch (MAGIC_BATCH_SIZE = 512).
  • Sort Buffer (EOWC): 2048 entries (CACHE_CAPACITY = 2048).
  • Trade-off: Larger caches reduce state store reads but increase memory usage. The eviction frequency (every 16 rows) is a balance between eviction overhead and memory bounds.

Reasoning

Stateful operators in a streaming system must maintain working state for active keys. Without bounded caches, a hash join over a high-cardinality key space would consume unbounded memory. The 16-row eviction frequency was chosen to amortize the cost of checking cache entries while keeping memory bounded. More frequent eviction (every 1-4 rows) would add overhead; less frequent (every 100+ rows) could cause memory spikes.

The over-window jitter prevention of 128 entries (1/8 of cache size) prevents a common pathology where the working set size oscillates around the cache boundary, causing repeated cache fills and evictions. By keeping 128 extra entries beyond the strict limit, small fluctuations do not trigger full cache rebuilds.

Code evidence from src/stream/src/executor/hash_join.rs line 46:

const EVICT_EVERY_N_ROWS: u32 = 16;

Code evidence from src/stream/src/executor/over_window/range_cache.rs lines 191-192:

const MAGIC_CACHE_SIZE: usize = 1024;
const MAGIC_JITTER_PREVENTION: usize = MAGIC_CACHE_SIZE / 8;  // = 128

Code evidence from src/stream/src/executor/over_window/over_partition.rs line 107:

const MAGIC_BATCH_SIZE: usize = 512;

Code evidence from src/stream/src/executor/eowc/sort_buffer.rs line 62:

const CACHE_CAPACITY: usize = 2048;

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment