Heuristic:Risingwavelabs Risingwave Memory Cache Eviction Policy
| Knowledge Sources | |
|---|---|
| Domains | Streaming, Memory_Management, Optimization |
| Last Updated | 2026-02-09 08:00 GMT |
Overview
Stream executors (hash join, over-window) use bounded in-memory caches with explicit eviction policies: hash joins evict every 16 rows, over-window caches hold 1024 entries with 128-entry jitter prevention, and batch processing uses 512-row batches.
Description
RisingWave's stateful streaming executors maintain in-memory caches to avoid repeated state store lookups. Each executor type has independently tuned cache parameters based on its access patterns. The hash join executor triggers eviction every 16 rows to prevent unbounded memory growth during long-running joins. The over-window executor uses a range cache of 1024 entries with a jitter prevention buffer of 128 entries (1/8 of cache size) to prevent thrashing when the working set size is near the cache boundary. Over-window operations are batched in groups of 512 rows for efficiency. The emit-on-window-close sort buffer maintains a 2048-entry cache.
Usage
Apply this heuristic when diagnosing memory issues in stateful stream operators, investigating hash join performance, or tuning over-window queries with large partitions. These cache sizes directly affect the trade-off between memory usage and state store I/O.
The Insight (Rule of Thumb)
- Hash Join Eviction: Evicts stale cache entries every 16 rows received (
EVICT_EVERY_N_ROWS = 16). - Over-Window Cache: Maintains 1024 entries (
MAGIC_CACHE_SIZE = 1024) with 128-entry jitter buffer (MAGIC_JITTER_PREVENTION = 128). - Over-Window Batch Size: Processes 512 rows per batch (
MAGIC_BATCH_SIZE = 512). - Sort Buffer (EOWC): 2048 entries (
CACHE_CAPACITY = 2048). - Trade-off: Larger caches reduce state store reads but increase memory usage. The eviction frequency (every 16 rows) is a balance between eviction overhead and memory bounds.
Reasoning
Stateful operators in a streaming system must maintain working state for active keys. Without bounded caches, a hash join over a high-cardinality key space would consume unbounded memory. The 16-row eviction frequency was chosen to amortize the cost of checking cache entries while keeping memory bounded. More frequent eviction (every 1-4 rows) would add overhead; less frequent (every 100+ rows) could cause memory spikes.
The over-window jitter prevention of 128 entries (1/8 of cache size) prevents a common pathology where the working set size oscillates around the cache boundary, causing repeated cache fills and evictions. By keeping 128 extra entries beyond the strict limit, small fluctuations do not trigger full cache rebuilds.
Code evidence from src/stream/src/executor/hash_join.rs line 46:
const EVICT_EVERY_N_ROWS: u32 = 16;
Code evidence from src/stream/src/executor/over_window/range_cache.rs lines 191-192:
const MAGIC_CACHE_SIZE: usize = 1024;
const MAGIC_JITTER_PREVENTION: usize = MAGIC_CACHE_SIZE / 8; // = 128
Code evidence from src/stream/src/executor/over_window/over_partition.rs line 107:
const MAGIC_BATCH_SIZE: usize = 512;
Code evidence from src/stream/src/executor/eowc/sort_buffer.rs line 62:
const CACHE_CAPACITY: usize = 2048;