Heuristic: Treeverse lakeFS Batch Delay Tuning
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Performance |
| Last Updated | 2026-02-08 10:00 GMT |
Overview
The lakeFS server batches KV store operations with a configurable maximum delay (default 3ms), trading a small amount of added latency on critical-path queries for higher effective throughput; the useful 1-5ms range corresponds to roughly 200-1000 requests/second to the data store.
Description
lakeFS uses a batching mechanism for metadata store (KV) operations to reduce the number of expensive queries. The MaxBatchDelay parameter controls the maximum time the server will wait to accumulate operations into a single batch before executing. The default of 3 milliseconds represents a careful trade-off: it enables effective batching under concurrent load (reducing database pressure) while keeping added latency imperceptible for typical interactive use cases.
Usage
Use this heuristic when tuning lakeFS server performance, diagnosing unexpectedly slow metadata operations, or configuring lakeFS for high-throughput workloads. Adjusting this value affects the trade-off between per-request latency and overall throughput.
The Insight (Rule of Thumb)
- Action: Configure `graveler.max_batch_delay` based on your concurrency profile.
- Value: The default is 3ms, sized for roughly 300 requests/second per resource; the broader 1-5ms range corresponds to 200-1000 req/s.
- Trade-off: Lower values (1ms) reduce latency but decrease batching effectiveness. Higher values (5ms) improve batching under heavy load but add noticeable latency to every metadata operation.
- Guideline: Keep the value between 1ms and 5ms. Below 1ms the window is too short for requests to coalesce at realistic rates; above 5ms the added latency becomes perceptible on every metadata operation.
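Assuming the standard lakeFS YAML configuration layout, where viper's dotted keys map to nested sections (the key path here is taken from the default shown below in Code Evidence), the setting would look roughly like this in a config file:

```yaml
# Hypothetical lakeFS config excerpt; the nesting of
# graveler.max_batch_delay is assumed from viper conventions.
graveler:
  max_batch_delay: 3ms   # default; 1ms favors latency, 5ms favors batching
```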
Reasoning
The codebase comment explains the rationale explicitly: "Since reducing # of expensive operations is only beneficial when there are a lot of concurrent requests, the sweet spot is probably between 1-5 milliseconds (representing 200-1000 requests/second to the data store). 3ms of delay with ~300 requests/second per resource sounds like a reasonable tradeoff." This is a classic latency-vs-throughput optimization. At low concurrency, the batch delay adds unnecessary latency. At high concurrency, it dramatically reduces database load by combining multiple operations into single batch queries.
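The quoted sweet-spot numbers follow from simple arithmetic: a delay of d milliseconds pays for itself once at least one additional request is likely to arrive inside the window, i.e. when the request rate exceeds 1000/d requests per second. A quick check of the ranges in the comment:

```go
package main

import "fmt"

func main() {
	// Break-even rate: batching a d-ms window helps once more than
	// one request tends to fall inside it, i.e. rate >= 1000/d req/s.
	for _, delayMs := range []float64{1, 3, 5} {
		breakeven := 1000 / delayMs
		fmt.Printf("%.0fms delay -> batching pays off above ~%.0f req/s\n",
			delayMs, breakeven)
	}
	// 1ms -> ~1000 req/s, 3ms -> ~333 req/s, 5ms -> ~200 req/s,
	// matching the 200-1000 req/s range in the source comment.
}
```

This is why the default of 3ms lines up with the comment's "~300 requests/second per resource": 1000ms / 3ms ≈ 333 req/s.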
Code Evidence
Batch delay configuration from `pkg/config/defaults.go:162-169`:
```go
// MaxBatchDelay - 3ms was chosen as a max delay time for critical path queries.
// It trades off amount of queries per second (and thus effectiveness of the batching mechanism) with added latency.
// Since reducing # of expensive operations is only beneficial when there are a lot of concurrent requests,
//
// the sweet spot is probably between 1-5 milliseconds (representing 200-1000 requests/second to the data store).
//
// 3ms of delay with ~300 requests/second per resource sounds like a reasonable tradeoff.
viper.SetDefault("graveler.max_batch_delay", 3*time.Millisecond)
```
Related cache configuration from `pkg/config/defaults.go:155-160`:
```go
viper.SetDefault("graveler.repository_cache.size", 1000)
viper.SetDefault("graveler.repository_cache.expiry", 5*time.Second)
viper.SetDefault("graveler.repository_cache.jitter", 2*time.Second)
viper.SetDefault("graveler.commit_cache.size", 50_000)
viper.SetDefault("graveler.commit_cache.expiry", 10*time.Minute)
viper.SetDefault("graveler.commit_cache.jitter", 2*time.Second)
```