Heuristic:BerriAI Litellm Batch Size Flush Interval Tuning

From Leeroopedia
Knowledge Sources
Domains Observability, Optimization
Last Updated 2026-02-15 16:00 GMT

Overview

Batch logging optimization: accumulate events into 512-item batches and flush every 5-10 seconds to balance throughput against delivery latency across LiteLLM's logging backends.

Description

LiteLLM's observability integrations (LangSmith, Langfuse, S3, SQS, PostHog, etc.) use a CustomBatchLogger base class that accumulates log events in memory and flushes them periodically. This batching dramatically reduces I/O overhead compared to sending each event individually. The same pattern is used for Redis cache writes, where events are accumulated and batch-written to reduce round-trips.
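The accumulate-and-flush pattern can be sketched as follows. This is an illustrative minimal version, not LiteLLM's actual CustomBatchLogger (which is async and flushes from a background task); the key idea is the size-or-interval trigger, whichever fires first.

```python
import time


class BatchAccumulator:
    """Minimal sketch of the accumulate-and-flush pattern
    (illustrative; not LiteLLM's CustomBatchLogger)."""

    def __init__(self, batch_size=512, flush_interval=5.0, sink=None):
        self.batch_size = batch_size
        self.flush_interval = flush_interval
        # sink represents one I/O round-trip per batch, e.g. one HTTP POST
        self.sink = sink or (lambda batch: None)
        self.queue = []
        self.last_flush = time.monotonic()

    def log(self, event):
        self.queue.append(event)
        # flush when the batch fills OR the interval elapses, whichever is first
        if len(self.queue) >= self.batch_size or (
            time.monotonic() - self.last_flush >= self.flush_interval
        ):
            self.flush()

    def flush(self):
        if self.queue:
            self.sink(self.queue)
            self.queue = []
        self.last_flush = time.monotonic()
```

Because each sink call is one round-trip, steady traffic sees up to a batch_size-fold reduction in requests compared to per-event logging.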

Usage

Apply this heuristic when configuring observability integrations or tuning Redis cache performance. The default values work well for most deployments. For high-traffic proxies (>1K RPM), consider increasing batch size. For latency-sensitive monitoring, reduce the flush interval.
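Because the defaults are read via `os.getenv` in `litellm/constants.py` at import time, they can be tuned per deployment with environment variables set before `litellm` is imported. The specific values below are illustrative examples, not recommendations:

```python
import os

# Override batch-logging defaults before `import litellm`, since
# litellm/constants.py reads these env vars at import time.
# The values here are illustrative, not recommendations.
os.environ["DEFAULT_BATCH_SIZE"] = "1024"           # bigger batches for a >1K RPM proxy
os.environ["DEFAULT_FLUSH_INTERVAL_SECONDS"] = "2"  # faster delivery for latency-sensitive monitoring

# import litellm  # import only after setting the overrides above
```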

The Insight (Rule of Thumb)

  • Logging Backends:
    • `DEFAULT_BATCH_SIZE=512` items per flush
    • `DEFAULT_FLUSH_INTERVAL_SECONDS=5` for most backends
    • `DEFAULT_S3_FLUSH_INTERVAL_SECONDS=10` for S3 (higher latency)
    • `DEFAULT_SQS_FLUSH_INTERVAL_SECONDS=10` for SQS (higher latency)
  • Redis Cache Batch Writes:
    • `redis_flush_size=100` items before batch write
    • `socket_timeout=5.0` seconds default
  • Logging Worker:
    • `LOGGING_WORKER_CONCURRENCY=100` concurrent async tasks
    • `LOGGING_WORKER_MAX_QUEUE_SIZE=50_000` max queued items
    • `LOGGING_WORKER_MAX_TIME_PER_COROUTINE=20.0` seconds timeout per task
  • Trade-off: Larger batches and longer intervals reduce I/O overhead but increase delivery latency and memory usage. Smaller values give faster delivery at higher I/O cost.
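The trade-off can be made concrete with back-of-envelope arithmetic. Under a size-or-interval flush rule, the worst-case delivery delay is set by whichever trigger fires first (a simplification that ignores queueing and network time):

```python
def flush_rate_and_latency(events_per_sec, batch_size=512, flush_interval=5.0):
    """Back-of-envelope estimate: flush frequency and worst-case
    delivery delay under a size-or-interval flush rule."""
    # time needed to fill a batch at the given event rate
    fill_time = batch_size / events_per_sec if events_per_sec else float("inf")
    # the earlier trigger wins
    worst_latency = min(fill_time, flush_interval)
    flushes_per_sec = 1.0 / worst_latency
    return flushes_per_sec, worst_latency


# At ~1K RPM (about 16.7 events/s), 512 events take ~30 s to accumulate,
# so the 5 s interval fires first: one flush every 5 s instead of
# ~17 HTTP requests per second with per-event logging.
```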

Reasoning

The 512-item batch size was chosen as a balance between minimizing HTTP request overhead (fewer but larger payloads) and keeping memory usage bounded (512 log entries typically fit in a few MB). The 5-second flush interval ensures logs are delivered within seconds even at low traffic, while the 10-second interval for S3/SQS accounts for higher per-request latency of AWS services. The Redis batch write threshold of 100 items is lower because Redis operations are much faster than HTTP calls, so smaller batches still provide good throughput.
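The "few MB" memory claim can be checked with rough arithmetic; the ~4 KB per serialized log entry used here is an assumption (real entries vary with prompt and response size):

```python
# Assumed average size of one serialized log entry (illustrative).
ENTRY_BYTES = 4 * 1024

# A full 512-item batch:
batch_mb = 512 * ENTRY_BYTES / (1024 * 1024)    # 2.0 MB

# Upper bound if the logging worker queue (LOGGING_WORKER_MAX_QUEUE_SIZE
# = 50_000) fills completely:
queue_mb = 50_000 * ENTRY_BYTES / (1024 * 1024)  # ~195 MB
```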

Code Evidence

Batch defaults from `litellm/constants.py:14-22`:

DEFAULT_BATCH_SIZE = int(os.getenv("DEFAULT_BATCH_SIZE", 512))
DEFAULT_FLUSH_INTERVAL_SECONDS = int(os.getenv("DEFAULT_FLUSH_INTERVAL_SECONDS", 5))
DEFAULT_S3_FLUSH_INTERVAL_SECONDS = int(
    os.getenv("DEFAULT_S3_FLUSH_INTERVAL_SECONDS", 10)
)
DEFAULT_S3_BATCH_SIZE = int(os.getenv("DEFAULT_S3_BATCH_SIZE", 512))
DEFAULT_SQS_FLUSH_INTERVAL_SECONDS = int(
    os.getenv("DEFAULT_SQS_FLUSH_INTERVAL_SECONDS", 10)
)

CustomBatchLogger usage from `litellm/integrations/custom_batch_logger.py:29-30`:

self.flush_interval = flush_interval or litellm.DEFAULT_FLUSH_INTERVAL_SECONDS
self.batch_size = batch_size or litellm.DEFAULT_BATCH_SIZE

Redis batch write from `litellm/caching/redis_cache.py:136-141`:

redis_flush_size: int = 100
# for high traffic, we store the redis results in memory
# and then batch write to redis
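The same buffer-then-flush idea can be sketched generically; this is illustrative, not LiteLLM's RedisCache implementation. Here `write_many` stands in for a pipelined Redis write (e.g. redis-py's `pipeline()`), so `flush_size` keys cost one round-trip instead of `flush_size`:

```python
class BatchedCacheWriter:
    """Sketch of the Redis batch-write pattern: buffer writes in memory,
    then flush them in one round-trip once flush_size is reached.
    (Illustrative; not LiteLLM's RedisCache.)"""

    def __init__(self, write_many, flush_size=100):
        self.write_many = write_many  # e.g. wraps a redis-py pipeline
        self.flush_size = flush_size
        self.pending = {}

    def set(self, key, value):
        self.pending[key] = value
        if len(self.pending) >= self.flush_size:
            # one round-trip for the whole buffer instead of one per key
            self.write_many(dict(self.pending))
            self.pending.clear()
```

With `redis_flush_size=100`, a burst of 1,000 cache writes becomes 10 round-trips.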
