Implementation:BerriAI Litellm Dual Cache

From Leeroopedia
Revision as of 12:09, 16 February 2026 by Admin (auto-imported from implementations/BerriAI_Litellm_Dual_Cache.md)
Attribute | Value
--- | ---
Sources | litellm/caching/dual_cache.py
Domains | Caching, Redis, In-Memory Cache, Performance
Last Updated | 2026-02-15 16:00 GMT

Overview

DualCache is a two-tier cache that writes each entry to both an in-memory cache and a Redis cache. Reads are served from local memory when possible, while Redis keeps the data shared across processes.

Description

DualCache extends BaseCache to provide a write-through caching strategy. On writes, data is stored in both the in-memory cache and Redis. On reads, the in-memory cache is checked first for speed; on a miss, Redis is queried and any value found is back-populated into the in-memory cache. The local_only flag lets an operation skip Redis entirely.

The class supports sync and async operations, batch get and set operations, atomic increment operations (single and pipeline), set-add operations (SADD), TTL management, cache deletion, and cache flushing. To avoid excessive Redis calls for the same keys, batch Redis reads are throttled: a LimitedSizeOrderedDict tracks when each key was last fetched, and keys fetched within a configurable expiry window are not requested from Redis again until that window has elapsed.
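The read and write paths described above can be illustrated with a small, self-contained sketch. This is not the LiteLLM implementation: the class name TwoTierCacheSketch and its methods are hypothetical, and a plain dict stands in for Redis so the example runs without any external service.

```python
from typing import Any, Dict, Optional


class TwoTierCacheSketch:
    """Toy two-tier cache: a local dict in front of a shared dict.

    `shared` is a stand-in for Redis; pass None to run in-memory only.
    """

    def __init__(self, shared: Optional[Dict[str, Any]] = None) -> None:
        self.local: Dict[str, Any] = {}
        self.shared = shared

    def set(self, key: str, value: Any, local_only: bool = False) -> None:
        # Write-through: always update the local tier ...
        self.local[key] = value
        # ... and the shared tier too, unless the caller opts out.
        if self.shared is not None and not local_only:
            self.shared[key] = value

    def get(self, key: str, local_only: bool = False) -> Any:
        # Fast path: the local tier answers first.
        if key in self.local:
            return self.local[key]
        # Slow path: fall back to the shared tier and back-populate on a hit.
        if self.shared is not None and not local_only:
            value = self.shared.get(key)
            if value is not None:
                self.local[key] = value
            return value
        return None


# Two workers share one "Redis" but each has its own local tier.
shared_store: Dict[str, Any] = {}
worker_a = TwoTierCacheSketch(shared_store)
worker_b = TwoTierCacheSketch(shared_store)

worker_a.set("team-a:limit", 100)
first_read = worker_b.get("team-a:limit")          # local miss, served by the shared tier
back_populated = "team-a:limit" in worker_b.local  # worker_b now holds a local copy

worker_a.set("scratch", 1, local_only=True)
scratch_seen_by_b = worker_b.get("scratch")        # never reached the shared tier
```

The sketch leaves out TTLs entirely; in a real two-tier cache the local copy can go stale until it expires, which is why the in-memory TTL is usually kept short relative to the Redis TTL.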

Usage

Import DualCache when you need a fast local cache backed by a shared Redis cache. This is used extensively within LiteLLM's proxy server for rate limiting, API key validation, model routing, and other hot-path operations.

Code Reference

Source Location

litellm/caching/dual_cache.py

Signature

class DualCache(BaseCache):
    def __init__(
        self,
        in_memory_cache: Optional[InMemoryCache] = None,
        redis_cache: Optional[RedisCache] = None,
        default_in_memory_ttl: Optional[float] = None,
        default_redis_ttl: Optional[float] = None,
        default_redis_batch_cache_expiry: Optional[float] = None,
        default_max_redis_batch_cache_size: int = DEFAULT_MAX_REDIS_BATCH_CACHE_SIZE,
    ) -> None

Import

from litellm.caching.dual_cache import DualCache

I/O Contract

Inputs

Parameter | Type | Required | Description
--- | --- | --- | ---
in_memory_cache | Optional[InMemoryCache] | No | In-memory cache instance. Defaults to a new InMemoryCache().
redis_cache | Optional[RedisCache] | No | Redis cache instance. If None, only in-memory caching is used.
default_in_memory_ttl | Optional[float] | No | Default TTL for in-memory cache entries. Falls back to litellm.default_in_memory_ttl.
default_redis_ttl | Optional[float] | No | Default TTL for Redis cache entries. Falls back to litellm.default_redis_ttl.
default_redis_batch_cache_expiry | Optional[float] | No | Throttle window for batch Redis reads (seconds). Defaults to 10.
default_max_redis_batch_cache_size | int | No | Max size of the Redis batch access time tracking dict. Defaults to DEFAULT_MAX_REDIS_BATCH_CACHE_SIZE.
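The two batch parameters above (the throttle window and the maximum tracking size) can be pictured with the following sketch. It is an illustration under assumptions, not the library's code: BatchReadThrottle, keys_to_fetch, and the injectable clock are hypothetical names, and a standard OrderedDict plays the role of the size-limited tracking dict.

```python
import time
from collections import OrderedDict
from typing import Callable, List


class BatchReadThrottle:
    """Remembers when each key was last fetched from the shared tier."""

    def __init__(
        self,
        expiry_seconds: float = 10.0,
        max_size: int = 100,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.expiry_seconds = expiry_seconds
        self.max_size = max_size
        self.clock = clock
        self.last_access: "OrderedDict[str, float]" = OrderedDict()

    def keys_to_fetch(self, keys: List[str]) -> List[str]:
        """Return the keys whose throttle window has elapsed (or never started)."""
        now = self.clock()
        due: List[str] = []
        for key in keys:
            last = self.last_access.get(key)
            if last is None or now - last >= self.expiry_seconds:
                due.append(key)
                self.last_access[key] = now
                self.last_access.move_to_end(key)
                # Bound memory: evict the least recently fetched keys.
                while len(self.last_access) > self.max_size:
                    self.last_access.popitem(last=False)
        return due


# A fake clock keeps the example deterministic.
fake_now = [0.0]
throttle = BatchReadThrottle(expiry_seconds=10.0, max_size=2, clock=lambda: fake_now[0])

first = throttle.keys_to_fetch(["k1", "k2"])   # nothing tracked yet, so both are due
fake_now[0] = 5.0
second = throttle.keys_to_fetch(["k1", "k2"])  # still inside the 10 s window
fake_now[0] = 12.0
third = throttle.keys_to_fetch(["k1", "k3"])   # k1's window elapsed, k3 is new
tracked = list(throttle.last_access)           # k2 was evicted to respect max_size
```

A larger window means fewer Redis round trips but a longer delay before a value written by another process becomes visible to batch reads that missed locally.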

Key Methods

Method | Returns | Description
--- | --- | ---
set_cache(key, value, local_only=False, **kwargs) | None | Sync write to both caches.
get_cache(key, parent_otel_span=None, local_only=False, **kwargs) | Any | Sync read from in-memory first, then Redis.
async_set_cache(key, value, local_only=False, **kwargs) | None | Async write to both caches.
async_get_cache(key, parent_otel_span=None, local_only=False, **kwargs) | Any | Async read from in-memory first, then Redis.
async_batch_get_cache(keys, parent_otel_span=None, local_only=False, **kwargs) | List[Any] | Batch async read with Redis throttling.
async_set_cache_pipeline(cache_list, local_only=False, **kwargs) | None | Batch async write to both caches.
async_increment_cache(key, value, local_only=False, **kwargs) | float | Async atomic increment in both caches.
increment_cache(key, value, local_only=False, **kwargs) | int | Sync atomic increment in both caches.
delete_cache(key) | None | Delete key from both caches.
flush_cache() | None | Flush both caches entirely.
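To show why an increment is applied to both tiers, here is a minimal sketch with two workers counting requests. The helper increment_two_tier is hypothetical and uses plain dicts; returning the shared tier's total when it is available is an assumption made for illustration, not a statement about the library's behavior.

```python
from typing import Dict, Optional


def increment_two_tier(
    local: Dict[str, float],
    shared: Optional[Dict[str, float]],
    key: str,
    amount: float,
    local_only: bool = False,
) -> float:
    """Add `amount` to `key` in the local tier and, unless skipped, the shared tier."""
    local[key] = local.get(key, 0) + amount
    result = local[key]
    if shared is not None and not local_only:
        # A real shared store would use a server-side atomic increment;
        # a plain dict is only a stand-in here.
        shared[key] = shared.get(key, 0) + amount
        result = shared[key]  # assumption: callers get the shared total back
    return result


shared_counts: Dict[str, float] = {}
local_a: Dict[str, float] = {}
local_b: Dict[str, float] = {}

increment_two_tier(local_a, shared_counts, "requests", 1)
increment_two_tier(local_a, shared_counts, "requests", 1)
total = increment_two_tier(local_b, shared_counts, "requests", 1)
```

Each local counter only reflects its own worker's increments, while the shared counter reflects all of them, which is the property a cross-process rate limiter depends on.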

Outputs

Output | Type | Description
--- | --- | ---
Cached value | Any | Returns the cached value from in-memory or Redis, or None if not found.

Usage Examples

import asyncio

from litellm.caching.dual_cache import DualCache
from litellm.caching.in_memory_cache import InMemoryCache
from litellm.caching.redis_cache import RedisCache

# Create a dual cache with both tiers (assumes a Redis server on localhost:6379)
cache = DualCache(
    in_memory_cache=InMemoryCache(),
    redis_cache=RedisCache(host="localhost", port=6379),
    default_in_memory_ttl=60,
    default_redis_ttl=3600,
)

# Sync operations
cache.set_cache("my-key", "my-value")
result = cache.get_cache("my-key")

# Local-only operation (skips Redis)
cache.set_cache("local-key", "local-value", local_only=True)


async def main() -> None:
    # Async operations (await is only valid inside a coroutine)
    await cache.async_set_cache("async-key", {"data": "value"})
    result = await cache.async_get_cache("async-key")

    # Batch operations: read several keys in one call
    results = await cache.async_batch_get_cache(["key1", "key2", "key3"])


asyncio.run(main())
