Implementation:BerriAI Litellm Dual Cache

Attribute	Value
Sources	`litellm/caching/dual_cache.py`
Domains	Caching, Redis, In-Memory Cache, Performance
Last Updated	2026-02-15 16:00 GMT

Overview

DualCache is a two-tier cache that writes each entry to both an in-memory cache and a Redis cache, so that reads can be served locally when possible while Redis keeps the data shared across processes.

Description

DualCache extends BaseCache to provide a write-through caching strategy. On writes, data is stored in both the in-memory cache and Redis. On reads, the in-memory cache is checked first for speed; on a miss, Redis is queried and the result is back-populated into the in-memory cache. The local_only flag allows an operation to skip Redis entirely.

The class supports sync and async operations, batch get/set operations, atomic increment operations (single and pipeline), set-add operations (SADD), TTL management, cache deletion, and cache flushing.

To avoid excessive Redis calls, batch reads are throttled: a LimitedSizeOrderedDict tracks the last access time of each key, and keys that were already looked up within a configurable expiry window are not queried in Redis again.
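The read and write paths described above can be sketched with plain dictionaries. This is a minimal illustration, not the LiteLLM implementation: the class name `TwoTierCacheSketch` and its attributes are hypothetical, TTLs are omitted, and a shared `dict` stands in for Redis.

```python
from typing import Any, Dict, Optional


class TwoTierCacheSketch:
    """Toy model of a write-through, two-tier cache (not LiteLLM code)."""

    def __init__(self, shared: Optional[Dict[str, Any]] = None) -> None:
        self.local: Dict[str, Any] = {}  # stands in for the in-memory tier
        self.shared = shared             # stands in for Redis; None disables it

    def set_cache(self, key: str, value: Any, local_only: bool = False) -> None:
        # Write-through: the local tier is always written ...
        self.local[key] = value
        # ... and the shared tier too, unless it is absent or skipped.
        if self.shared is not None and not local_only:
            self.shared[key] = value

    def get_cache(self, key: str, local_only: bool = False) -> Any:
        # Fast path: serve from the local tier when the key is present.
        if key in self.local:
            return self.local[key]
        if self.shared is None or local_only:
            return None
        # Slow path: fall back to the shared tier and back-populate on a hit.
        value = self.shared.get(key)
        if value is not None:
            self.local[key] = value
        return value


# Two "workers" with separate local tiers and one shared tier.
shared_store: Dict[str, Any] = {}
worker_a = TwoTierCacheSketch(shared_store)
worker_b = TwoTierCacheSketch(shared_store)

worker_a.set_cache("k", "v")
first_read = worker_b.get_cache("k")        # local miss, shared hit
was_back_populated = "k" in worker_b.local  # the hit was copied locally
```

In this sketch a value written with `local_only=True` never reaches the shared tier, so other workers cannot see it.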

Usage

Import DualCache when you need a fast local cache backed by a shared Redis cache. It is used extensively within LiteLLM's proxy server for rate limiting, API key validation, model routing, and other hot-path operations.

Code Reference

Source Location

litellm/caching/dual_cache.py

Signature

class DualCache(BaseCache):
    def __init__(
        self,
        in_memory_cache: Optional[InMemoryCache] = None,
        redis_cache: Optional[RedisCache] = None,
        default_in_memory_ttl: Optional[float] = None,
        default_redis_ttl: Optional[float] = None,
        default_redis_batch_cache_expiry: Optional[float] = None,
        default_max_redis_batch_cache_size: int = DEFAULT_MAX_REDIS_BATCH_CACHE_SIZE,
    ) -> None

Import

from litellm.caching.dual_cache import DualCache

I/O Contract

Inputs

Parameter	Type	Required	Description
`in_memory_cache`	`Optional[InMemoryCache]`	No	In-memory cache instance. Defaults to a new `InMemoryCache()`.
`redis_cache`	`Optional[RedisCache]`	No	Redis cache instance. If `None`, only in-memory caching is used.
`default_in_memory_ttl`	`Optional[float]`	No	Default TTL for in-memory cache entries. Falls back to `litellm.default_in_memory_ttl`.
`default_redis_ttl`	`Optional[float]`	No	Default TTL for Redis cache entries. Falls back to `litellm.default_redis_ttl`.
`default_redis_batch_cache_expiry`	`Optional[float]`	No	Throttle window for batch Redis reads (seconds). Defaults to 10.
`default_max_redis_batch_cache_size`	`int`	No	Max size of the Redis batch access time tracking dict. Defaults to `DEFAULT_MAX_REDIS_BATCH_CACHE_SIZE`.
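One way to picture how the throttle window and the size limit from the table interact is the sketch below. It is an assumption-laden illustration rather than the library's code: `BoundedAccessTimes` is a hypothetical stand-in for LimitedSizeOrderedDict, and `keys_to_fetch` is an invented helper name.

```python
import time
from collections import OrderedDict
from typing import List, MutableMapping, Optional


class BoundedAccessTimes(OrderedDict):
    """Hypothetical size-limited ordered dict of key -> last fetch time."""

    def __init__(self, max_size: int) -> None:
        super().__init__()
        self.max_size = max_size

    def __setitem__(self, key, value) -> None:
        # Evict the oldest inserted entry once the size limit is reached.
        if key not in self and len(self) >= self.max_size:
            self.popitem(last=False)
        super().__setitem__(key, value)


def keys_to_fetch(
    keys: List[str],
    last_fetch: MutableMapping[str, float],
    expiry_s: float,
    now: Optional[float] = None,
) -> List[str]:
    """Return the keys whose last shared-tier lookup is older than expiry_s."""
    now = time.time() if now is None else now
    due = []
    for key in keys:
        last = last_fetch.get(key)
        if last is None or now - last >= expiry_s:
            due.append(key)
            last_fetch[key] = now  # record this lookup
    return due


tracker = BoundedAccessTimes(max_size=2)
first = keys_to_fetch(["a", "b"], tracker, expiry_s=10.0, now=100.0)
second = keys_to_fetch(["a", "b"], tracker, expiry_s=10.0, now=105.0)
third = keys_to_fetch(["a", "c"], tracker, expiry_s=10.0, now=111.0)
```

Here the second call falls inside the 10-second window and returns no keys, while the third call is past the window for "a" and has never seen "c". The size limit keeps the tracking dict from growing without bound when many distinct keys are read.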

Key Methods

Method	Returns	Description
`set_cache(key, value, local_only=False, **kwargs)`	`None`	Sync write to both caches.
`get_cache(key, parent_otel_span=None, local_only=False, **kwargs)`	`Any`	Sync read from in-memory first, then Redis.
`async_set_cache(key, value, local_only=False, **kwargs)`	`None`	Async write to both caches.
`async_get_cache(key, parent_otel_span=None, local_only=False, **kwargs)`	`Any`	Async read from in-memory first, then Redis.
`async_batch_get_cache(keys, parent_otel_span=None, local_only=False, **kwargs)`	`List[Any]`	Batch async read with Redis throttling.
`async_set_cache_pipeline(cache_list, local_only=False, **kwargs)`	`None`	Batch async write to both caches.
`async_increment_cache(key, value, local_only=False, **kwargs)`	`float`	Async atomic increment in both caches.
`increment_cache(key, value, local_only=False, **kwargs)`	`int`	Sync atomic increment in both caches.
`delete_cache(key)`	`None`	Delete key from both caches.
`flush_cache()`	`None`	Flush both caches entirely.
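As a rough illustration of what a batch read across two tiers involves, the sketch below looks up all keys locally, fetches only the misses from the shared tier, and back-populates the hits. The function `batch_get` and its parameters are hypothetical, plain dictionaries replace both caches, and throttling and TTLs are left out.

```python
from typing import Any, Dict, List, Optional


def batch_get(
    keys: List[str],
    local: Dict[str, Any],
    shared: Optional[Dict[str, Any]],
) -> List[Any]:
    """Return one value per key, in order; None marks a miss in both tiers."""
    # Step 1: try every key against the local tier.
    results: List[Any] = [local.get(k) for k in keys]
    missing = [k for k, v in zip(keys, results) if v is None]

    # Step 2: ask the shared tier only for the keys that missed locally.
    if shared is not None and missing:
        fetched = {k: shared.get(k) for k in missing}
        for i, k in enumerate(keys):
            value = fetched.get(k)
            if results[i] is None and value is not None:
                results[i] = value
                local[k] = value  # back-populate the local tier
    return results


local_tier = {"a": 1}
shared_tier = {"b": 2}
values = batch_get(["a", "b", "c"], local_tier, shared_tier)
```

A real client would fetch the missing keys in a single round trip instead of one dictionary lookup per key.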

Outputs

Output	Type	Description
Cached value	`Any`	Returns the cached value from in-memory or Redis, or `None` if not found.

Usage Examples

import asyncio

from litellm.caching.dual_cache import DualCache
from litellm.caching.in_memory_cache import InMemoryCache
from litellm.caching.redis_cache import RedisCache

# Create a dual cache with both tiers (assumes a Redis server on localhost:6379)
cache = DualCache(
    in_memory_cache=InMemoryCache(),
    redis_cache=RedisCache(host="localhost", port=6379),
    default_in_memory_ttl=60,
    default_redis_ttl=3600,
)

# Sync operations
cache.set_cache("my-key", "my-value")
result = cache.get_cache("my-key")

# Local-only operation (skips Redis)
cache.set_cache("local-key", "local-value", local_only=True)


async def main() -> None:
    # Async operations
    await cache.async_set_cache("async-key", {"data": "value"})
    result = await cache.async_get_cache("async-key")

    # Batch operations
    results = await cache.async_batch_get_cache(["key1", "key2", "key3"])


# `await` is only valid inside a coroutine, so run the async calls in an event loop
asyncio.run(main())

Related Pages

BerriAI_Litellm_Redis_Semantic_Cache - a Redis-backed semantic similarity cache
BerriAI_Litellm_Humanloop_Integration - uses DualCache for prompt template caching
