Implementation:BerriAI Litellm Cache Init

From Leeroopedia
Knowledge Sources: https://github.com/BerriAI/litellm
Domains: Caching, System Configuration, LLM Infrastructure
Last Updated: 2026-02-15

Overview

Concrete constructor, provided by the Cache class, for initializing and configuring LiteLLM's response caching subsystem.

Description

The Cache.__init__ method is the central entry point for setting up LiteLLM's response caching. Based on the type parameter (a LiteLLMCacheType enum), it instantiates one of nine concrete backend implementations: InMemoryCache, RedisCache, RedisClusterCache, RedisSemanticCache, QdrantSemanticCache, S3Cache, GCSCache, AzureBlobCache, or DiskCache. It then registers the cache with LiteLLM's callback system (input, success, and async success callbacks), stores the operational parameters (TTL, namespace, supported call types, mode), and applies backend-specific TTL overrides. For Redis backends, it also propagates the namespace to the underlying cache object. The mode parameter controls whether caching is on by default (default_on) or opt-in (default_off).
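
The type-based dispatch described above can be pictured with a small, self-contained sketch. It does not reproduce LiteLLM's code: CacheType, the *Backend classes, and build_backend are hypothetical stand-ins, only three of the nine backends are shown, and the rule that a Redis cluster backend is chosen when redis_startup_nodes is supplied is an assumption inferred from the Redis Cluster usage example on this page.

```python
from enum import Enum


class CacheType(str, Enum):
    """Hypothetical stand-in for LiteLLMCacheType (three members only)."""
    LOCAL = "local"
    REDIS = "redis"
    DISK = "disk"


class InMemoryBackend:
    """Stand-in backend that simply records its constructor kwargs."""
    def __init__(self, **kwargs):
        self.kwargs = kwargs


class RedisBackend(InMemoryBackend):
    pass


class RedisClusterBackend(InMemoryBackend):
    pass


class DiskBackend(InMemoryBackend):
    pass


def build_backend(cache_type, redis_startup_nodes=None, **kwargs):
    """Select a backend from the cache type (illustrative dispatch only)."""
    if cache_type == CacheType.REDIS:
        # Assumption: startup nodes imply cluster mode.
        if redis_startup_nodes:
            return RedisClusterBackend(startup_nodes=redis_startup_nodes, **kwargs)
        return RedisBackend(**kwargs)
    if cache_type == CacheType.DISK:
        return DiskBackend(**kwargs)
    if cache_type == CacheType.LOCAL:
        return InMemoryBackend(**kwargs)
    raise ValueError(f"unsupported cache type: {cache_type!r}")
```

Keeping the selection in one place means callers only ever hold a single object (self.cache in the real class) regardless of which backend was chosen.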

Usage

Import and instantiate the Cache class at application startup. Assign it to litellm.cache to enable response caching globally across all LiteLLM API calls.

Code Reference

Source Location: litellm/caching/caching.py, lines 56-264
Signature: Cache.__init__(self, type=LiteLLMCacheType.LOCAL, mode=CacheMode.default_on, host=None, port=None, password=None, namespace=None, ttl=None, default_in_memory_ttl=None, default_in_redis_ttl=None, similarity_threshold=None, supported_call_types=[...], azure_account_url=None, azure_blob_container=None, s3_bucket_name=None, s3_region_name=None, s3_api_version=None, s3_use_ssl=True, s3_verify=None, s3_endpoint_url=None, s3_aws_access_key_id=None, s3_aws_secret_access_key=None, s3_aws_session_token=None, s3_config=None, s3_path=None, gcs_bucket_name=None, gcs_path_service_account=None, gcs_path=None, redis_semantic_cache_embedding_model="text-embedding-ada-002", redis_semantic_cache_index_name=None, redis_flush_size=None, redis_startup_nodes=None, disk_cache_dir=None, qdrant_api_base=None, qdrant_api_key=None, qdrant_collection_name=None, qdrant_quantization_config=None, qdrant_semantic_cache_embedding_model="text-embedding-ada-002", gcp_service_account=None, gcp_ssl_ca_certs=None, **kwargs)
Import: from litellm.caching.caching import Cache

I/O Contract

Inputs:

| Parameter | Type | Default | Description |
|---|---|---|---|
| type | Optional[LiteLLMCacheType] | LiteLLMCacheType.LOCAL | The cache backend to use |
| mode | Optional[CacheMode] | CacheMode.default_on | Whether caching is on by default or opt-in |
| host | Optional[str] | None | Redis host address |
| port | Optional[str] | None | Redis port number |
| password | Optional[str] | None | Redis password |
| namespace | Optional[str] | None | Namespace prefix for cache keys |
| ttl | Optional[float] | None | Default time-to-live in seconds for cached entries |
| default_in_memory_ttl | Optional[float] | None | TTL override for in-memory (LOCAL) cache |
| default_in_redis_ttl | Optional[float] | None | TTL override for Redis-based caches |
| similarity_threshold | Optional[float] | None | Cosine similarity threshold for semantic caching |
| supported_call_types | Optional[List[CachingSupportedCallTypes]] | All call types | Which LLM call types to cache |
| redis_startup_nodes | Optional[List] | None | Startup nodes for Redis Cluster mode |
| redis_flush_size | Optional[int] | None | Number of keys to flush at a time in batch operations |
| s3_bucket_name | Optional[str] | None | S3 bucket name for S3 cache |
| gcs_bucket_name | Optional[str] | None | GCS bucket name for GCS cache |
| disk_cache_dir | Optional[str] | None | Directory for disk-based cache |
| **kwargs | Any | -- | Additional keyword arguments passed to the underlying cache backend constructor (e.g., redis.Redis() kwargs) |
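
How ttl interacts with the two backend-specific overrides can be sketched as a small pure function. This is an illustration under stated assumptions, not the library's code: resolve_ttl and the backend strings "local", "redis", and "redis-semantic" are hypothetical names, and the sketch assumes that a non-None override replaces ttl for its own backend family and is ignored otherwise.

```python
from typing import Optional


def resolve_ttl(
    backend: str,
    ttl: Optional[float] = None,
    default_in_memory_ttl: Optional[float] = None,
    default_in_redis_ttl: Optional[float] = None,
) -> Optional[float]:
    """Return the effective TTL in seconds for the chosen backend."""
    effective = ttl
    # Assumption: the in-memory override applies only to the local backend.
    if backend == "local" and default_in_memory_ttl is not None:
        effective = default_in_memory_ttl
    # Assumption: the Redis override applies to Redis-based backends.
    if backend in ("redis", "redis-semantic") and default_in_redis_ttl is not None:
        effective = default_in_redis_ttl
    return effective
```

Under these assumptions, resolve_ttl("local", ttl=600.0, default_in_memory_ttl=60.0) yields 60.0, while the same arguments with backend "redis" keep the general 600.0.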

Outputs:

| Return Type | Description |
|---|---|
| None | The constructor modifies the instance in place. After initialization, self.cache holds the concrete backend, and LiteLLM callbacks are registered. |
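
The callback registration side effect can be pictured as adding a marker to three hook lists. The sketch below is hypothetical throughout: CallbackRegistry and register_cache_callbacks are invented for illustration, and the guard against duplicate entries is an assumption about sensible behaviour rather than a statement about LiteLLM's implementation.

```python
class CallbackRegistry:
    """Hypothetical stand-in for the input, success, and async success hook lists."""

    def __init__(self):
        self.input_callback = []
        self.success_callback = []
        self.async_success_callback = []


def register_cache_callbacks(registry, name="cache"):
    """Add the cache marker to every hook list, skipping lists that already have it."""
    for hooks in (
        registry.input_callback,
        registry.success_callback,
        registry.async_success_callback,
    ):
        if name not in hooks:
            hooks.append(name)
```

With a guard like this, constructing the cache more than once in the same process would not stack duplicate hooks.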

Usage Examples

Basic in-memory cache (default):

import litellm
from litellm.caching.caching import Cache

# Simplest initialization -- uses in-memory cache
litellm.cache = Cache()

Redis cache with custom TTL and namespace:

import litellm
from litellm.caching.caching import Cache
from litellm.types.caching import LiteLLMCacheType

litellm.cache = Cache(
    type=LiteLLMCacheType.REDIS,
    host="redis.example.com",
    port="6379",
    password="my-secret",
    namespace="prod-v1",
    ttl=600.0,  # 10 minutes
)
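
To make the role of namespace concrete, here is a minimal sketch of a namespaced cache key. It is not LiteLLM's key-derivation code: make_cache_key is a hypothetical helper, and both the SHA-256 hash over sorted JSON and the "namespace:digest" layout are assumptions chosen for illustration.

```python
import hashlib
import json
from typing import Optional


def make_cache_key(request_params: dict, namespace: Optional[str] = None) -> str:
    """Derive a deterministic key from request parameters, optionally namespaced."""
    # Sorting keys makes the key independent of dict insertion order.
    canonical = json.dumps(request_params, sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Assumption: the namespace is prepended with a colon separator.
    return f"{namespace}:{digest}" if namespace else digest
```

A prefix such as "prod-v1" lets several deployments share one Redis instance without reading each other's entries, and bumping the prefix effectively invalidates old entries without deleting them.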

Redis Cluster with GCP IAM authentication:

import litellm
from litellm.caching.caching import Cache
from litellm.types.caching import LiteLLMCacheType

litellm.cache = Cache(
    type=LiteLLMCacheType.REDIS,
    host="redis-cluster.internal",
    port="6379",
    password="cluster-pass",
    redis_startup_nodes=[
        {"host": "node1.internal", "port": "6379"},
        {"host": "node2.internal", "port": "6379"},
    ],
    gcp_service_account="/path/to/sa.json",
    gcp_ssl_ca_certs="/path/to/ca.pem",
)

Semantic cache with Redis and a similarity threshold:

import litellm
from litellm.caching.caching import Cache
from litellm.types.caching import LiteLLMCacheType

litellm.cache = Cache(
    type=LiteLLMCacheType.REDIS_SEMANTIC,
    host="redis.example.com",
    port="6379",
    password="my-secret",
    similarity_threshold=0.8,
    redis_semantic_cache_embedding_model="text-embedding-ada-002",
)
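
The effect of similarity_threshold can be illustrated with plain cosine similarity between two embedding vectors. This is a conceptual sketch only: cosine_similarity and is_semantic_hit are hypothetical helpers, the vectors are toy values rather than real embeddings, and an actual vector store may compare distances instead of similarities.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def is_semantic_hit(
    query_vec: Sequence[float],
    cached_vec: Sequence[float],
    similarity_threshold: float,
) -> bool:
    """Assumption: a cached entry is reused when similarity meets the threshold."""
    return cosine_similarity(query_vec, cached_vec) >= similarity_threshold
```

A higher threshold makes the cache stricter: fewer prompts count as equivalent, so there are fewer hits but a lower risk of returning a response written for a different question.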

S3 cache with specific call types:

import litellm
from litellm.caching.caching import Cache
from litellm.types.caching import LiteLLMCacheType

litellm.cache = Cache(
    type=LiteLLMCacheType.S3,
    s3_bucket_name="my-llm-cache",
    s3_region_name="us-east-1",
    s3_path="cache/v1/",
    supported_call_types=["completion", "acompletion"],
    ttl=3600.0,
)

Opt-in caching mode (default off):

import litellm
from litellm.caching.caching import Cache, CacheMode
from litellm.types.caching import LiteLLMCacheType

litellm.cache = Cache(
    type=LiteLLMCacheType.LOCAL,
    mode=CacheMode.default_off,
)
# Now caching only occurs when the caller explicitly sets caching=True
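
The difference between the two modes reduces to a single decision per request, sketched below. Mode and should_use_cache here are hypothetical stand-ins written for this page, and caller_opted_in abstracts whatever per-request flag the caller uses to opt in.

```python
from enum import Enum


class Mode(str, Enum):
    """Hypothetical stand-in for CacheMode."""
    default_on = "default_on"
    default_off = "default_off"


def should_use_cache(mode: Mode, caller_opted_in: bool = False) -> bool:
    """Decide whether a single request should consult the cache."""
    if mode == Mode.default_on:
        return True  # every supported call is cached
    # default_off: cache only when the caller asked for it on this request
    return caller_opted_in
```

Opt-in mode suits services where only some endpoints produce responses that are safe to reuse.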
