Implementation:BerriAI Litellm Cache Key Generator
| Knowledge Sources | https://github.com/BerriAI/litellm |
|---|---|
| Domains | Caching, Cryptography, LLM Infrastructure |
| Last Updated | 2026-02-15 |
Overview
Concrete methods, provided by the Cache class, for generating deterministic, SHA-256-hashed cache keys from LLM request parameters.
Description
The cache key generation system in LiteLLM consists of two primary methods on the Cache class:
get_cache_key(**kwargs): The public method that builds a raw cache key string from LLM request parameters, hashes it, adds a namespace prefix, and memoises the result. It iterates over the supplied keyword arguments and, for each one that is a known LLM API parameter (as listed by ModelParamHelper._get_all_llm_api_params()), appends the parameter name and value to the raw key. For the model parameter, it resolves caching groups and model groups to ensure cross-deployment cache sharing. For the file parameter (used in transcription), it uses file checksums or names. Provider-specific optional parameters are included only when the litellm.enable_caching_on_provider_specific_optional_params feature flag is enabled.
_get_hashed_cache_key(cache_key): A static method that takes the raw concatenated key string and produces a SHA-256 hexadecimal digest. This ensures fixed-length, collision-resistant keys suitable for any backend.
Supporting private methods include _get_param_value, _get_model_param_value, _get_caching_group, _get_file_param_value, _get_preset_cache_key_from_kwargs, _set_preset_cache_key_in_kwargs, and _add_namespace_to_cache_key.
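The flow described above (concatenate known parameters, hash, prefix a namespace) can be sketched with the standard library alone. This is an illustrative approximation rather than LiteLLM's code: KNOWN_PARAMS, build_raw_key, and make_cache_key are hypothetical names, and the real method additionally resolves model groups, file checksums, and preset keys.

```python
import hashlib
from typing import Any, Optional

# Hypothetical stand-in for the parameter list the real class obtains from
# ModelParamHelper._get_all_llm_api_params().
KNOWN_PARAMS = {"model", "messages", "temperature", "top_p", "tools", "input"}


def build_raw_key(**kwargs: Any) -> str:
    """Concatenate 'name: value' pairs for known, non-None parameters."""
    raw = ""
    for name, value in kwargs.items():
        if name in KNOWN_PARAMS and value is not None:
            raw += f"{name}: {value}"
    return raw


def make_cache_key(namespace: Optional[str] = None, **kwargs: Any) -> str:
    """Hash the raw key with SHA-256 and optionally prefix a namespace."""
    digest = hashlib.sha256(build_raw_key(**kwargs).encode()).hexdigest()
    return f"{namespace}:{digest}" if namespace else digest


key = make_cache_key(
    namespace="tenant-42",
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
    user_id="ignored",  # not in KNOWN_PARAMS, so it does not affect the key
)
print(key)
```

In this sketch, unknown parameters are simply dropped; the real method instead includes them when the provider-specific feature flag is enabled.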
Usage
Call Cache.get_cache_key(**kwargs) whenever you need to compute or retrieve the cache key for a given LLM request. This is called internally by Cache.get_cache and Cache.add_cache, but can also be called directly for debugging or logging purposes.
Code Reference
| Source Location | litellm/caching/caching.py, lines 266-430 |
|---|---|
| Signature (get_cache_key) | Cache.get_cache_key(self, **kwargs) -> str |
| Signature (_get_hashed_cache_key) | Cache._get_hashed_cache_key(cache_key: str) -> str (static method) |
| Import | from litellm.caching.caching import Cache |
Key source excerpt:
    def get_cache_key(self, **kwargs) -> str:
        cache_key = ""
        preset_cache_key = self._get_preset_cache_key_from_kwargs(**kwargs)
        if preset_cache_key is not None:
            return preset_cache_key
        combined_kwargs = ModelParamHelper._get_all_llm_api_params()
        litellm_param_kwargs = all_litellm_params
        for param in kwargs:
            if param in combined_kwargs:
                param_value = self._get_param_value(param, kwargs)
                if param_value is not None:
                    cache_key += f"{str(param)}: {str(param_value)}"
            elif param not in litellm_param_kwargs:
                if litellm.enable_caching_on_provider_specific_optional_params is True:
                    if kwargs[param] is None:
                        continue
                    param_value = kwargs[param]
                    cache_key += f"{str(param)}: {str(param_value)}"
        hashed_cache_key = Cache._get_hashed_cache_key(cache_key)
        hashed_cache_key = self._add_namespace_to_cache_key(hashed_cache_key, **kwargs)
        self._set_preset_cache_key_in_kwargs(
            preset_cache_key=hashed_cache_key, **kwargs
        )
        return hashed_cache_key

    @staticmethod
    def _get_hashed_cache_key(cache_key: str) -> str:
        hash_object = hashlib.sha256(cache_key.encode())
        hash_hex = hash_object.hexdigest()
        return hash_hex
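The loop in the excerpt above appends parameters in the iteration order of kwargs and renders each value with str(), so the raw key is sensitive to both. The sketch below re-creates only that concatenation step outside the library (raw_key and digest are hypothetical helpers, and the known-parameter filtering is omitted) to show the effect when comparing keys during debugging. Whether argument order matters in practice depends on how callers pass their arguments.

```python
import hashlib
from typing import Any


def raw_key(**kwargs: Any) -> str:
    # Mirrors the "name: value" concatenation in the excerpt, skipping None.
    return "".join(f"{k}: {v}" for k, v in kwargs.items() if v is not None)


def digest(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()


a = raw_key(model="gpt-4", temperature=0.7)
b = raw_key(temperature=0.7, model="gpt-4")  # same values, different order
c = raw_key(model="gpt-4", temperature=0.70)  # same float, same str() form

print(a)  # model: gpt-4temperature: 0.7
print(digest(a) == digest(b))  # False: order changes the raw string
print(digest(a) == digest(c))  # True: 0.70 and 0.7 render identically
```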
I/O Contract
Inputs for get_cache_key:
| Parameter | Type | Description |
|---|---|---|
| **kwargs | Dict[str, Any] | The keyword arguments passed to a litellm.completion(), litellm.embedding(), or other LLM API call. Relevant keys include model, messages, temperature, top_p, tools, input, file, metadata, litellm_params, and cache (for dynamic cache control). |
Outputs for get_cache_key:
| Return Type | Description |
|---|---|
| str | A SHA-256 hex digest string (64 characters), optionally prefixed with {namespace}:. Example: "prod-v1:a3f2b8c9d4e5..." |
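When inspecting keys in logs or in a backend, it can help to separate the optional namespace prefix from the digest. The following is a minimal sketch, assuming the key has the {namespace}:{digest} shape described above; split_cache_key is a hypothetical helper and not part of LiteLLM.

```python
import re
from typing import Optional, Tuple

_SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def split_cache_key(key: str) -> Tuple[Optional[str], str]:
    """Return (namespace, digest); namespace is None when there is no prefix."""
    # rpartition splits on the last colon, so colons inside the namespace survive.
    namespace, sep, digest = key.rpartition(":")
    if not _SHA256_HEX.fullmatch(digest):
        raise ValueError(f"not a SHA-256 hex digest: {digest!r}")
    return (namespace if sep else None), digest


placeholder = "a" * 64  # placeholder digest for illustration only
print(split_cache_key(f"prod-v1:{placeholder}"))
print(split_cache_key(placeholder))
```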
Inputs for _get_hashed_cache_key:
| Parameter | Type | Description |
|---|---|---|
| cache_key | str | The raw, unhashed cache key string (concatenated parameter names and values) |
Outputs for _get_hashed_cache_key:
| Return Type | Description |
|---|---|
| str | A 64-character hexadecimal SHA-256 digest |
Usage Examples
Computing a cache key for a completion request:
    from litellm.caching.caching import Cache
    from litellm.types.caching import LiteLLMCacheType

    cache = Cache(type=LiteLLMCacheType.LOCAL)
    cache_key = cache.get_cache_key(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello, world!"}],
        temperature=0.7,
    )
    print(cache_key)
    # e.g., "a3f2b8c9d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1"
Using namespace isolation:
    from litellm.caching.caching import Cache
    from litellm.types.caching import LiteLLMCacheType

    cache = Cache(type=LiteLLMCacheType.REDIS, host="localhost", port="6379", namespace="tenant-42")
    cache_key = cache.get_cache_key(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(cache_key)
    # e.g., "tenant-42:a3f2b8c9d4e5..."
Dynamic namespace via per-request cache control:
    # The dynamic namespace overrides the instance-level namespace
    cache_key = cache.get_cache_key(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello"}],
        cache={"namespace": "experiment-7"},
    )
    print(cache_key)
    # e.g., "experiment-7:a3f2b8c9d4e5..."
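The override shown in this example amounts to a simple precedence rule. A minimal sketch follows, assuming only the two sources named on this page (the per-request cache dict and the instance-level namespace); resolve_namespace is a hypothetical helper, and the library may consult further sources.

```python
from typing import Any, Dict, Optional


def resolve_namespace(
    instance_namespace: Optional[str],
    cache_control: Optional[Dict[str, Any]] = None,
) -> Optional[str]:
    """Prefer the per-request namespace, falling back to the instance one."""
    dynamic = (cache_control or {}).get("namespace")
    return dynamic or instance_namespace


print(resolve_namespace("tenant-42", {"namespace": "experiment-7"}))  # experiment-7
print(resolve_namespace("tenant-42"))  # tenant-42
print(resolve_namespace(None))  # None
```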
Hashing a raw key directly:
    from litellm.caching.caching import Cache

    raw_key = "model: gpt-4messages: [{'role': 'user', 'content': 'Hello'}]temperature: 0.7"
    hashed = Cache._get_hashed_cache_key(raw_key)
    print(hashed)
    # prints a 64-character hexadecimal SHA-256 digest
Cross-model caching via caching groups:
    cache_key = cache.get_cache_key(
        model="gpt-4-deployment-a",
        messages=[{"role": "user", "content": "Hello"}],
        metadata={
            "model_group": "gpt-4",
            "caching_groups": [["gpt-4", "gpt-4-turbo"]],
        },
    )
    # Uses "['gpt-4', 'gpt-4-turbo']" as the model component in the key,
    # so gpt-4 and gpt-4-turbo share cache entries.
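The group lookup behind this example can be approximated as follows. This is a sketch of the behavior described on this page, not the library's implementation: resolve_model_component is a hypothetical name, and the fallback order (caching group, then model group, then the model itself) is an assumption.

```python
from typing import Any, Dict, List, Optional


def resolve_model_component(
    model: str, metadata: Optional[Dict[str, Any]] = None
) -> str:
    """Pick the string that stands in for the model in the raw cache key."""
    metadata = metadata or {}
    model_group: Optional[str] = metadata.get("model_group")
    caching_groups: List[List[str]] = metadata.get("caching_groups") or []
    for group in caching_groups:
        if model_group in group:
            # Every member of the group maps to the same string, so they share keys.
            return str(group)
    return model_group or model


groups = [["gpt-4", "gpt-4-turbo"]]
a = resolve_model_component(
    "gpt-4-deployment-a", {"model_group": "gpt-4", "caching_groups": groups}
)
b = resolve_model_component(
    "gpt-4-turbo-deployment-b", {"model_group": "gpt-4-turbo", "caching_groups": groups}
)
print(a)  # ['gpt-4', 'gpt-4-turbo']
print(a == b)  # True
print(resolve_model_component("gpt-4"))  # gpt-4
```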