Implementation:BerriAI Litellm Cache Key Generator

From Leeroopedia
Knowledge Sources: https://github.com/BerriAI/litellm
Domains: Caching, Cryptography, LLM Infrastructure
Last Updated: 2026-02-15

Overview

The Cache class provides concrete methods for generating deterministic, SHA-256-hashed cache keys from LLM request parameters.

Description

The cache key generation system in LiteLLM consists of two primary methods on the Cache class:

  • get_cache_key(**kwargs): The public method that builds a raw cache key string from LLM request parameters, hashes it, adds a namespace prefix, and memoises the result in the request kwargs as a preset cache key, which is returned directly if already present. It iterates over the supplied keyword arguments and, for each one that is a known LLM API parameter (as listed by ModelParamHelper._get_all_llm_api_params()), appends the parameter name and value to the key; parameters whose value is None are skipped. For the model parameter, it resolves caching groups and model groups so that deployments in the same group share cache entries. For the file parameter (used in transcription), it uses file checksums or names. Provider-specific optional parameters (those that are neither known LLM API parameters nor LiteLLM-internal parameters) are included only when the litellm.enable_caching_on_provider_specific_optional_params feature flag is enabled.
  • _get_hashed_cache_key(cache_key): A static method that takes the raw concatenated key string and produces a SHA-256 hexadecimal digest. This ensures fixed-length, collision-resistant keys suitable for any backend.

Supporting private methods include _get_param_value, _get_model_param_value, _get_caching_group, _get_file_param_value, _get_preset_cache_key_from_kwargs, _set_preset_cache_key_in_kwargs, and _add_namespace_to_cache_key.
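
The build-then-hash flow described above can be sketched without LiteLLM installed. This is a simplified illustration, not the library's code: KNOWN_LLM_PARAMS, build_raw_key, and hash_key are hypothetical names, and the parameter set is a small stand-in for whatever ModelParamHelper._get_all_llm_api_params() returns. Model-group resolution, file handling, namespacing, and memoisation are left out.

```python
import hashlib
from typing import Any, Dict, Set

# Hypothetical stand-in for the set of known LLM API parameters;
# the real set in LiteLLM is much larger.
KNOWN_LLM_PARAMS: Set[str] = {"model", "messages", "temperature", "top_p"}


def build_raw_key(request: Dict[str, Any]) -> str:
    """Concatenate "name: value" for each known, non-None parameter."""
    raw_key = ""
    for name, value in request.items():
        if name in KNOWN_LLM_PARAMS and value is not None:
            raw_key += f"{name}: {value}"
    return raw_key


def hash_key(raw_key: str) -> str:
    """Return the SHA-256 hex digest of the raw key (always 64 characters)."""
    return hashlib.sha256(raw_key.encode()).hexdigest()


request = {
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
    "user_id": "abc",  # not in KNOWN_LLM_PARAMS, so this sketch skips it
}
raw = build_raw_key(request)
print(raw)
print(hash_key(raw))
```

The raw string grows with the request, while the digest stays at 64 hexadecimal characters, which is what makes the hashed form convenient as a key in any backend.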

Usage

Call Cache.get_cache_key(**kwargs) whenever you need to compute or retrieve the cache key for a given LLM request. This is called internally by Cache.get_cache and Cache.add_cache, but can also be called directly for debugging or logging purposes.

Code Reference

Source Location: litellm/caching/caching.py, lines 266-430
Signature (get_cache_key): Cache.get_cache_key(self, **kwargs) -> str
Signature (_get_hashed_cache_key): Cache._get_hashed_cache_key(cache_key: str) -> str (static method)
Import: from litellm.caching.caching import Cache

Key source excerpt:

def get_cache_key(self, **kwargs) -> str:
    cache_key = ""
    preset_cache_key = self._get_preset_cache_key_from_kwargs(**kwargs)
    if preset_cache_key is not None:
        return preset_cache_key

    combined_kwargs = ModelParamHelper._get_all_llm_api_params()
    litellm_param_kwargs = all_litellm_params
    for param in kwargs:
        if param in combined_kwargs:
            param_value = self._get_param_value(param, kwargs)
            if param_value is not None:
                cache_key += f"{str(param)}: {str(param_value)}"
        elif param not in litellm_param_kwargs:
            if litellm.enable_caching_on_provider_specific_optional_params is True:
                if kwargs[param] is None:
                    continue
                param_value = kwargs[param]
                cache_key += f"{str(param)}: {str(param_value)}"

    hashed_cache_key = Cache._get_hashed_cache_key(cache_key)
    hashed_cache_key = self._add_namespace_to_cache_key(hashed_cache_key, **kwargs)
    self._set_preset_cache_key_in_kwargs(
        preset_cache_key=hashed_cache_key, **kwargs
    )
    return hashed_cache_key

@staticmethod
def _get_hashed_cache_key(cache_key: str) -> str:
    hash_object = hashlib.sha256(cache_key.encode())
    hash_hex = hash_object.hexdigest()
    return hash_hex
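
One consequence of the excerpt, assuming the loop really does walk kwargs in call order, is that the raw key depends on the order in which parameters are passed. The standalone sketch below mirrors only the concatenation and hashing steps; sketch_key is a hypothetical helper, not part of LiteLLM, and it omits parameter filtering, namespacing, and memoisation.

```python
import hashlib


def sketch_key(**kwargs) -> str:
    # Join "name: value" pairs in call order, skipping None values,
    # then hash the result with SHA-256.
    raw = "".join(
        f"{name}: {value}" for name, value in kwargs.items() if value is not None
    )
    return hashlib.sha256(raw.encode()).hexdigest()


key_a = sketch_key(model="gpt-4", temperature=0.7)
key_b = sketch_key(temperature=0.7, model="gpt-4")
key_c = sketch_key(model="gpt-4", temperature=0.7, top_p=None)

print(key_a == key_b)  # False: same parameters, different order
print(key_a == key_c)  # True: None-valued parameters are skipped
```

If the library behaves like this sketch, callers that want cache hits across code paths should pass parameters in a consistent order.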

I/O Contract

Inputs for get_cache_key:

Parameter | Type | Description
**kwargs | Dict[str, Any] | The keyword arguments passed to a litellm.completion(), litellm.embedding(), or other LLM API call. Relevant keys include model, messages, temperature, top_p, tools, input, file, metadata, litellm_params, and cache (for dynamic cache control).

Outputs for get_cache_key:

Return Type | Description
str | A SHA-256 hex digest string (64 characters), optionally prefixed with {namespace}:. Example: "prod-v1:a3f2b8c9d4e5..."

Inputs for _get_hashed_cache_key:

Parameter | Type | Description
cache_key | str | The raw, unhashed cache key string (concatenated parameter names and values)

Outputs for _get_hashed_cache_key:

Return Type | Description
str | A 64-character hexadecimal SHA-256 digest
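
The output contract (a 64-character hex digest, optionally preceded by a namespace and a colon) can be checked with a small helper when debugging or logging keys. split_cache_key is a hypothetical utility written for this page, not a LiteLLM function; it assumes the digest is lowercase hex, as hashlib's hexdigest() produces.

```python
import re
from typing import Optional, Tuple

# A SHA-256 hex digest is exactly 64 lowercase hexadecimal characters.
_SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def split_cache_key(cache_key: str) -> Tuple[Optional[str], str]:
    """Split "namespace:digest" into (namespace, digest).

    Returns None for the namespace when the key has no prefix. Splitting
    on the last colon keeps namespaces that contain colons intact.
    """
    namespace, sep, digest = cache_key.rpartition(":")
    if not _SHA256_HEX.fullmatch(digest):
        raise ValueError(f"not a SHA-256 hex digest: {digest!r}")
    return (namespace if sep else None), digest


digest = "a" * 64  # placeholder digest for illustration
print(split_cache_key(digest))
print(split_cache_key("tenant-42:" + digest))
```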

Usage Examples

Computing a cache key for a completion request:

from litellm.caching.caching import Cache
from litellm.types.caching import LiteLLMCacheType

cache = Cache(type=LiteLLMCacheType.LOCAL)

cache_key = cache.get_cache_key(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello, world!"}],
    temperature=0.7,
)
print(cache_key)
# e.g., "a3f2b8c9d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1"

Using namespace isolation:

from litellm.caching.caching import Cache
from litellm.types.caching import LiteLLMCacheType

cache = Cache(type=LiteLLMCacheType.REDIS, host="localhost", port="6379", namespace="tenant-42")

cache_key = cache.get_cache_key(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)
print(cache_key)
# e.g., "tenant-42:a3f2b8c9d4e5..."

Dynamic namespace via per-request cache control:

cache_key = cache.get_cache_key(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
    cache={"namespace": "experiment-7"},
)
# The dynamic namespace overrides the instance-level namespace
print(cache_key)
# e.g., "experiment-7:a3f2b8c9d4e5..."
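
The namespace precedence shown in the two examples above can be sketched as follows. add_namespace is a hypothetical stand-in for _add_namespace_to_cache_key and covers only the two sources this page describes (the per-request cache dict and the instance-level namespace); the real method may consult other sources as well.

```python
from typing import Any, Dict, Optional


def add_namespace(
    hashed_key: str,
    instance_namespace: Optional[str] = None,
    cache_control: Optional[Dict[str, Any]] = None,
) -> str:
    """Prefix the digest with a namespace, preferring the per-request one."""
    dynamic_namespace = (cache_control or {}).get("namespace")
    namespace = dynamic_namespace or instance_namespace
    if namespace:
        return f"{namespace}:{hashed_key}"
    return hashed_key


digest = "a" * 64  # placeholder digest for illustration
print(add_namespace(digest))
print(add_namespace(digest, instance_namespace="tenant-42"))
print(add_namespace(digest, "tenant-42", {"namespace": "experiment-7"}))
```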

Hashing a raw key directly:

from litellm.caching.caching import Cache

raw_key = "model: gpt-4messages: [{'role': 'user', 'content': 'Hello'}]temperature: 0.7"
hashed = Cache._get_hashed_cache_key(raw_key)
print(hashed)
# Prints a 64-character hexadecimal digest; the exact value depends on the raw key.

Cross-model caching via caching groups:

cache_key = cache.get_cache_key(
    model="gpt-4-deployment-a",
    messages=[{"role": "user", "content": "Hello"}],
    metadata={
        "model_group": "gpt-4",
        "caching_groups": [["gpt-4", "gpt-4-turbo"]],
    },
)
# Uses "['gpt-4', 'gpt-4-turbo']" as the model component in the key,
# so gpt-4 and gpt-4-turbo share cache entries.
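
The model resolution in this example can be approximated with the sketch below. resolve_model_component is a hypothetical function that follows the behaviour described on this page (caching group first, then model group, then the raw model name); it is not the source of _get_model_param_value or _get_caching_group, and it reads metadata only from the top-level metadata argument.

```python
from typing import Any, Dict, Optional


def resolve_model_component(model: str, metadata: Optional[Dict[str, Any]] = None) -> str:
    """Pick the string used for the model part of the raw cache key."""
    metadata = metadata or {}
    model_group = metadata.get("model_group")
    # A caching group that contains the model group wins, so every
    # member of that group maps to the same key component.
    for group in metadata.get("caching_groups") or []:
        if model_group in group:
            return str(group)
    # Otherwise fall back to the model group, then to the model name.
    return model_group or model


print(resolve_model_component(
    "gpt-4-deployment-a",
    {"model_group": "gpt-4", "caching_groups": [["gpt-4", "gpt-4-turbo"]]},
))
print(resolve_model_component("gpt-4-deployment-a", {"model_group": "gpt-4"}))
print(resolve_model_component("gpt-4-deployment-a"))
```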
