Implementation:EvolvingLMMs Lab Lmms eval Cache Utils
| Knowledge Sources | |
|---|---|
| Domains | Caching, Serialization |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Cache Utils provides disk-based caching functionality for evaluation requests and responses using pickle/dill serialization. It handles storage, retrieval, and deletion of cached evaluation data with special handling for non-serializable objects such as callable arguments.
Description
The module defines a cache directory (configurable via the LM_HARNESS_CACHE_PATH environment variable), builds cache file names by appending a fixed suffix derived from the SHA-256 hash of a constant string, and provides three operations: loading cached data, saving data to cache with callable-argument sanitization, and deleting cache files by key prefix. Serialization uses dill for broader type support than standard pickle, with a two-stage strategy that falls back to item-level serialization checking on failure.
Usage
Use cache utilities to avoid re-running expensive model inference when re-evaluating with the same model and task configuration. The cache key is typically formed as model_name_task_name.
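The load-or-compute pattern this implies can be sketched as follows. This is a minimal illustration, not code from the module: `make_cache_key`, `get_or_build`, and the in-memory `_store` are hypothetical names, and the dictionary stands in for the on-disk cache so the example runs without lmms_eval installed.

```python
from typing import Any, Callable, Dict, Optional


def make_cache_key(model_name: str, task_name: str) -> str:
    """Hypothetical helper: join model and task names into a cache key."""
    return f"{model_name}_{task_name}"


def get_or_build(
    key: str,
    load: Callable[[str], Optional[Any]],
    save: Callable[[str, Any], None],
    build: Callable[[], Any],
) -> Any:
    """Return the cached value for key, or build it and cache it on a miss."""
    cached = load(key)
    if cached is not None:  # None signals a cache miss
        return cached
    result = build()
    save(key, result)
    return result


# In-memory stand-ins for the disk-backed load and save functions (assumption).
_store: Dict[str, Any] = {}


def fake_load(key: str) -> Optional[Any]:
    return _store.get(key)


def fake_save(key: str, value: Any) -> None:
    _store[key] = value


build_calls = []


def expensive_build() -> list:
    """Stand-in for the expensive request-building or inference step."""
    build_calls.append(1)
    return [["request-a"], ["request-b"]]


key = make_cache_key("my_model", "my_task")
first = get_or_build(key, fake_load, fake_save, expensive_build)
second = get_or_build(key, fake_load, fake_save, expensive_build)
```

The second call is served from the store, so the expensive step runs only once.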
Code Reference
Source Location
- Repository: EvolvingLMMs-Lab/lmms-eval
- File: lmms_eval/caching/cache.py
- Lines: 1--69
Key Components
Module Configuration
MODULE_DIR = os.path.dirname(os.path.realpath(__file__))
OVERRIDE_PATH = os.getenv("LM_HARNESS_CACHE_PATH")
PATH = OVERRIDE_PATH if OVERRIDE_PATH else f"{MODULE_DIR}/.cache"
HASH_INPUT = "EleutherAI-lm-evaluation-harness"
HASH_PREFIX = hashlib.sha256(HASH_INPUT.encode("utf-8")).hexdigest()
FILE_SUFFIX = f".{HASH_PREFIX}.pickle"
Purpose: Set up the cache directory path and unique file suffix.
Cache Location:
- Default: .cache subdirectory in the module directory
- Override: set the LM_HARNESS_CACHE_PATH environment variable
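As a rough sketch of how the directory and file name are resolved, the logic can be restated with the environment passed in explicitly. The helper names `resolve_cache_dir` and `cache_file_path` are hypothetical and do not exist in the module; the constants mirror the configuration shown above.

```python
import hashlib
import os
from typing import Mapping, Optional

HASH_INPUT = "EleutherAI-lm-evaluation-harness"
HASH_PREFIX = hashlib.sha256(HASH_INPUT.encode("utf-8")).hexdigest()
FILE_SUFFIX = f".{HASH_PREFIX}.pickle"


def resolve_cache_dir(module_dir: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper: pick the override path if set, else <module_dir>/.cache."""
    env = os.environ if env is None else env
    override = env.get("LM_HARNESS_CACHE_PATH")
    return override if override else f"{module_dir}/.cache"


def cache_file_path(cache_dir: str, file_name: str) -> str:
    """Hypothetical helper: full path of the cache file for a base name."""
    return f"{cache_dir}/{file_name}{FILE_SUFFIX}"


default_dir = resolve_cache_dir("/pkg/caching", env={})
custom_dir = resolve_cache_dir("/pkg/caching", env={"LM_HARNESS_CACHE_PATH": "/tmp/eval_cache"})
example_path = cache_file_path(custom_dir, "my_model_my_task")
```

Because the module reads the environment variable at import time, an override has to be set before the module is first imported. The suffix is the same for every file, so it tags files as belonging to this cache rather than making individual names unique.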
load_from_cache
def load_from_cache(file_name):
    try:
        path = f"{PATH}/{file_name}{FILE_SUFFIX}"
        with open(path, "rb") as file:
            cached_task_dict = dill.loads(file.read())
            return cached_task_dict
    except Exception:
        eval_logger.debug(f"{file_name} is not cached, generating...")
        pass
Purpose: Load cached evaluation data from disk.
Parameters:
- file_name -- Base name for the cache file (without suffix)
Returns: The cached task dictionary, or None if not found.
save_to_cache
def save_to_cache(file_name, obj):
    if not os.path.exists(PATH):
        os.mkdir(PATH)
    file_path = f"{PATH}/{file_name}{FILE_SUFFIX}"
    serializable_obj = []
    for item in obj:
        for subitem in item:
            if hasattr(subitem, "arguments"):
                serializable_arguments = tuple(
                    arg if not callable(arg) else None for arg in subitem.arguments
                )
                subitem.arguments = serializable_arguments
    eval_logger.debug(f"Saving {file_path} to cache...")
    try:
        with open(file_path, "wb") as file:
            file.write(dill.dumps(serializable_obj))
    except (pickle.PickleError, dill.PicklingError, TypeError, AttributeError):
        with open(file_path, "wb") as file:
            file.write(dill.dumps(
                [[subitem if is_serializable(subitem)
                  else _handle_non_serializable(subitem)
                  for subitem in item]
                 for item in obj]
            ))
Purpose: Serialize and save evaluation data to disk cache.
Parameters:
- file_name -- Base name for the cache file
- obj -- Object to cache (typically a list of request groups)
Behavior:
- Creates cache directory if it does not exist
- Replaces callable arguments with None to avoid serialization failures
- Attempts primary serialization with dill; on failure, falls back to item-level serialization with is_serializable/_handle_non_serializable
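The callable replacement described above can be illustrated in isolation. In this sketch, `types.SimpleNamespace` is a hypothetical stand-in for a request object with an `arguments` attribute, and `strip_callables` is an illustrative name rather than a function in the module.

```python
from types import SimpleNamespace


def strip_callables(arguments):
    """Return a tuple with every callable argument replaced by None."""
    return tuple(arg if not callable(arg) else None for arg in arguments)


# Stand-in request: a prompt, generation kwargs, a callable, and an index.
req = SimpleNamespace(arguments=("prompt text", {"max_new_tokens": 16}, len, 0))

# Mirrors the in-place assignment performed by save_to_cache.
req.arguments = strip_callables(req.arguments)
```

Note that `callable()` is also true for classes and for instances that define `__call__`, so those would be replaced with None as well.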
delete_cache
def delete_cache(key: str = ""):
    files = os.listdir(PATH)
    for file in files:
        if file.startswith(key) and file.endswith(FILE_SUFFIX):
            file_path = f"{PATH}/{file}"
            os.unlink(file_path)
Purpose: Delete cached files matching a key prefix.
Parameters:
- key -- Prefix to match (default: "" matches all cache files)
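Prefix-based deletion can be tried safely against a temporary directory. The following is a sketch under stated assumptions: `delete_by_prefix` is a hypothetical variant that takes the directory as a parameter and returns the removed names, and the suffix is a shortened stand-in for the real hash-based one.

```python
import os
import tempfile

FILE_SUFFIX = ".example.pickle"  # shortened stand-in for the hash-based suffix


def delete_by_prefix(cache_dir: str, key: str = "") -> list:
    """Delete cache files whose names start with key; return the removed names."""
    removed = []
    if not os.path.isdir(cache_dir):  # nothing to delete if the directory is absent
        return removed
    for name in sorted(os.listdir(cache_dir)):
        if name.startswith(key) and name.endswith(FILE_SUFFIX):
            os.unlink(os.path.join(cache_dir, name))
            removed.append(name)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    for base in ("modelA_task1", "modelA_task2", "modelB_task1"):
        open(os.path.join(tmp, base + FILE_SUFFIX), "wb").close()
    open(os.path.join(tmp, "notes.txt"), "w").close()  # unrelated file, must survive

    removed = delete_by_prefix(tmp, "modelA")
    remaining = sorted(os.listdir(tmp))
```

Since matching is by plain string prefix, a key such as "modelA" would also match a file named for "modelA2", so keys that are prefixes of one another need care.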
I/O Contract
| Input | Type | Description |
|---|---|---|
| file_name | str | Base name for cache file lookup or creation |
| obj | list[list[Request]] | Nested list of request groups to cache |
| key | str | Prefix filter for deletion |

| Output | Type | Description |
|---|---|---|
| cached_task_dict | Any or None | Deserialized cache contents, or None on miss |
Dependencies
- hashlib -- Cache file suffix generation
- os -- File system operations
- pickle -- Standard serialization (for exception types)
- dill -- Enhanced serialization library
- lmms_eval.loggers.utils -- _handle_non_serializable, is_serializable
- lmms_eval.utils -- eval_logger
Design Decisions
- Dill over pickle -- Uses dill for broader type support including lambda functions and local classes.
- Callable argument handling -- Replaces callables with None rather than trying to serialize them, since many callables (like doc_to_visual) are task methods that cannot be pickled.
- Silent cache misses -- Returns None on load failure without raising exceptions, since a cache miss is an expected scenario.
- Two-stage serialization -- Optimistic attempt first, then defensive fallback with per-item serialization checking.
- Hash-based suffix -- Unique identifier prevents collisions when multiple frameworks share a cache directory.
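The two-stage idea can be sketched with the standard library alone. This example swaps in `pickle` for dill, and `can_pickle` and `placeholder_for` are hypothetical stand-ins for `is_serializable` and `_handle_non_serializable`, whose actual behavior is not shown in this document.

```python
import pickle
import threading

_PICKLE_ERRORS = (pickle.PickleError, TypeError, AttributeError)


def can_pickle(value) -> bool:
    """Hypothetical stand-in for is_serializable: try to pickle the value."""
    try:
        pickle.dumps(value)
        return True
    except _PICKLE_ERRORS:
        return False


def placeholder_for(value) -> str:
    """Hypothetical stand-in for _handle_non_serializable: a descriptive string."""
    return f"<unserializable {type(value).__name__}>"


def dumps_two_stage(obj) -> bytes:
    """Try to serialize everything at once; on failure, clean items one by one."""
    try:
        return pickle.dumps(obj)  # stage 1: optimistic attempt
    except _PICKLE_ERRORS:
        cleaned = [
            [sub if can_pickle(sub) else placeholder_for(sub) for sub in item]
            for item in obj
        ]
        return pickle.dumps(cleaned)  # stage 2: per-item fallback


# A lock cannot be pickled, so the first stage fails and the fallback runs.
groups = [["plain text", 3], [threading.Lock()]]
restored = pickle.loads(dumps_two_stage(groups))
```

The optimistic first stage keeps the common case cheap, while the fallback pays the per-item cost only when something in the payload cannot be serialized.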
Known Issues
- serializable_obj not populated -- The variable is declared as an empty list but is never populated with data from obj. Because serializing an empty list succeeds, the primary path in the code as shown writes an empty list to disk and the fallback path does not run, so the cache file does not contain the request data.
- In-place modification -- Arguments on the original obj are modified in place, which may surprise callers that expect the original data to be preserved.
- Broad exception handling -- The load function catches all exceptions, so corrupted files, permission errors, and deserialization failures are all reported as cache misses, which makes debugging difficult.
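One possible way to address these issues is sketched below. It is not the project's implementation: it uses the standard library `pickle` in place of dill, `types.SimpleNamespace` as a hypothetical stand-in for request objects, and the illustrative names `sanitized_copy`, `save`, and `load`.

```python
import copy
import os
import pickle
import tempfile
from types import SimpleNamespace


def sanitized_copy(obj):
    """Build a new nested list with callables stripped, leaving obj untouched."""
    result = []
    for item in obj:
        new_item = []
        for subitem in item:
            if hasattr(subitem, "arguments"):
                subitem = copy.copy(subitem)  # shallow copy so the caller's object is preserved
                subitem.arguments = tuple(
                    None if callable(arg) else arg for arg in subitem.arguments
                )
            new_item.append(subitem)
        result.append(new_item)
    return result


def save(path: str, obj) -> None:
    """Serialize the sanitized copy, so the file actually holds the data."""
    with open(path, "wb") as file:
        file.write(pickle.dumps(sanitized_copy(obj)))


def load(path: str):
    """Return None only for a missing file; other errors propagate to the caller."""
    try:
        with open(path, "rb") as file:
            return pickle.loads(file.read())
    except FileNotFoundError:
        return None


original = [[SimpleNamespace(arguments=("prompt", len))]]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my_model_my_task.pickle")
    missing = load(path)   # no file yet, so this is a miss
    save(path, original)
    restored = load(path)
```

Copying before sanitizing costs some memory but keeps the caller's request objects intact, and catching only `FileNotFoundError` lets corrupted or unreadable cache files surface as real errors.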