Implementation:EvolvingLMMs Lab Lmms eval Cache Utils

From Leeroopedia
Knowledge Sources
Domains: Caching, Serialization
Last Updated: 2026-02-14 00:00 GMT

Overview

Cache Utils provides disk-based caching functionality for evaluation requests and responses using pickle/dill serialization. It handles storage, retrieval, and deletion of cached evaluation data with special handling for non-serializable objects such as callable arguments.

Description

The module defines a cache directory (configurable via the LM_HARNESS_CACHE_PATH environment variable), appends a fixed suffix derived from the SHA-256 hash of a constant string to every cache file name, and provides three operations: loading cached data, saving data to cache with callable-argument sanitization, and deleting cache files by key prefix. Serialization uses dill for broader type support than standard pickle, with a two-stage strategy that falls back to item-level serialization checking on failure.

Usage

Use cache utilities to avoid re-running expensive model inference when re-evaluating with the same model and task configuration. The cache key is typically formed as model_name_task_name.
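The check-then-compute pattern described above can be sketched as follows. This is a minimal illustration, not code from the repository: cached_or_compute, fake_load, fake_save and run_inference are hypothetical names, and the in-memory dictionary only stands in for the disk cache so the example runs without lmms_eval installed. In real use, load_from_cache and save_to_cache would be passed as the load and save arguments.

```python
from typing import Any, Callable, Optional


def cached_or_compute(
    cache_key: str,
    compute: Callable[[], Any],
    load: Callable[[str], Optional[Any]],
    save: Callable[[str, Any], None],
) -> Any:
    """Return the cached value for cache_key, computing and saving it on a miss."""
    cached = load(cache_key)
    if cached is not None:  # a miss is signalled by None
        return cached
    result = compute()
    save(cache_key, result)
    return result


# Hypothetical in-memory stand-ins for the disk-backed load/save functions.
_store = {}


def fake_load(key):
    return _store.get(key)


def fake_save(key, obj):
    _store[key] = obj


inference_calls = []


def run_inference():
    # Placeholder for an expensive model call.
    inference_calls.append(1)
    return [["response-1", "response-2"]]


cache_key = "my_model_my_task"  # assumed key following the model_name_task_name shape
first = cached_or_compute(cache_key, run_inference, fake_load, fake_save)
second = cached_or_compute(cache_key, run_inference, fake_load, fake_save)
```

The second call is served from the store, so the expensive function runs only once.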

Code Reference

Source Location

  • Repository: EvolvingLMMs-Lab/lmms-eval
  • File: lmms_eval/caching/cache.py
  • Lines: 1--69

Key Components

Module Configuration

MODULE_DIR = os.path.dirname(os.path.realpath(__file__))
OVERRIDE_PATH = os.getenv("LM_HARNESS_CACHE_PATH")
PATH = OVERRIDE_PATH if OVERRIDE_PATH else f"{MODULE_DIR}/.cache"

HASH_INPUT = "EleutherAI-lm-evaluation-harness"
HASH_PREFIX = hashlib.sha256(HASH_INPUT.encode("utf-8")).hexdigest()
FILE_SUFFIX = f".{HASH_PREFIX}.pickle"

Purpose: Set up the cache directory path and the file suffix shared by all cache files.

Cache Location:

  • Default: .cache subdirectory in the module directory
  • Override: set the LM_HARNESS_CACHE_PATH environment variable
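The resolution logic above can be reproduced in isolation to see what path a given key maps to. In this sketch, resolve_cache_dir and cache_file_path are hypothetical helper names and the directory paths are made up; only the environment variable name, the hash input, and the path format come from the listing above. Since the module reads the variable at import time, an override presumably has to be set before the module is imported.

```python
import hashlib
import os

HASH_INPUT = "EleutherAI-lm-evaluation-harness"
HASH_PREFIX = hashlib.sha256(HASH_INPUT.encode("utf-8")).hexdigest()
FILE_SUFFIX = f".{HASH_PREFIX}.pickle"


def resolve_cache_dir(module_dir, env=None):
    """Mirror the PATH logic: use the override if it is set and non-empty."""
    env = os.environ if env is None else env
    override = env.get("LM_HARNESS_CACHE_PATH")
    return override if override else f"{module_dir}/.cache"


def cache_file_path(cache_dir, file_name):
    """Build the on-disk path for a cache key."""
    return f"{cache_dir}/{file_name}{FILE_SUFFIX}"


# Illustrative directories only; they are not taken from the repository.
default_dir = resolve_cache_dir("/opt/pkg/caching", env={})
override_dir = resolve_cache_dir(
    "/opt/pkg/caching", env={"LM_HARNESS_CACHE_PATH": "/tmp/eval-cache"}
)
empty_override_dir = resolve_cache_dir(
    "/opt/pkg/caching", env={"LM_HARNESS_CACHE_PATH": ""}
)
example_path = cache_file_path(override_dir, "my_model_my_task")
```

Note that an empty override string is falsy, so it falls back to the default directory, and that the suffix is the same for every key because the hash input is a constant.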

load_from_cache

def load_from_cache(file_name):
    try:
        path = f"{PATH}/{file_name}{FILE_SUFFIX}"
        with open(path, "rb") as file:
            cached_task_dict = dill.loads(file.read())
            return cached_task_dict
    except Exception:
        eval_logger.debug(f"{file_name} is not cached, generating...")
        pass

Purpose: Load cached evaluation data from disk.

Parameters:

  • file_name -- Base name for the cache file (without suffix)

Returns: The cached task dictionary, or None if the file is missing or cannot be read or deserialized.
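Because any failure is reported as None, callers cannot tell a missing file from a damaged one. A possible variant that keeps the None contract but logs the two cases differently is sketched below. It is an assumption-laden illustration, not the repository's code: load_from_cache_strict is a hypothetical name, the short suffix is a placeholder, and the standard-library pickle module stands in for dill so the example has no third-party dependency.

```python
import logging
import os
import pickle
import tempfile

logger = logging.getLogger("cache-sketch")


def load_from_cache_strict(cache_dir, file_name, suffix=".pickle"):
    """Return the cached object, or None on a miss or an unreadable file."""
    path = os.path.join(cache_dir, f"{file_name}{suffix}")
    try:
        with open(path, "rb") as fh:
            return pickle.load(fh)
    except FileNotFoundError:
        # Expected case: nothing has been cached under this key yet.
        logger.debug("%s is not cached, generating...", file_name)
        return None
    except (pickle.UnpicklingError, EOFError) as exc:
        # Unexpected case: the file exists but cannot be deserialized.
        logger.warning("cache file %s is unreadable: %s", path, exc)
        return None


with tempfile.TemporaryDirectory() as cache_dir:
    miss = load_from_cache_strict(cache_dir, "absent_key")

    with open(os.path.join(cache_dir, "good_key.pickle"), "wb") as fh:
        pickle.dump([["request"]], fh)
    hit = load_from_cache_strict(cache_dir, "good_key")

    # An empty file cannot be unpickled and raises EOFError inside the helper.
    open(os.path.join(cache_dir, "bad_key.pickle"), "wb").close()
    damaged = load_from_cache_strict(cache_dir, "bad_key")
```

Other errors, such as a permission problem, propagate to the caller in this variant instead of being reported as a miss.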

save_to_cache

def save_to_cache(file_name, obj):
    if not os.path.exists(PATH):
        os.mkdir(PATH)

    file_path = f"{PATH}/{file_name}{FILE_SUFFIX}"
    serializable_obj = []

    for item in obj:
        for subitem in item:
            if hasattr(subitem, "arguments"):
                serializable_arguments = tuple(
                    arg if not callable(arg) else None for arg in subitem.arguments
                )
                subitem.arguments = serializable_arguments

    eval_logger.debug(f"Saving {file_path} to cache...")
    try:
        with open(file_path, "wb") as file:
            file.write(dill.dumps(serializable_obj))
    except (pickle.PickleError, dill.PicklingError, TypeError, AttributeError):
        with open(file_path, "wb") as file:
            file.write(dill.dumps(
                [[subitem if is_serializable(subitem)
                  else _handle_non_serializable(subitem)
                  for subitem in item]
                 for item in obj]
            ))

Purpose: Serialize and save evaluation data to disk cache.

Parameters:

  • file_name -- Base name for the cache file
  • obj -- Object to cache (typically a list of request groups)

Behavior:

  • Creates cache directory if it does not exist
  • Replaces callable arguments with None to avoid serialization failures
  • Attempts primary serialization with dill; on failure, falls back to item-level serialization with is_serializable / _handle_non_serializable
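The two behaviors listed above, callable stripping and the optimistic-then-defensive serialization, can be sketched independently of the library. The code below is illustrative only: it uses the standard-library pickle module instead of dill, and is_picklable and the repr() fallback are hypothetical stand-ins that do not claim to match the real is_serializable and _handle_non_serializable helpers.

```python
import pickle

# Errors the standard pickle module may raise for unsupported objects.
PICKLE_ERRORS = (pickle.PicklingError, AttributeError, TypeError)


def strip_callables(arguments):
    """Return a new tuple with every callable argument replaced by None."""
    return tuple(None if callable(arg) else arg for arg in arguments)


def is_picklable(value):
    """Hypothetical check: report whether pickle can serialize the value."""
    try:
        pickle.dumps(value)
        return True
    except PICKLE_ERRORS:
        return False


def dumps_two_stage(groups):
    """Serialize everything at once; on failure, check item by item."""
    try:
        return pickle.dumps(groups)
    except PICKLE_ERRORS:
        # Assumed fallback: keep picklable items, store repr() of the rest.
        safe = [
            [item if is_picklable(item) else repr(item) for item in group]
            for group in groups
        ]
        return pickle.dumps(safe)


cleaned = strip_callables(("prompt", lambda doc: doc["image"], 0))

# The lambda makes the first attempt fail, which triggers the fallback.
groups = [["text", lambda x: x], ["more text"]]
restored = pickle.loads(dumps_two_stage(groups))
```

The optimistic first attempt keeps the common case cheap, since the per-item check serializes each element an extra time.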

delete_cache

def delete_cache(key: str = ""):
    files = os.listdir(PATH)
    for file in files:
        if file.startswith(key) and file.endswith(FILE_SUFFIX):
            file_path = f"{PATH}/{file}"
            os.unlink(file_path)

Purpose: Delete cached files matching a key prefix.

Parameters:

  • key -- Prefix to match (default: "" matches all cache files)
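Since matching is a plain startswith test, a key can select more files than intended. The dry-run sketch below lists what a given key would match without deleting anything. matching_cache_files is a hypothetical helper, and the short suffix and file names are placeholders chosen for readability.

```python
import os
import tempfile

SUFFIX = ".abc123.pickle"  # shortened stand-in for the real hash-based suffix


def matching_cache_files(cache_dir, key=""):
    """List the files that the prefix-and-suffix test would select."""
    return sorted(
        name
        for name in os.listdir(cache_dir)
        if name.startswith(key) and name.endswith(SUFFIX)
    )


with tempfile.TemporaryDirectory() as cache_dir:
    for name in ("model_a_task1", "model_a_task2", "model_ab_task1", "model_b_task1"):
        open(os.path.join(cache_dir, name + SUFFIX), "wb").close()
    # A file without the cache suffix is never selected.
    open(os.path.join(cache_dir, "notes.txt"), "w").close()

    hits_for_prefix = matching_cache_files(cache_dir, "model_a")
    hits_for_exact_model = matching_cache_files(cache_dir, "model_a_")
    hits_for_all = matching_cache_files(cache_dir)
```

Here the key "model_a" also selects the model_ab_task1 entry, while "model_a_" does not, so including the separator in the key is a safer habit. Also note that delete_cache as listed calls os.listdir(PATH) directly, which raises FileNotFoundError if the cache directory has not been created yet.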

I/O Contract

Input      Type                 Description
file_name  str                  Base name for cache file lookup or creation
obj        list[list[Request]]  Nested list of request groups to cache
key        str                  Prefix filter for deletion

Output            Type         Description
cached_task_dict  Any or None  Deserialized cache contents, or None on miss

Dependencies

  • hashlib -- Cache file suffix generation
  • os -- File system operations
  • pickle -- Standard serialization (for exception types)
  • dill -- Enhanced serialization library
  • lmms_eval.loggers.utils -- _handle_non_serializable, is_serializable
  • lmms_eval.utils -- eval_logger

Design Decisions

  • Dill over pickle -- Uses dill for broader type support including lambda functions and local classes.
  • Callable argument handling -- Replaces callables with None rather than trying to serialize them, since many callables (like doc_to_visual) are task methods that cannot be pickled.
  • Silent cache misses -- Returns None on load failure without raising exceptions, since a cache miss is an expected scenario.
  • Two-stage serialization -- Optimistic attempt first, then defensive fallback with per-item serialization checking.
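  • Hash-based suffix -- A fixed suffix, derived from hashing the constant string "EleutherAI-lm-evaluation-harness", marks the files that belong to this cache, so delete_cache leaves unrelated files in a shared directory untouched. The suffix does not vary per key; uniqueness between entries comes from the file_name part.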

Known Issues

  • serializable_obj not populated -- The variable is declared as an empty list but is never populated with data from obj. In the code as shown, the primary path therefore serializes an empty list; because that call succeeds, the fallback path is not reached and the cache file holds [] rather than the contents of obj.
  • In-place modification -- Arguments on the original obj are modified in place, which may surprise callers that expect the original data to be preserved.
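  • Broad exception handling -- load_from_cache catches every Exception and logs it as a cache miss at debug level, so a corrupted file, a permission error, or a deserialization failure is indistinguishable from a missing file.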
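One way to address the first two issues together is to build the sanitized structure as a new nested list of shallow copies and serialize that list. The sketch below is a suggestion under stated assumptions, not a patch from the repository: sanitized_copy is a hypothetical name, types.SimpleNamespace stands in for the Request objects, and the standard-library pickle module replaces dill.

```python
import copy
import pickle
from types import SimpleNamespace


def sanitized_copy(groups):
    """Return a new nested list whose items have callable arguments set to None."""
    result = []
    for group in groups:
        new_group = []
        for item in group:
            if hasattr(item, "arguments"):
                # Shallow copy so the caller's object keeps its callables.
                item = copy.copy(item)
                item.arguments = tuple(
                    None if callable(arg) else arg for arg in item.arguments
                )
            new_group.append(item)
        result.append(new_group)
    return result


# SimpleNamespace is only a stand-in for a request object with .arguments.
original = [[SimpleNamespace(arguments=("prompt", lambda doc: doc["image"], 3))]]

payload = pickle.dumps(sanitized_copy(original))  # serialize the populated copy
restored = pickle.loads(payload)
```

The shallow copy is enough here because only the arguments attribute is rebound; other attributes are shared with the original object rather than duplicated.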
