Implementation:Openai Evals PersistentMemoryCache

Knowledge Sources	Openai_Evals
Domains	Evaluation, Solvers
Last Updated	2026-02-14 10:00 GMT

Overview

A concrete tool, provided by the evals library, for maintaining private conversation state across multi-turn solver interactions.

Description

This module defines two constructs: the Interaction dataclass and the PersistentMemoryCache class.

Interaction is a simple dataclass that stores a snapshot of conversation messages along with the indices of messages that are considered private (e.g., internal chain-of-thought reasoning). It has two fields: messages (the full list of Message objects) and private_messages_ids (a list of integer indices identifying which messages are private).
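
To illustrate how the two fields relate, the following minimal sketch rebuilds the dataclass with a stand-in Message type and filters out the private indices. The Message stand-in and the public_view helper are hypothetical names introduced here for illustration; they are not part of the evals library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    # Hypothetical stand-in for evals.task_state.Message
    role: str
    content: str


@dataclass
class Interaction:
    messages: List[Message]          # full snapshot of the conversation
    private_messages_ids: List[int]  # indices into `messages` that are private


def public_view(interaction: Interaction) -> List[Message]:
    """Hypothetical helper: return the messages not marked as private."""
    private = set(interaction.private_messages_ids)
    return [m for i, m in enumerate(interaction.messages) if i not in private]


snapshot = Interaction(
    messages=[
        Message("user", "What is 17 * 3?"),
        Message("system", "Think step by step."),  # private prompt
        Message("assistant", "17 * 3 = 51."),      # private reasoning
        Message("assistant", "51"),
    ],
    private_messages_ids=[1, 2],
)
visible = public_view(snapshot)
```

Storing indices rather than copies of the private messages keeps a single ordered history, so the position of each private message relative to the public ones is preserved.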

PersistentMemoryCache manages the saving and loading of these private messages across evaluation turns. During multi-turn evaluations, some solvers (such as CoTSolver and SelfConsistencySolver) generate intermediate reasoning messages that should be preserved for the solver but hidden from the eval harness. When save_private_interaction is called, it records the current conversation state and marks the most recent private messages (determined by interaction_length). When load_private_interaction is called on a subsequent turn, it reconstructs the full message history by re-inserting the private messages that the eval harness had stripped out, verifying consistency between the stored state and the new TaskState.
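
The save and load cycle described above can be sketched with a simplified stand-in. This is an illustration under stated assumptions, not the library's code: ToyMemoryCache, its save and load methods, and the Message stand-in are hypothetical, the methods take plain message lists instead of a TaskState, and the sketch assumes the saved history ends with one public answer preceded by interaction_length private messages. The index arithmetic in evals/solvers/memory.py may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    # Hypothetical stand-in for evals.task_state.Message
    role: str
    content: str


@dataclass
class Interaction:
    messages: List[Message]
    private_messages_ids: List[int]


class ToyMemoryCache:
    """Illustrative stand-in for PersistentMemoryCache (not the real class)."""

    def __init__(self, interaction_length: int):
        self.interaction_length = interaction_length
        self.last: Optional[Interaction] = None

    def save(self, messages: List[Message]) -> None:
        # Assumption: the history ends with one public answer, preceded by
        # `interaction_length` private messages from this turn.
        n = len(messages)
        earlier = [] if self.last is None else list(self.last.private_messages_ids)
        newly_private = list(range(n - self.interaction_length - 1, n - 1))
        self.last = Interaction(list(messages), earlier + newly_private)

    def load(self, harness_messages: List[Message]) -> List[Message]:
        # First turn: nothing saved yet, so return the input unchanged.
        if self.last is None:
            return harness_messages
        private = set(self.last.private_messages_ids)
        public = [m for i, m in enumerate(self.last.messages) if i not in private]
        # Consistency check: the harness history must start with our public view.
        if harness_messages[: len(public)] != public:
            raise ValueError("harness history does not match the saved interaction")
        # Saved history (with private messages) plus whatever the harness added.
        return self.last.messages + harness_messages[len(public):]


cache = ToyMemoryCache(interaction_length=2)
turn1 = [
    Message("user", "What is 17 * 3?"),
    Message("system", "Think step by step."),  # private
    Message("assistant", "17 * 3 = 51."),      # private
    Message("assistant", "51"),
]
cache.save(turn1)

# The harness strips the private messages and appends a new user turn.
harness_view = [turn1[0], turn1[3], Message("user", "Now add 4.")]
restored = cache.load(harness_view)
```

In this sketch the consistency check fails loudly rather than merging silently, which mirrors the verification step described above: if the harness history has diverged from what was saved, re-inserting private messages at stale positions would corrupt the conversation.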

Usage

Import PersistentMemoryCache and Interaction when building solvers that need to remember private intermediate reasoning across multiple turns. This is essential for multi-turn chain-of-thought or self-consistency solvers where internal reasoning should persist but not be visible to the eval framework.

Code Reference

Source Location

Repository: Openai_Evals
File: evals/solvers/memory.py
Lines: 1-64

Signature

@dataclass
class Interaction:
    messages: List[Message]
    private_messages_ids: List[int]


class PersistentMemoryCache:
    def __init__(
        self,
        interaction_length: int,
    ):
        ...

    def save_private_interaction(self, task_state: TaskState):
        ...

    def load_private_interaction(self, task_state: TaskState) -> List[Message]:
        ...

Import

from evals.solvers.memory import PersistentMemoryCache, Interaction

I/O Contract

Inputs

Name	Type	Required	Description
interaction_length	int	Yes	Number of private messages to track per interaction turn. Determines how many recent messages are marked as private when saving.
task_state	TaskState	Yes	The current evaluation task state, passed to both save_private_interaction and load_private_interaction.

Outputs

Name	Type	Description
save_private_interaction	None	Side effect: stores the current interaction and marks private message indices internally.
load_private_interaction	List[Message]	Returns the reconstructed message list with private messages re-inserted. If no prior interaction exists, returns task_state.messages unchanged.

Usage Examples

from evals.solvers.memory import PersistentMemoryCache
from evals.task_state import TaskState, Message

# Create a cache that tracks 3 private messages per interaction
cache = PersistentMemoryCache(interaction_length=3)

# `task_state` and `new_task_state` below are TaskState objects supplied by
# the eval harness on consecutive turns; they are not constructed here.

# During a solve step, after generating private reasoning messages,
# task_state.messages includes the CoT prompts and reasoning output.
cache.save_private_interaction(task_state)

# On the next turn, the eval harness provides a new task_state
# without the private messages. Restore them:
restored_messages = cache.load_private_interaction(new_task_state)
# restored_messages contains the full history, including private reasoning

Related Pages

Environment:Openai_Evals_Python_Runtime
