Implementation:Openai Evals OpenAIAssistantsSolver

From Leeroopedia
Domains: Evaluation, LLM Provider Integration
Last Updated: 2026-02-14 10:00 GMT

Overview

A concrete solver, provided by the evals library, for running evaluation tasks through the OpenAI Assistants API.

Description

OpenAIAssistantsSolver is a Solver subclass that interfaces with the OpenAI Assistants API, providing stateful, thread-based conversations with optional tool use. Unlike the chat-completion API used by the standard solver, the Assistants API maintains conversation state on the server, supports built-in tools such as code interpreter and retrieval, and allows file attachments.

Key behaviours:

  • Stateful threads -- Each solver instance owns an Assistant and a Thread. Messages are appended to the thread incrementally; the solver tracks the index of the last assistant message so only new user messages are sent on each call to _solve.
  • Assistant creation -- On first initialisation, a new Assistant is created via the API with the given model, name, description, tools, and any uploaded file_ids. When copying (via copy()), the same Assistant is reused but a fresh Thread is created.
  • Tool support -- The tools parameter accepts a list of tool specification dictionaries (e.g. [{"type": "code_interpreter"}], [{"type": "retrieval"}]). These are passed directly to the Assistants API at creation time.
  • File management -- Files can be uploaded at two levels: (1) solver-wide files specified by file_paths at init, which are available to all threads; (2) thread-specific files passed via task_state.current_state["files"]. A module-level FILE_CACHE (protected by FILE_CACHE_LOCK) ensures each file path is uploaded only once, even across multiple solver instances running in parallel.
  • Retry logic -- The _run_assistant_retrying method is decorated with @backoff.on_exception to retry on transient OpenAI errors (RateLimitError, APIConnectionError, APITimeoutError, InternalServerError) with exponential back-off.
  • Run lifecycle -- After submitting messages, the solver creates a Run and polls its status via _wait_on_run at 500 ms intervals until the run reaches a terminal state. If the run does not complete successfully, an OpenAIError is raised to trigger a retry.
  • Message conversion -- The Assistants API only accepts user-role messages. Non-user messages (e.g. system) are converted by prepending the role as a tag: [system] ....
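
The stateful-thread and message-conversion behaviours above can be illustrated with a minimal, self-contained sketch. It is not the solver's actual code: the Msg class and the three helper functions are hypothetical names introduced only for this example, and the sketch assumes that "new" messages are exactly those after the most recent assistant message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Msg:
    """Hypothetical stand-in for the library's message type."""
    role: str
    content: str


def last_assistant_index(messages: list[Msg]) -> Optional[int]:
    """Return the index of the most recent assistant message, or None."""
    for idx in range(len(messages) - 1, -1, -1):
        if messages[idx].role == "assistant":
            return idx
    return None


def to_user_message(message: Msg) -> Msg:
    """Re-label a non-user message as a user message, tagging its original role."""
    if message.role == "user":
        return message
    return Msg(role="user", content=f"[{message.role}] {message.content}")


def new_messages_for_thread(messages: list[Msg]) -> list[Msg]:
    """Select the messages after the last assistant reply and convert them."""
    last_idx = last_assistant_index(messages)
    start = 0 if last_idx is None else last_idx + 1
    return [to_user_message(m) for m in messages[start:]]


history = [
    Msg("system", "Answer concisely."),
    Msg("user", "What is 2 + 2?"),
    Msg("assistant", "4"),
    Msg("system", "Now explain your reasoning."),
    Msg("user", "Why?"),
]
# Only the two messages after the assistant's "4" are pending.
pending = new_messages_for_thread(history)
```

Because the thread already holds everything up to the last assistant reply, resending earlier messages would duplicate them on the server; slicing from the last assistant index avoids that.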

Usage

Use OpenAIAssistantsSolver when evaluating tasks that benefit from stateful multi-turn interactions, code execution, or file retrieval. It is typically specified by class path in YAML eval configurations. It requires the OPENAI_API_KEY environment variable and the openai package.

Code Reference

Source Location

Signature

class OpenAIAssistantsSolver(Solver):
    def __init__(
        self,
        model: str,
        name: Optional[str] = None,
        description: Optional[str] = None,
        tools: list[Dict[str, Any]] = [],
        file_paths: list[str] = [],
        assistant: Optional[Assistant] = None,
        thread: Optional[Thread] = None,
        postprocessors: list[str] = [],
        registry: Any = None,
    ):
    def _run_assistant_retrying(self, task_state: TaskState):
    def _solve(self, task_state: TaskState, **kwargs) -> SolverResult:
    def copy(self) -> "OpenAIAssistantsSolver":
    def _create_file(self, file_path: str) -> str:
    def _create_files(self, file_paths: list[str]) -> list[str]:
    @property
    def name(self) -> str:
    @property
    def model_version(self) -> Union[str, dict]:

Import

from evals.solvers.providers.openai.openai_assistants_solver import OpenAIAssistantsSolver

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | str | Yes | OpenAI model identifier (e.g. "gpt-4-1106-preview"). |
| name | Optional[str] | No (default None) | Human-readable name for the Assistant. |
| description | Optional[str] | No (default None) | Description of the Assistant's purpose. |
| tools | list[Dict[str, Any]] | No (default []) | Tool specifications (e.g. [{"type": "code_interpreter"}]). |
| file_paths | list[str] | No (default []) | Local file paths to upload and attach to the Assistant (available to all threads). |
| assistant | Optional[Assistant] | No (default None) | Pre-existing Assistant object; used internally by copy(). When provided, name, description, tools, and file_paths must not be set. |
| thread | Optional[Thread] | No (default None) | Pre-existing Thread object; if not provided, a new thread is created automatically. |
| postprocessors | list[str] | No (default []) | Fully-qualified class paths of PostProcessor instances to apply to the output. |
| registry | Any | No (default None) | Unused; accepted for interface compatibility. |
| task_state | TaskState | Yes (at solve time) | The evaluation task state. The task_description is sent as the Run's instructions. Thread-specific files may be passed via task_state.current_state["files"]. |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| result | SolverResult | Contains the Assistant's concatenated text response in output. Multiple content blocks (text or image placeholders) are joined with newlines. |
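
As a rough illustration of how several content blocks collapse into one output string, consider the sketch below. ContentBlock and join_content are hypothetical names for this example, and the image placeholder text is an assumption; the string the solver actually emits may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentBlock:
    """Hypothetical simplified content block: a type tag plus optional text."""
    type: str
    text: Optional[str] = None


# Assumed placeholder text for image blocks (not confirmed by the source).
IMAGE_PLACEHOLDER = "{Assistant sent an image}"


def join_content(blocks: list[ContentBlock]) -> str:
    """Join text blocks and image placeholders with newlines, in order."""
    parts = []
    for block in blocks:
        if block.type == "text" and block.text is not None:
            parts.append(block.text)
        elif block.type == "image_file":
            parts.append(IMAGE_PLACEHOLDER)
    return "\n".join(parts)


output = join_content([
    ContentBlock("text", "The integral is 125/3."),
    ContentBlock("image_file"),
    ContentBlock("text", "See the plot above."),
])
```

Substituting a placeholder for images keeps the output a plain string, so downstream string-matching graders can still run on responses that include non-text content.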

Usage Examples

from evals.solvers.providers.openai.openai_assistants_solver import OpenAIAssistantsSolver
from evals.task_state import TaskState, Message

# Instantiate with code-interpreter tool
solver = OpenAIAssistantsSolver(
    model="gpt-4-1106-preview",
    name="Math Evaluator",
    description="Solves math problems using code execution.",
    tools=[{"type": "code_interpreter"}],
)

# Build a task state
task_state = TaskState(
    task_description="Solve the following math problem step by step.",
    messages=[
        Message(role="user", content="What is the integral of x^2 from 0 to 5?"),
    ],
)

# Solve the task
result = solver(task_state)
print(result.output)

# Copy solver for parallel evaluation (new Thread, same Assistant)
solver_copy = solver.copy()
# solver_copy.assistant.id == solver.assistant.id  (same assistant)
# solver_copy.thread.id != solver.thread.id        (different thread)

# Using file attachments for retrieval
solver_with_files = OpenAIAssistantsSolver(
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}],
    file_paths=["./data/reference_doc.pdf"],
)
