Implementation:Openai Evals OpenAIAssistantsSolver
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, LLM Provider Integration |
| Last Updated | 2026-02-14 10:00 GMT |
Overview
Concrete solver for running evaluation tasks through the OpenAI Assistants API provided by the evals library.
Description
OpenAIAssistantsSolver is a Solver subclass that interfaces with the OpenAI Assistants API, providing stateful, thread-based conversations with optional tool use. Unlike the standard chat-completion solver, the Assistants API maintains server-side conversation state, supports built-in tools like code-interpreter and retrieval, and allows file attachments.
Key behaviours:
- Stateful threads -- Each solver instance owns an Assistant and a Thread. Messages are appended to the thread incrementally; the solver tracks the index of the last assistant message so only new user messages are sent on each call to
_solve. - Assistant creation -- On first initialisation, a new Assistant is created via the API with the given
model,name,description,tools, and any uploadedfile_ids. When copying (viacopy()), the same Assistant is reused but a fresh Thread is created. - Tool support -- The
toolsparameter accepts a list of tool specification dictionaries (e.g.[{"type": "code_interpreter"}],[{"type": "retrieval"}]). These are passed directly to the Assistants API at creation time. - File management -- Files can be uploaded at two levels: (1) solver-wide files specified by
file_pathsat init, which are available to all threads; (2) thread-specific files passed viatask_state.current_state["files"]. A module-level FILE_CACHE (protected by FILE_CACHE_LOCK) ensures each file path is uploaded only once, even across multiple solver instances running in parallel. - Retry logic -- The
_run_assistant_retryingmethod is decorated with@backoff.on_exceptionto retry on transient OpenAI errors (RateLimitError,APIConnectionError,APITimeoutError,InternalServerError) with exponential back-off. - Run lifecycle -- After submitting messages, the solver creates a Run and polls its status via
_wait_on_run(500ms intervals) until it reaches a terminal state. If the run does not complete successfully, anOpenAIErroris raised to trigger retry. - Message conversion -- The Assistants API only accepts
user-role messages. Non-user messages (e.g. system) are converted by prepending the role as a tag:[system] ....
Usage
Import OpenAIAssistantsSolver when evaluating tasks that benefit from stateful multi-turn interactions, code execution, or file retrieval. It is typically specified by class path in YAML eval configurations. Requires the OPENAI_API_KEY environment variable and the openai package.
Code Reference
Source Location
- Repository: Openai_Evals
- File: evals/solvers/providers/openai/openai_assistants_solver.py
- Lines: 1-272
Signature
class OpenAIAssistantsSolver(Solver):
def __init__(
self,
model: str,
name: Optional[str] = None,
description: Optional[str] = None,
tools: list[Dict[str, Any]] = [],
file_paths: list[str] = [],
assistant: Optional[Assistant] = None,
thread: Optional[Thread] = None,
postprocessors: list[str] = [],
registry: Any = None,
):
def _run_assistant_retrying(self, task_state: TaskState):
def _solve(self, task_state: TaskState, **kwargs) -> SolverResult:
def copy(self) -> "OpenAIAssistantsSolver":
def _create_file(self, file_path: str) -> str:
def _create_files(self, file_paths: list[str]) -> list[str]:
@property
def name(self) -> str:
@property
def model_version(self) -> Union[str, dict]:
Import
from evals.solvers.providers.openai.openai_assistants_solver import OpenAIAssistantsSolver
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | str |
Yes | OpenAI model identifier (e.g. "gpt-4-1106-preview").
|
| name | Optional[str] |
No (default None) |
Human-readable name for the Assistant. |
| description | Optional[str] |
No (default None) |
Description of the Assistant's purpose. |
| tools | list[Dict[str, Any]] |
No (default []) |
Tool specifications (e.g. [{"type": "code_interpreter"}]).
|
| file_paths | list[str] |
No (default []) |
Local file paths to upload and attach to the Assistant (available to all threads). |
| assistant | Optional[Assistant] |
No (default None) |
Pre-existing Assistant object; used internally by copy(). When provided, name, description, tools, and file_paths must not be set.
|
| thread | Optional[Thread] |
No (default None) |
Pre-existing Thread object; if not provided, a new thread is created automatically. |
| postprocessors | list[str] |
No (default []) |
Fully-qualified class paths of PostProcessor instances to apply to the output. |
| registry | Any |
No (default None) |
Unused; accepted for interface compatibility. |
| task_state | TaskState |
Yes (at solve time) | The evaluation task state. The task_description is sent as the Run's instructions. Thread-specific files may be passed via task_state.current_state["files"].
|
Outputs
| Name | Type | Description |
|---|---|---|
| result | SolverResult |
Contains the Assistant's concatenated text response in output. Multiple content blocks (text or image placeholders) are joined with newlines.
|
Usage Examples
from evals.solvers.providers.openai.openai_assistants_solver import OpenAIAssistantsSolver
from evals.task_state import TaskState, Message
# Instantiate with code-interpreter tool
solver = OpenAIAssistantsSolver(
model="gpt-4-1106-preview",
name="Math Evaluator",
description="Solves math problems using code execution.",
tools=[{"type": "code_interpreter"}],
)
# Build a task state
task_state = TaskState(
task_description="Solve the following math problem step by step.",
messages=[
Message(role="user", content="What is the integral of x^2 from 0 to 5?"),
],
)
# Solve the task
result = solver(task_state)
print(result.output)
# Copy solver for parallel evaluation (new Thread, same Assistant)
solver_copy = solver.copy()
# solver_copy.assistant.id == solver.assistant.id (same assistant)
# solver_copy.thread.id != solver.thread.id (different thread)
# Using file attachments for retrieval
solver_with_files = OpenAIAssistantsSolver(
model="gpt-4-1106-preview",
tools=[{"type": "retrieval"}],
file_paths=["./data/reference_doc.pdf"],
)