Implementation:Openai Evals Solver Base Class

From Leeroopedia
Knowledge Sources
Domains: Evaluation, Software_Architecture
Last Updated: 2026-02-14 10:00 GMT

Overview

Abstract base class for implementing stateful model solvers with postprocessing support, provided by the evals solver module.

Description

The Solver class is the abstract base for stateful model integrations. It inherits from both ABC and CompletionFn, the latter for compatibility with code that expects a CompletionFn. Subclasses must implement _solve(task_state, **kwargs) -> SolverResult. Callers invoke the solver through __call__, which deep-copies task_state before passing it to _solve, so a subclass cannot mutate the caller's state, and then applies any configured postprocessors to the result. NestedSolver extends Solver for solvers that compose other solvers. Provider implementations exist for OpenAI, Anthropic, Google Gemini, and Together AI.
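The call flow described above can be sketched with stand-in classes. This is a simplified illustration, not the evals source: ToyTaskState, ToyResult, ToySolver, EchoSolver, and strip_whitespace are hypothetical names, and postprocessors are modeled here as plain callables, which is an assumption made to keep the sketch self-contained.

```python
import copy
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToyTaskState:
    """Hypothetical stand-in for TaskState."""
    task_description: str
    messages: list = field(default_factory=list)
    current_state: Any = None


class ToyResult:
    """Hypothetical stand-in for SolverResult: output plus metadata."""

    def __init__(self, output: str, **metadata):
        self.output = output
        self.metadata = metadata


class ToySolver(ABC):
    """Illustrates the deep-copy and postprocessing steps only."""

    def __init__(self, postprocessors: Optional[list] = None):
        # Assumption: each postprocessor is a callable taking and returning a result.
        self.postprocessors = list(postprocessors or [])

    @abstractmethod
    def _solve(self, task_state: ToyTaskState, **kwargs) -> ToyResult:
        ...

    def __call__(self, task_state: ToyTaskState, **kwargs) -> ToyResult:
        # Work on a private copy so _solve cannot mutate the caller's state.
        private_state = copy.deepcopy(task_state)
        result = self._solve(private_state, **kwargs)
        # Apply each postprocessor in the order it was configured.
        for postprocess in self.postprocessors:
            result = postprocess(result)
        return result


def strip_whitespace(result: ToyResult) -> ToyResult:
    """Example postprocessor: trims surrounding whitespace from the output."""
    return ToyResult(result.output.strip(), **result.metadata)


class EchoSolver(ToySolver):
    def _solve(self, task_state: ToyTaskState, **kwargs) -> ToyResult:
        task_state.messages.append("scratch note")  # touches only the copy
        return ToyResult(f"  {task_state.task_description}  ", source="echo")


state = ToyTaskState(task_description="Say hello")
result = EchoSolver(postprocessors=[strip_whitespace])(state)
```

After the call, state.messages is still empty because the append inside _solve happened on the copy, and result.output has been trimmed by the postprocessor.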

Usage

Subclass Solver when creating a new model integration that requires statefulness. Override _solve to implement the model interaction logic. Configure via YAML in evals/registry/solvers/.
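As a rough illustration of why statefulness calls for per-sample copies, the sketch below uses a hypothetical CountingSolver that keeps a history between calls. It does not subclass the real Solver; it only assumes that copy() behaves like copy.deepcopy, as the copy docstring in the signature suggests.

```python
import copy


class CountingSolver:
    """Hypothetical stateful solver that remembers every prompt it has seen."""

    def __init__(self):
        self.history = []

    def solve(self, prompt: str) -> str:
        self.history.append(prompt)
        return f"turn {len(self.history)}: {prompt}"

    def copy(self) -> "CountingSolver":
        # A deep copy gives each sample its own independent history.
        return copy.deepcopy(self)


template = CountingSolver()
sample_a = template.copy()
sample_b = template.copy()

sample_a.solve("first")
sample_a.solve("second")
reply_b = sample_b.solve("only")
```

Each copy accumulates its own history, so state built up while solving one sample does not leak into another, and the template itself stays untouched.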

Code Reference

Source Location

  • Repository: openai/evals
  • File: evals/solvers/solver.py (lines 41-126)

Signature

class Solver(ABC, CompletionFn):
    def __init__(
        self,
        postprocessors: list[str] = [],
        registry: Any = None,
    ) -> None:
        """
        Args:
            postprocessors: List of fully-qualified postprocessor class paths.
            registry: Optional Registry instance.
        """

    @abstractmethod
    def _solve(
        self,
        task_state: TaskState,
        **kwargs,
    ) -> SolverResult:
        """
        Implement model interaction logic.

        Args:
            task_state: TaskState with task_description, messages, current_state.
            **kwargs: Additional arguments.

        Returns:
            SolverResult with output string and optional metadata.
        """

    def __call__(self, task_state: TaskState, **kwargs) -> SolverResult:
        """Deep-copies task_state, calls _solve, applies postprocessors."""

    def copy(self) -> "Solver":
        """Create a deep copy of this solver for per-sample isolation."""

class SolverResult:
    def __init__(self, output: str, **metadata):
        """
        Args:
            output: The solver's text output.
            **metadata: Arbitrary metadata key-value pairs.
        """

    @property
    def output(self) -> str: ...

    @property
    def metadata(self) -> dict: ...

Import

from evals.solvers.solver import Solver, SolverResult, NestedSolver, DummySolver
from evals.task_state import TaskState, Message

I/O Contract

Inputs

Name           | Type      | Required | Description
task_state     | TaskState | Yes      | Contains task_description (str), messages (list[Message]), current_state (Any)
postprocessors | list[str] | No       | Postprocessor class paths applied to output (constructor argument)
**kwargs       | Any       | No       | Additional solver-specific arguments

Outputs

Name         | Type         | Description
SolverResult | SolverResult | Contains output (str) and metadata (dict)
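The output contract can be illustrated with a small stand-in that mirrors the SolverResult signature shown earlier. SolverResultSketch is a hypothetical class written for this example, not the evals class itself; the metadata keys model and attempts are likewise made up.

```python
class SolverResultSketch:
    """Mirrors the documented SolverResult signature; not the evals class."""

    def __init__(self, output: str, **metadata):
        self._output = output
        # Every extra keyword argument is collected into the metadata dict.
        self._metadata = metadata

    @property
    def output(self) -> str:
        return self._output

    @property
    def metadata(self) -> dict:
        return self._metadata


result = SolverResultSketch("42", model="my-model-v1", attempts=2)

# Properties without setters are read-only in this sketch.
try:
    result.output = "changed"
    assignment_allowed = True
except AttributeError:
    assignment_allowed = False
```

Consumers read the text through result.output and any extra information through result.metadata, which here holds the two keyword arguments passed at construction.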

Usage Examples

Implementing a Custom Solver

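from evals.solvers.solver import Solver, SolverResult
from evals.task_state import TaskState


def my_model_api(prompt: str, model: str) -> str:
    """Hypothetical placeholder; replace with a call to your model client."""
    raise NotImplementedError("Connect this to your model API.")


class MyCustomSolver(Solver):
    def __init__(self, model_name: str, **kwargs):
        # Forward postprocessors and registry to the Solver base class.
        super().__init__(**kwargs)
        self.model_name = model_name

    def _solve(self, task_state: TaskState, **kwargs) -> SolverResult:
        # Build a plain-text prompt from the task description and messages.
        prompt = task_state.task_description
        for msg in task_state.messages:
            prompt += f"\n{msg.role}: {msg.content}"

        # Call your model (my_model_api is the placeholder defined above).
        response = my_model_api(prompt, model=self.model_name)

        # Extra keyword arguments are stored as result metadata.
        return SolverResult(
            output=response,
            model=self.model_name,
        )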

YAML Registration

# In evals/registry/solvers/my_solver.yaml
my-solver:
  class: my_package.solver:MyCustomSolver
  args:
    model_name: "my-model-v1"
    postprocessors: []
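To make the class and args fields more concrete, here is a minimal sketch of how a module:ClassName string and an args mapping could be turned into an instance using importlib. The helper names resolve_class and build_from_entry are hypothetical, and the actual evals registry may resolve entries differently. A standard-library class stands in for the solver so the example runs without evals installed.

```python
import importlib


def resolve_class(spec: str):
    """Split a 'module.path:ClassName' string and import the named class."""
    module_path, _, class_name = spec.partition(":")
    if not module_path or not class_name:
        raise ValueError(f"Expected 'module:ClassName', got {spec!r}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


def build_from_entry(entry: dict):
    """Instantiate the class named in an entry, passing its args as keywords."""
    cls = resolve_class(entry["class"])
    return cls(**entry.get("args", {}))


# Same shape as the YAML entry above, with a stdlib class as a stand-in.
entry = {
    "class": "fractions:Fraction",
    "args": {"numerator": 3, "denominator": 4},
}
obj = build_from_entry(entry)
```

Under this reading, every key under args, including model_name and postprocessors in the YAML above, would arrive as a keyword argument to the solver's constructor.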

Related Pages

Implements Principle

Requires Environment
