Implementation:Openai Evals CoTSolver

Knowledge Sources	Openai_Evals
Domains	Evaluation, Solvers
Last Updated	2026-02-14 10:00 GMT

Overview

CoTSolver is a concrete solver in the evals library that performs chain-of-thought reasoning followed by an explicit answer-extraction step.

Description

CoTSolver is a subclass of NestedSolver that implements chain-of-thought prompting in two stages. First, a cot_solver is invoked with a reasoning prompt (the cot_template) appended to the conversation, generating a step-by-step reasoning trace. Second, an extract_solver is invoked with the reasoning output appended to the conversation along with an extraction prompt (the extract_template), which distills the reasoning into a final answer.
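The two stages described above can be sketched without the evals library. This is a minimal illustration, not the library's implementation: `Msg`, `two_stage_solve`, the two fake solvers, and both prompt strings are hypothetical stand-ins, and the choice of the "assistant" role for the reasoning message is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Msg:
    """Stand-in for a chat message (not the evals Message class)."""
    role: str
    content: str


# In this sketch a "solver" is any callable from a message list to a string.
ToySolver = Callable[[List[Msg]], str]

# Placeholder prompts; these are NOT the library's default templates.
COT_PROMPT = "Think step by step before answering."
EXTRACT_PROMPT = "Given the reasoning above, state only the final answer."


def two_stage_solve(
    messages: List[Msg], cot_solver: ToySolver, extract_solver: ToySolver
) -> Tuple[str, str]:
    # Stage 1: append the reasoning prompt and collect the reasoning trace.
    reasoning = cot_solver(messages + [Msg("system", COT_PROMPT)])
    # Stage 2: append the reasoning and the extraction prompt, then extract.
    answer = extract_solver(
        messages + [Msg("assistant", reasoning), Msg("system", EXTRACT_PROMPT)]
    )
    return answer, reasoning


def fake_reasoner(msgs: List[Msg]) -> str:
    # Canned reasoning so the sketch runs without a model.
    return "17 + 25 = 42, so the total is 42."


def fake_extractor(msgs: List[Msg]) -> str:
    # The reasoning is the second-to-last message; return its last word.
    return msgs[-2].content.rstrip(".").split()[-1]


conversation = [Msg("user", "What is 17 + 25?")]
answer, reasoning = two_stage_solve(conversation, fake_reasoner, fake_extractor)
print(answer)
print(reasoning)
```

Note that the original conversation list is never mutated: each stage receives a new list with the extra messages appended.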

The solver supports persistent memory across multi-turn evaluations through an internal PersistentMemoryCache. When persistent_memory is enabled (the default), the cache preserves the private intermediate reasoning messages (the CoT prompt, the reasoning output, and the extraction prompt) so that in subsequent turns the solver retains its chain-of-thought context even though the eval harness strips those messages.
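To make the idea of persistent memory concrete, here is a toy cache that re-inserts private messages into the public conversation on later turns. It is only a sketch of the concept under simplifying assumptions: `ToyPrivateMemory` and its `load`/`save` methods are hypothetical and do not mirror the internals of PersistentMemoryCache.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content); a stand-in for a real message type


class ToyPrivateMemory:
    """Illustrative only; not the evals PersistentMemoryCache."""

    def __init__(self) -> None:
        self._history: List[Message] = []  # public and private messages, in order
        self._public_seen = 0              # public messages already in _history

    def load(self, public_messages: List[Message]) -> List[Message]:
        # Stored history plus any public messages that arrived since the last save.
        return self._history + list(public_messages[self._public_seen:])

    def save(self, public_messages: List[Message], private: List[Message]) -> None:
        # Record this turn's private messages after the current context.
        self._history = self.load(public_messages) + list(private)
        self._public_seen = len(public_messages)


memory = ToyPrivateMemory()

# Turn 1: the harness supplies only public messages.
turn1_public = [("user", "What is 2 + 3?")]
private = [
    ("system", "Think step by step."),
    ("assistant", "2 + 3 = 5."),
    ("system", "State the final answer."),
]
memory.save(turn1_public, private)

# Turn 2: the harness passes the public conversation without the private messages.
turn2_public = turn1_public + [("assistant", "5"), ("user", "Now double it.")]
context = memory.load(turn2_public)
for role, content in context:
    print(role, "|", content)
```

In this sketch the three private messages reappear between the first user question and the public answer, which is the effect the persistent_memory option is described as providing.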

The cot_template and extract_template methods can be overridden by subclasses that need to vary the prompt based on the current TaskState, enabling dynamic prompt selection depending on the nature of the task.
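The override pattern might look like the following. To keep the sketch runnable without the evals package, `ToyTaskState` and `ToyCoTBase` are hypothetical stand-ins for TaskState and CoTSolver that model only the template hook; a real subclass would inherit from CoTSolver and override the same method name.

```python
from dataclasses import dataclass


@dataclass
class ToyTaskState:
    """Stand-in for TaskState; only the field used below is modeled."""
    task_description: str


class ToyCoTBase:
    """Stand-in for CoTSolver; models only the cot_template hook."""

    def __init__(self, cot_template: str) -> None:
        self._cot_template = cot_template

    def cot_template(self, task_state: ToyTaskState) -> str:
        # Default behavior: the same prompt for every task.
        return self._cot_template


class MathAwareCoT(ToyCoTBase):
    """Chooses a reasoning prompt based on the task description."""

    def cot_template(self, task_state: ToyTaskState) -> str:
        if "math" in task_state.task_description.lower():
            return "Work through the arithmetic one step at a time."
        return super().cot_template(task_state)


solver = MathAwareCoT(cot_template="Think step by step.")
math_prompt = solver.cot_template(ToyTaskState("Solve this math problem."))
other_prompt = solver.cot_template(ToyTaskState("Summarize the passage."))
print(math_prompt)
print(other_prompt)
```

Because the hook receives the task state on every call, the prompt can change from turn to turn without reconfiguring the solver.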

Usage

Import CoTSolver when you need a solver that explicitly reasons before answering. This is suitable for tasks that benefit from step-by-step reasoning, such as math problems, logic puzzles, or complex question answering. Configure it with two nested solver specifications: one for generating the reasoning (cot_solver) and one for extracting the final answer (extract_solver).

Code Reference

Source Location

Repository: Openai_Evals
File: evals/solvers/nested/cot_solver.py
Lines: 1-85

Signature

class CoTSolver(NestedSolver):
    def __init__(
        self,
        cot_solver: SolverSpec,
        extract_solver: SolverSpec,
        cot_template: str = DEFAULT_COT_TEMPLATE,
        extract_template: str = DEFAULT_EXTRACT_ANSWER_TEMPLATE,
        persistent_memory: bool = True,
        private_interaction_length: int = 3,
        postprocessors: list[str] = [],
        registry: Any = None,
    ):
        ...

    @property
    def cot_solver(self) -> Solver:
        ...

    @property
    def extract_solver(self) -> Solver:
        ...

    def cot_template(self, task_state: TaskState) -> str:
        ...

    def extract_template(self, task_state: TaskState) -> str:
        ...

    def _solve(self, task_state: TaskState, **kwargs) -> SolverResult:
        ...

    @property
    def name(self) -> str:
        # returns "CoT_{cot_solver.name}_{extract_solver.name}"
        ...

Import

from evals.solvers.nested.cot_solver import CoTSolver

I/O Contract

Inputs

Name	Type	Required	Description
cot_solver	SolverSpec	Yes	Specification for the nested solver that generates the chain-of-thought reasoning.
extract_solver	SolverSpec	Yes	Specification for the nested solver that extracts a final answer from the reasoning output.
cot_template	str	No	Template string appended as a system message to prompt reasoning. Defaults to DEFAULT_COT_TEMPLATE.
extract_template	str	No	Template string appended as a system message to prompt answer extraction. Defaults to DEFAULT_EXTRACT_ANSWER_TEMPLATE.
persistent_memory	bool	No	Whether to maintain private reasoning messages across turns. Defaults to True.
private_interaction_length	int	No	Number of private messages to cache per turn. Defaults to 3.
postprocessors	list[str]	No	List of postprocessor names to apply to solver output. Defaults to an empty list.
registry	Any	No	Registry object for resource lookup.
task_state	TaskState	Yes	The current evaluation task state, passed to _solve.

Outputs

Name	Type	Description
SolverResult	SolverResult	Contains output (the extracted final answer) and, in its metadata, reasoning_output (the full chain-of-thought reasoning text).
name	str	Returns "CoT_{cot_solver.name}_{extract_solver.name}" identifying both nested solvers.

Usage Examples

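from evals.solvers.nested.cot_solver import CoTSolver
from evals.solvers.solver import SolverSpec
from evals.task_state import Message, TaskState

# Define CoTSolver via YAML-style config (typical usage)
# solver:
#   class: evals.solvers.nested.cot_solver:CoTSolver
#   args:
#     cot_solver:
#       class: evals.solvers.openai_solver:OpenAISolver
#       args:
#         completion_fn_options:
#           model: gpt-4
#     extract_solver:
#       class: evals.solvers.openai_solver:OpenAISolver
#       args:
#         completion_fn_options:
#           model: gpt-4

# Programmatic usage. A SolverSpec is a plain dict with "class" and "args"
# keys, mirroring the YAML above ("class" is a Python keyword, so it cannot
# be passed as a keyword argument).
openai_args = {"completion_fn_options": {"model": "gpt-4"}}
cot_spec: SolverSpec = {"class": "evals.solvers.openai_solver:OpenAISolver", "args": openai_args}
extract_spec: SolverSpec = {"class": "evals.solvers.openai_solver:OpenAISolver", "args": openai_args}

solver = CoTSolver(
    cot_solver=cot_spec,
    extract_solver=extract_spec,
    persistent_memory=True,
)

# Minimal task state for illustration; an eval normally constructs this.
task_state = TaskState(
    task_description="Answer the user's question.",
    messages=[Message(role="user", content="What is 17 + 25?")],
)

result = solver(task_state)
print(result.output)                        # final extracted answer
print(result.metadata["reasoning_output"])  # full chain-of-thought reasoning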

Related Pages

Environment:Openai_Evals_Python_Runtime
