Implementation:Openai Evals CoTSolver
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Solvers |
| Last Updated | 2026-02-14 10:00 GMT |
Overview
A concrete solver, provided by the evals library, that performs chain-of-thought reasoning followed by explicit answer extraction.
Description
CoTSolver is a subclass of NestedSolver that implements chain-of-thought prompting in two stages. First, a cot_solver is invoked with a reasoning prompt (the cot_template) appended to the conversation, generating a step-by-step reasoning trace. Second, an extract_solver is invoked with the reasoning output appended to the conversation along with an extraction prompt (the extract_template), which distills the reasoning into a final answer.
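The two stages can be sketched in plain Python without importing evals. This is a minimal illustration, not the library's implementation: `Message`, `two_stage_solve`, the placeholder template strings, and the two fake solvers are all hypothetical stand-ins, and nested solvers are modelled as simple functions from a message list to a string.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Message:
    role: str
    content: str


# Hypothetical stand-in for a nested solver: message list in, reply text out.
SolveFn = Callable[[List[Message]], str]

# Placeholder prompts; the real defaults live in the evals library.
COT_TEMPLATE = "Think step by step before answering."
EXTRACT_TEMPLATE = "Given the reasoning above, state only the final answer."


def two_stage_solve(
    messages: List[Message], cot_solver: SolveFn, extract_solver: SolveFn
) -> Tuple[str, str]:
    # Stage 1: append the reasoning prompt and collect the reasoning trace.
    cot_messages = messages + [Message("system", COT_TEMPLATE)]
    reasoning = cot_solver(cot_messages)

    # Stage 2: append the reasoning and the extraction prompt, then extract.
    extract_messages = cot_messages + [
        Message("assistant", reasoning),
        Message("system", EXTRACT_TEMPLATE),
    ]
    answer = extract_solver(extract_messages)
    return answer, reasoning


def fake_cot(msgs: List[Message]) -> str:
    # Pretends to be a model producing a reasoning trace.
    return "2 + 2 is 4, and 4 * 3 is 12."


def fake_extract(msgs: List[Message]) -> str:
    # Reads the reasoning (second-to-last message) and keeps its last token.
    return msgs[-2].content.rstrip(".").split()[-1]


answer, reasoning = two_stage_solve(
    [Message("user", "What is (2 + 2) * 3?")], fake_cot, fake_extract
)
print(answer)
```

Because the extraction stage sees the full reasoning trace, the two nested solvers can be different models or configurations.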
The solver supports persistent memory across multi-turn evaluations through an internal PersistentMemoryCache. When persistent_memory is enabled (the default), the cache preserves the private intermediate reasoning messages (the CoT prompt, the reasoning output, and the extraction prompt) so that in subsequent turns the solver retains its chain-of-thought context even though the eval harness strips those messages.
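The idea behind the cache can be illustrated with a simplified sketch. `PrivateHistoryCache` and its `load`/`save` methods are hypothetical names chosen for this example and do not mirror the actual PersistentMemoryCache API. The sketch assumes the harness passes only public messages on each turn and that the solver stores its full transcript, private messages included, after every turn.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


class PrivateHistoryCache:
    """Hypothetical, simplified stand-in for a persistent memory cache."""

    def __init__(self) -> None:
        self.full_history: List[Message] = []  # transcript incl. private messages
        self.public_seen = 0  # number of public messages already covered

    def load(self, public_messages: List[Message]) -> List[Message]:
        # Re-attach the cached private context, then any newer public messages.
        return self.full_history + public_messages[self.public_seen:]

    def save(self, full_history: List[Message], public_count: int) -> None:
        self.full_history = list(full_history)
        self.public_seen = public_count


cache = PrivateHistoryCache()

# Turn 1: the harness supplies one public message.
public = [("user", "Q1")]
restored = cache.load(public)
private = [
    ("system", "Think step by step."),    # CoT prompt
    ("assistant", "reasoning 1"),         # reasoning output
    ("system", "Give the final answer."), # extraction prompt
]
answer = ("assistant", "A1")
# The harness will see the public messages plus the answer, hence +1.
cache.save(restored + private + [answer], public_count=len(public) + 1)

# Turn 2: the harness has dropped the three private messages.
public = [("user", "Q1"), ("assistant", "A1"), ("user", "Q2")]
restored = cache.load(public)
print(len(restored))
```

In the second turn the restored history again contains the three private messages between the first question and its answer, followed by the new question.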
The cot_template and extract_template methods can be overridden by subclasses that need to vary the prompt based on the current TaskState, enabling dynamic prompt selection depending on the nature of the task.
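A subclass override might look like the following sketch. To keep it runnable without evals, `TaskState` and `BaseCoT` are hypothetical stand-ins for the real TaskState and for CoTSolver's template hook, and the keyword check on the task description is an illustrative assumption rather than anything the library prescribes.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    """Hypothetical stand-in; the real TaskState is defined in evals."""

    task_description: str


class BaseCoT:
    """Hypothetical stand-in exposing the same cot_template hook as CoTSolver."""

    def __init__(self, cot_template: str) -> None:
        self._cot_template = cot_template

    def cot_template(self, task_state: TaskState) -> str:
        # Default behavior: return the configured template unchanged.
        return self._cot_template


class MathAwareCoT(BaseCoT):
    def cot_template(self, task_state: TaskState) -> str:
        # Pick a more specific reasoning prompt for math-flavored tasks.
        if "math" in task_state.task_description.lower():
            return "Work through the arithmetic step by step, showing each calculation."
        return super().cot_template(task_state)


solver = MathAwareCoT("Think step by step.")
math_prompt = solver.cot_template(TaskState("Solve the math problem."))
other_prompt = solver.cot_template(TaskState("Summarize the passage."))
print(math_prompt)
print(other_prompt)
```

The same pattern applies to extract_template when the extraction prompt should depend on the task.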
Usage
Import CoTSolver when you need a solver that explicitly reasons before answering. This is suitable for tasks that benefit from step-by-step reasoning, such as math problems, logic puzzles, or complex question answering. Configure it with two nested solver specifications: one for generating the reasoning (cot_solver) and one for extracting the final answer (extract_solver).
Code Reference
Source Location
- Repository: Openai_Evals
- File: evals/solvers/nested/cot_solver.py
- Lines: 1-85
Signature
class CoTSolver(NestedSolver):
    def __init__(
        self,
        cot_solver: SolverSpec,
        extract_solver: SolverSpec,
        cot_template: str = DEFAULT_COT_TEMPLATE,
        extract_template: str = DEFAULT_EXTRACT_ANSWER_TEMPLATE,
        persistent_memory: bool = True,
        private_interaction_length: int = 3,
        postprocessors: list[str] = [],
        registry: Any = None,
    ):
        ...
    @property
    def cot_solver(self) -> Solver:
        ...
    @property
    def extract_solver(self) -> Solver:
        ...
    def cot_template(self, task_state: TaskState) -> str:
        ...
    def extract_template(self, task_state: TaskState) -> str:
        ...
    def _solve(self, task_state: TaskState, **kwargs) -> SolverResult:
        ...
    @property
    def name(self) -> str:
        # returns "CoT_{cot_solver.name}_{extract_solver.name}"
        ...
Import
from evals.solvers.nested.cot_solver import CoTSolver
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cot_solver | SolverSpec | Yes | Specification for the nested solver that generates the chain-of-thought reasoning. |
| extract_solver | SolverSpec | Yes | Specification for the nested solver that extracts a final answer from the reasoning output. |
| cot_template | str | No | Template string appended as a system message to prompt reasoning. Defaults to DEFAULT_COT_TEMPLATE. |
| extract_template | str | No | Template string appended as a system message to prompt answer extraction. Defaults to DEFAULT_EXTRACT_ANSWER_TEMPLATE. |
| persistent_memory | bool | No | Whether to maintain private reasoning messages across turns. Defaults to True. |
| private_interaction_length | int | No | Number of private messages to cache per turn. Defaults to 3. |
| postprocessors | list[str] | No | List of postprocessor names to apply to solver output. Defaults to an empty list. |
| registry | Any | No | Registry object for resource lookup. |
| task_state | TaskState | Yes | The current evaluation task state, passed to _solve. |
Outputs
| Name | Type | Description |
|---|---|---|
| SolverResult | SolverResult | Contains output (the extracted final answer) and reasoning_output (the full chain-of-thought reasoning text). |
| name | str | Returns "CoT_{cot_solver.name}_{extract_solver.name}" identifying both nested solvers. |
Usage Examples
from evals.solvers.nested.cot_solver import CoTSolver
from evals.solvers.solver import SolverSpec
# Define CoTSolver via YAML-style config (typical usage)
# solver:
#   class: evals.solvers.nested.cot_solver:CoTSolver
#   args:
#     cot_solver:
#       class: evals.solvers.openai_solver:OpenAISolver
#       args:
#         model: gpt-4
#     extract_solver:
#       class: evals.solvers.openai_solver:OpenAISolver
#       args:
#         model: gpt-4
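# Programmatic usage
# A solver spec is written here as a dict with the same "class"/"args" keys as the YAML config above.
cot_spec: SolverSpec = {
    "class": "evals.solvers.openai_solver:OpenAISolver",
    "args": {"model": "gpt-4"},
}
extract_spec: SolverSpec = {
    "class": "evals.solvers.openai_solver:OpenAISolver",
    "args": {"model": "gpt-4"},
}
solver = CoTSolver(
    cot_solver=cot_spec,
    extract_solver=extract_spec,
    persistent_memory=True,
)
# task_state is the TaskState supplied by the running eval.
result = solver(task_state)
print(result.output)  # final extracted answer
# Extra result fields such as reasoning_output are read from the result's metadata.
print(result.metadata["reasoning_output"])  # full chain-of-thought reasoning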