Implementation:Openai Evals HumanCliSolver

Knowledge Sources	Openai_Evals
Domains	Evaluation, Solvers
Last Updated	2026-02-14 10:00 GMT

Overview

A concrete tool, provided by the evals library, for interactive human-in-the-loop evaluation.

Description

HumanCliSolver is a subclass of Solver that enables a human evaluator to act as the solver by reading prompts printed to the command line and typing responses via standard input. When _solve is called, it concatenates the system task description and all conversation messages into a formatted prompt string, prints it to the terminal, and waits for the human to type an answer. The answer is recorded via record_sampling with the model name set to "human".
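
The prompt assembly described above can be sketched without the evals library. This is a minimal stand-in, not the library's code: the `Message` dataclass, the `build_prompt` helper, and the `role -- content` line format are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical stand-in for the library's message type.
    role: str
    content: str


def build_prompt(
    task_description: str,
    messages: list[Message],
    input_prompt: str = "assistant (you): ",
) -> str:
    """Join the task description and message history into one prompt string."""
    # The task description is treated as a leading system message.
    all_msgs = [Message("system", task_description)] + list(messages)
    # Assumed line format: "<role> -- <content>", one message per line.
    lines = [f"{m.role} -- {m.content}" for m in all_msgs]
    # The input prompt goes last so the typed answer appears right after it.
    return "\n".join(lines) + "\n" + input_prompt


prompt = build_prompt(
    "Answer with a single word.",
    [Message("user", "Capital of France?")],
)
# A real solver would now call input(prompt) and block until the human presses Enter.
```

Keeping the formatting in a pure function like this makes it possible to inspect the exact text a human would see without touching stdin.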

This solver is designed exclusively for single-threaded execution. Because it reads from stdin, running multiple evaluation threads simultaneously would cause prompt and input text from different threads to interleave unpredictably. The environment variable EVALS_SEQUENTIAL=1 must be set to ensure correct operation.
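
One way to catch a misconfigured run early is to check the variable before any prompt is printed. The guard below is a hypothetical helper, not part of the evals library. It assumes only what is stated above: the variable is named EVALS_SEQUENTIAL and the value "1" selects sequential execution.

```python
import os


def require_sequential(env=None) -> None:
    """Raise if EVALS_SEQUENTIAL is not set to "1" (hypothetical helper)."""
    # Default to the real process environment; a mapping can be passed for testing.
    env = os.environ if env is None else env
    if env.get("EVALS_SEQUENTIAL") != "1":
        raise RuntimeError(
            "Set EVALS_SEQUENTIAL=1 before using a stdin-based solver; "
            "otherwise prompts from parallel threads may interleave."
        )


# Passing a mapping explicitly keeps the check easy to exercise.
require_sequential({"EVALS_SEQUENTIAL": "1"})  # passes silently
```

Calling such a check once at startup fails fast instead of leaving a human evaluator facing interleaved prompts partway through a run.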

Usage

Use HumanCliSolver when a human should answer evaluation samples through the command line. This is useful for establishing human baselines on evaluation tasks, debugging prompts, or spot-checking eval samples interactively.

Code Reference

Source Location

Repository: Openai_Evals
File: evals/solvers/human_cli_solver.py
Lines: 1-48

Signature

class HumanCliSolver(Solver):
    def __init__(
        self,
        input_prompt: str = "assistant (you): ",
        postprocessors: list[str] = [],
        registry: Any = None,
    ):
        ...

    def _solve(self, task_state: TaskState, **kwargs) -> SolverResult:
        ...

    @property
    def name(self) -> str:
        # returns "human"
        ...

Import

from evals.solvers.human_cli_solver import HumanCliSolver

I/O Contract

Inputs

Name	Type	Required	Description
input_prompt	str	No	Prompt string displayed before the human types their answer. Defaults to "assistant (you): ".
postprocessors	list[str]	No	List of postprocessor names to apply to solver output. Defaults to an empty list.
registry	Any	No	Registry object for resource lookup. Not used directly by this solver.
task_state	TaskState	Yes	The current evaluation task state containing the task description and message history, passed to _solve.

Outputs

Name	Type	Description
SolverResult	SolverResult	Returned by _solve. Its output field holds the answer string typed by the human.
name	str	Always returns "human" to identify this solver in logs and records.

Usage Examples

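from evals.solvers.human_cli_solver import HumanCliSolver
from evals.task_state import Message, TaskState  # import path assumed

# Create solver with default prompt
solver = HumanCliSolver()

# Create solver with a custom input prompt
solver = HumanCliSolver(input_prompt="Your answer: ")

# Inside an eval run (requires EVALS_SEQUENTIAL=1) the eval supplies task_state.
# This hand-built one is only illustrative; its field names are assumed.
task_state = TaskState(
    task_description="Answer the question concisely.",
    messages=[Message(role="user", content="What is 2 + 2?")],
)

# The solver will print the conversation and wait for typed input
result = solver._solve(task_state)
print(result.output)  # whatever the human typed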

Related Pages

Environment:Openai_Evals_Python_Runtime
