Implementation:Openai Evals HumanCliSolver

From Leeroopedia
Revision as of 13:34, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Openai_Evals_HumanCliSolver.md)
Knowledge Sources
Domains: Evaluation, Solvers
Last Updated: 2026-02-14 10:00 GMT

Overview

Concrete tool for interactive human-in-the-loop evaluation provided by the evals library.

Description

HumanCliSolver is a subclass of Solver that enables a human evaluator to act as the solver by reading prompts printed to the command line and typing responses via standard input. When _solve is called, it concatenates the system task description and all conversation messages into a formatted prompt string, prints it to the terminal, and waits for the human to type an answer. The answer is recorded via record_sampling with the model name set to "human".
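The prompt-building step described above can be sketched in plain Python. This is a minimal illustration, not the library's source: the `Message` dataclass, `build_prompt`, and `ask_human` are hypothetical stand-ins, and the exact "role: content" line format is an assumption. The `read` parameter exists only so the sketch can be exercised without a terminal.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Message:
    """Stand-in for a chat message (hypothetical, not the evals class)."""
    role: str
    content: str


def build_prompt(
    task_description: str,
    messages: list[Message],
    input_prompt: str = "assistant (you): ",
) -> str:
    # Treat the task description as a leading system message.
    all_messages = [Message("system", task_description)] + list(messages)
    # Assumed layout: one "role: content" line per message.
    lines = [f"{m.role}: {m.content}" for m in all_messages]
    # End with the input prompt so the typed answer follows it directly.
    return "\n".join(lines) + "\n" + input_prompt


def ask_human(
    task_description: str,
    messages: list[Message],
    input_prompt: str = "assistant (you): ",
    read: Callable[[str], str] = input,
) -> str:
    prompt = build_prompt(task_description, messages, input_prompt)
    # With the default read=input, this prints the prompt and blocks
    # until the human presses Enter.
    return read(prompt)
```

Passing a different `read` callable (for example a lambda returning a fixed string) makes it possible to check the formatting logic in a script or test without waiting on stdin.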

This solver is designed exclusively for single-threaded execution. Because it reads from stdin, running multiple evaluation threads simultaneously would cause prompt and input text from different threads to interleave unpredictably. The environment variable EVALS_SEQUENTIAL=1 must be set to ensure correct operation.
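Since the requirement above is easy to forget, a script that drives this solver could check the variable up front. The helpers below are hypothetical and not part of the evals library; they assume the value "1" is what signals sequential mode, as stated above.

```python
import os
from typing import Mapping, Optional


def sequential_mode_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when EVALS_SEQUENTIAL is set to "1" (hypothetical helper)."""
    env = os.environ if env is None else env
    return env.get("EVALS_SEQUENTIAL", "0") == "1"


def require_sequential_mode(env: Optional[Mapping[str, str]] = None) -> None:
    # Fail early instead of letting several threads compete for stdin.
    if not sequential_mode_enabled(env):
        raise RuntimeError(
            "Set EVALS_SEQUENTIAL=1 before running an eval with HumanCliSolver"
        )
```

Calling `require_sequential_mode()` at the start of a driver script turns a confusing interleaved terminal session into an immediate, readable error.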

Usage

Import HumanCliSolver when you need to manually evaluate samples by having a human provide answers through the command line. This is useful for establishing human baselines on evaluation tasks, debugging prompts, or spot-checking eval samples interactively.

Code Reference

Source Location

Signature

class HumanCliSolver(Solver):
    def __init__(
        self,
        input_prompt: str = "assistant (you): ",
        postprocessors: list[str] = [],
        registry: Any = None,
    ):
        ...

    def _solve(self, task_state: TaskState, **kwargs) -> SolverResult:
        ...

    @property
    def name(self) -> str:
        # returns "human"
        ...

Import

from evals.solvers.human_cli_solver import HumanCliSolver

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| input_prompt | str | No | Prompt string displayed before the human types their answer. Defaults to "assistant (you): ". |
| postprocessors | list[str] | No | List of postprocessor names to apply to solver output. Defaults to an empty list. |
| registry | Any | No | Registry object for resource lookup. Not used directly by this solver. |
| task_state | TaskState | Yes | The current evaluation task state containing the task description and message history, passed to _solve. |

Outputs

| Name | Type | Description |
|------|------|-------------|
| SolverResult | SolverResult | Contains the human-typed answer string as the output field. |
| name | str | Always returns "human" to identify this solver in logs and records. |

Usage Examples

from evals.solvers.human_cli_solver import HumanCliSolver

# Create solver with default prompt
solver = HumanCliSolver()

# Create solver with a custom input prompt
solver = HumanCliSolver(input_prompt="Your answer: ")

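# Use in an eval run (requires EVALS_SEQUENTIAL=1).
# task_state is the TaskState that the running eval passes to the solver;
# it is not defined in this snippet.
# The solver prints the conversation and waits for typed input.
result = solver._solve(task_state)
print(result.output)  # whatever the human typed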

Related Pages
