Principle:Princeton nlp Tree of thought llm Task Interface Design

From Leeroopedia
Knowledge Sources
Domains: Software_Design, NLP
Last Updated: 2026-02-14 03:30 GMT

Overview

An abstract base class pattern that defines the contract for benchmark tasks in the Tree of Thoughts framework, ensuring all tasks are interchangeable with the search algorithm.

Description

Task Interface Design uses the Template Method pattern to define a uniform interface that all benchmark tasks must implement. The Task base class declares the methods that the BFS search loop calls, while leaving the implementation details to each concrete subclass. This separation enables the search algorithm to operate generically across different problem domains without knowing their specifics.

The required interface consists of:

  • __init__(): Load data, set self.steps (BFS depth) and self.stops (stop tokens per step).
  • __len__(): Return the number of puzzles in the dataset.
  • get_input(idx): Return the string representation of puzzle idx.
  • test_output(idx, output): Validate a candidate solution against ground truth.
  • Prompt wrap methods: Task-specific methods that format the input and a partial solution into LLM prompts; the required subset depends on the generation/evaluation strategy.

Usage

Use this principle when adding a new benchmark task to the framework. The new task must subclass Task and implement all required methods to be compatible with solve(), naive_solve(), and the experiment loop.
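As a sketch, a minimal concrete task might look like the following. The `SortTask` name, its toy dataset, and the `{"r": ...}` return shape of `test_output` are illustrative assumptions, not part of the framework's actual benchmarks; the base class is abridged from the Theoretical Basis section below:

```python
class Task:
    """Base class, abridged from the Theoretical Basis section below."""
    def __init__(self):
        self.steps = 0   # BFS depth (set by subclass)
        self.stops = []  # stop tokens per step (set by subclass)

class SortTask(Task):
    """Hypothetical toy task: sort a line of space-separated digits."""
    def __init__(self):
        super().__init__()
        self.data = ["3 1 2", "9 7 8"]  # toy dataset (hypothetical)
        self.steps = 2                  # BFS depth: two reasoning steps
        self.stops = ["\n"] * 2         # one stop token per step

    def __len__(self) -> int:
        return len(self.data)

    def get_input(self, idx: int) -> str:
        return self.data[idx]

    def test_output(self, idx: int, output: str) -> dict:
        # Compare the candidate against the sorted ground truth.
        target = " ".join(sorted(self.data[idx].split()))
        return {"r": int(output.strip() == target)}
```

Because `SortTask` satisfies the full contract, the search loop can iterate it, fetch inputs, and score outputs without any sort-specific logic.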

Theoretical Basis

The Template Method pattern defines the skeleton of an algorithm in a base class, deferring some steps to subclasses:

# Abstract pattern
class Task:
    def __init__(self):
        self.steps = 0     # BFS depth (set by subclass)
        self.stops = []    # Stop tokens per step (set by subclass)

    def __len__(self) -> int:
        raise NotImplementedError

    def get_input(self, idx: int) -> str:
        raise NotImplementedError

    def test_output(self, idx: int, output: str):
        raise NotImplementedError

    # Strategy-specific prompt methods (subset required per task):
    # standard_prompt_wrap(x, y) -> str
    # cot_prompt_wrap(x, y) -> str
    # propose_prompt_wrap(x, y) -> str
    # value_prompt_wrap(x, y) -> str
    # value_outputs_unwrap(x, y, outputs) -> float
    # vote_prompt_wrap(x, ys) -> str
    # vote_outputs_unwrap(outputs, n) -> list

The strategy-specific methods divide into two patterns:

  • Propose + Value: Used by Game of 24 and Crosswords. Task provides structured proposals and independent value scoring.
  • Sample + Vote: Used by Creative Writing. Task provides standard/CoT prompts and comparative voting.
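The division above can be sketched as a generic dispatcher. The `generate_candidates` helper, the `method` flag, and the `sample_fn` callback are illustrative names (not the framework's exact API); the point is that the search loop only ever calls the task's prompt-wrap methods, so it stays agnostic to the problem domain:

```python
def generate_candidates(task, x, y, method, sample_fn):
    """Dispatch generation by strategy; the task supplies only prompt formatting."""
    if method == "propose":
        prompt = task.propose_prompt_wrap(x, y)   # structured proposals (Game of 24, Crosswords)
    elif method == "cot":
        prompt = task.cot_prompt_wrap(x, y)       # chain-of-thought sampling (Creative Writing)
    else:
        prompt = task.standard_prompt_wrap(x, y)  # plain sampling
    return sample_fn(prompt)
```

Any object exposing the three prompt-wrap methods works here, which is exactly the interchangeability the Task interface is designed to guarantee.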

Related Pages

Implemented By
