Principle: Princeton NLP Tree of Thoughts LLM Task Interface Design
| Knowledge Sources | |
|---|---|
| Domains | Software_Design, NLP |
| Last Updated | 2026-02-14 03:30 GMT |
Overview
An abstract base class pattern that defines the contract for benchmark tasks in the Tree of Thoughts framework, ensuring all tasks are interchangeable from the search algorithm's perspective.
Description
Task Interface Design uses the Template Method pattern to define a uniform interface that all benchmark tasks must implement. The Task base class declares the methods that the BFS search loop calls, while leaving the implementation details to each concrete subclass. This separation enables the search algorithm to operate generically across different problem domains without knowing their specifics.
The required interface consists of:
- `__init__()`: Load data and set `self.steps` (BFS depth) and `self.stops` (stop tokens per step).
- `__len__()`: Return the number of puzzles in the dataset.
- `get_input(idx)`: Return the string representation of puzzle `idx`.
- `test_output(idx, output)`: Validate a candidate solution against the ground truth.
- Prompt-wrap methods: task-specific methods that format the input and partial solution into LLM prompts; which ones are required depends on the generation/evaluation strategy.
Usage
Use this principle when adding a new benchmark task to the framework. The new task must subclass `Task` and implement all required methods to be compatible with `solve()`, `naive_solve()`, and the experiment loop.
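A new task might look like the following sketch. `SortTask` is a made-up example, not one of the framework's benchmarks; it only shows the minimum a subclass must provide (the `{"r": 0/1}` return shape of `test_output` mirrors the framework's convention):

```python
# Hypothetical benchmark task: sort a list of digits in a single step.
# Illustrative only; not part of the actual Tree of Thoughts repository.

class Task:  # abridged base class, as defined by the framework
    def __init__(self):
        self.steps = 0
        self.stops = []


class SortTask(Task):
    def __init__(self):
        super().__init__()
        self.data = ["3 1 2", "9 7 8"]  # toy "dataset" loaded in __init__
        self.steps = 1                   # solved in one BFS step
        self.stops = [None]              # no stop token for that step

    def __len__(self) -> int:
        return len(self.data)

    def get_input(self, idx: int) -> str:
        return self.data[idx]

    def test_output(self, idx: int, output: str) -> dict:
        expected = " ".join(sorted(self.data[idx].split()))
        return {"r": int(output.strip() == expected)}

    # One prompt-wrap method, as a sample-style task would define it:
    @staticmethod
    def standard_prompt_wrap(x: str, y: str = "") -> str:
        return f"Sort these numbers in ascending order: {x}\n{y}"
```

Because `SortTask` satisfies the contract, the generic search loop can run it without any domain-specific changes.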
Theoretical Basis
The Template Method pattern defines the skeleton of an algorithm in a base class, deferring some steps to subclasses:
```python
# Abstract pattern
class Task:
    def __init__(self):
        self.steps = 0   # BFS depth (set by subclass)
        self.stops = []  # Stop tokens per step (set by subclass)

    def __len__(self) -> int:
        raise NotImplementedError

    def get_input(self, idx: int) -> str:
        raise NotImplementedError

    def test_output(self, idx: int, output: str):
        raise NotImplementedError

    # Strategy-specific prompt methods (subset required per task):
    #   standard_prompt_wrap(x, y) -> str
    #   cot_prompt_wrap(x, y) -> str
    #   propose_prompt_wrap(x, y) -> str
    #   value_prompt_wrap(x, y) -> str
    #   value_outputs_unwrap(x, y, outputs) -> float
    #   vote_prompt_wrap(x, ys) -> str
    #   vote_outputs_unwrap(outputs, n) -> list
```
The strategy-specific methods divide into two patterns:
- Propose + Value: Used by Game of 24 and Crosswords. Task provides structured proposals and independent value scoring.
- Sample + Vote: Used by Creative Writing. Task provides standard/CoT prompts and comparative voting.
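One way to picture the split is as a dispatch in the solver: if the task supplies propose/value methods, the driver asks for structured next steps and scores each independently; if it supplies standard/CoT and vote methods, it samples full candidates and compares them. The helper names below (`generate_candidates`, `score_candidates`, `sample_fn`) are illustrative assumptions, not the framework's actual `solve()`:

```python
# Sketch of strategy dispatch between the two patterns (illustrative).

def generate_candidates(task, x, y, n, sample_fn):
    """Choose a generation strategy based on the methods the task provides."""
    if hasattr(task, "propose_prompt_wrap"):
        # Propose: one prompt yields several structured next steps.
        prompt = task.propose_prompt_wrap(x, y)
        return sample_fn(prompt).split("\n")
    # Sample: draw n independent full candidates via a CoT prompt.
    prompt = task.cot_prompt_wrap(x, y)
    return [sample_fn(prompt) for _ in range(n)]


def score_candidates(task, x, ys, sample_fn):
    """Value scores each candidate independently; vote compares them."""
    if hasattr(task, "value_prompt_wrap"):
        return [task.value_outputs_unwrap(x, y,
                                          [sample_fn(task.value_prompt_wrap(x, y))])
                for y in ys]
    votes = [sample_fn(task.vote_prompt_wrap(x, ys))]
    return task.vote_outputs_unwrap(votes, len(ys))
```

This is why Game of 24 and Crosswords implement the propose/value pair while Creative Writing implements standard/CoT and vote: each task opts into exactly one branch of the dispatch.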