Implementation: princeton-nlp/tree-of-thought-llm get_values
| Knowledge Sources | Details |
|---|---|
| Domains | LLM_Reasoning, Search_Algorithms, NLP |
| Last Updated | 2026-02-14 03:30 GMT |
Overview
Evaluation helpers from the Tree of Thoughts BFS module that score candidate thoughts, either by LLM-based value estimation or by LLM voting.
Description
The get_values function evaluates a list of candidates by independently scoring each one via the task's value prompt and unwrap methods. It uses a local deduplication cache to assign score 0 to duplicate candidates and delegates to get_value for individual evaluation, which itself caches results in task.value_cache to avoid redundant LLM calls.
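The dedup-and-cache flow described above can be sketched as follows. The `llm` stub and `FakeTask` class below are hypothetical stand-ins for the repository's LLM call and Task objects, so treat this as an illustration of the logic rather than the actual bfs.py source:

```python
def llm(prompt, n=1):
    # stand-in for the real LLM call; always answers "sure"
    return ["sure"] * n

def get_value(task, x, y, n_evaluate_sample, cache_value=True):
    # per-candidate evaluation with a cross-call cache on the task object
    value_prompt = task.value_prompt_wrap(x, y)
    if cache_value and value_prompt in task.value_cache:
        return task.value_cache[value_prompt]
    value_outputs = llm(value_prompt, n=n_evaluate_sample)
    value = task.value_outputs_unwrap(x, y, value_outputs)
    if cache_value:
        task.value_cache[value_prompt] = value
    return value

def get_values(task, x, ys, n_evaluate_sample, cache_value=True):
    # batch evaluation; repeats within one batch are scored 0
    values, seen = [], set()
    for y in ys:
        if y in seen:
            values.append(0)
        else:
            seen.add(y)
            values.append(get_value(task, x, y, n_evaluate_sample, cache_value))
    return values

class FakeTask:
    # hypothetical minimal task: prompt wrapping and score unwrapping only
    value_cache = {}
    def value_prompt_wrap(self, x, y):
        return f"Evaluate: {x} -> {y}"
    def value_outputs_unwrap(self, x, y, outputs):
        score_map = {"impossible": 0.001, "likely": 1, "sure": 20}
        return sum(score_map.get(o, 0) for o in outputs)

task = FakeTask()
print(get_values(task, "1 2 3 4", ["a", "b", "a"], n_evaluate_sample=3))
# -> [60, 60, 0]  (duplicate "a" scores 0; others sum 3 samples of 'sure'=20)
```

Note the two cache layers: the local `seen` set only suppresses duplicates within one batch, while `task.value_cache` persists across calls and skips repeated LLM evaluations entirely.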
The companion get_votes function implements the voting strategy by presenting all candidates simultaneously and counting LLM votes.
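The voting strategy can be sketched in the same spirit. Again, the `llm` stub and `FakeVoteTask` are hypothetical placeholders for the repository's LLM call and Task objects; the vote-parsing regex is an assumption for illustration, not the exact task code:

```python
import re

def llm(prompt, n=1):
    # stand-in LLM: every sample votes for choice 1
    return ["The best choice is 1"] * n

def get_votes(task, x, ys, n_evaluate_sample):
    # one prompt presents every candidate; each of the n samples casts one vote
    vote_prompt = task.vote_prompt_wrap(x, ys)
    vote_outputs = llm(vote_prompt, n=n_evaluate_sample)
    return task.vote_outputs_unwrap(vote_outputs, len(ys))

class FakeVoteTask:
    # hypothetical minimal task for voting
    def vote_prompt_wrap(self, x, ys):
        choices = "\n".join(f"Choice {i + 1}:\n{y}" for i, y in enumerate(ys))
        return f"{x}\n{choices}\nVote for the best choice."
    def vote_outputs_unwrap(self, vote_outputs, n_candidates):
        # tally "best choice is {i}" mentions (1-indexed, matching the prompt)
        votes = [0] * n_candidates
        for output in vote_outputs:
            match = re.search(r"best choice is (\d+)", output)
            if match and 1 <= int(match.group(1)) <= n_candidates:
                votes[int(match.group(1)) - 1] += 1
        return votes

task = FakeVoteTask()
print(get_votes(task, "prompt", ["plan A", "plan B"], n_evaluate_sample=5))
# -> [5, 0]  (the stub always votes for choice 1)
```

Unlike value scoring, voting is relative: candidates are compared against each other in a single prompt, so there is nothing to cache per candidate.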
Usage
Used within solve() when args.method_evaluate == 'value' (get_values) or args.method_evaluate == 'vote' (get_votes). The Game of 24 task uses value evaluation; the Creative Writing task uses vote evaluation.
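The dispatch inside solve() roughly follows this pattern; the `evaluate` wrapper and the stub evaluators below are illustrative stand-ins, not the repository's actual code:

```python
from types import SimpleNamespace

# hypothetical stand-ins for the real evaluators, for illustration only
def get_values(task, x, ys, n_evaluate_sample):
    return [1.0] * len(ys)

def get_votes(task, x, ys, n_evaluate_sample):
    return [1] * len(ys)

def evaluate(task, x, ys, args):
    # sketch of the args.method_evaluate dispatch inside solve()
    if args.method_evaluate == 'value':
        return get_values(task, x, ys, args.n_evaluate_sample)
    if args.method_evaluate == 'vote':
        return get_votes(task, x, ys, args.n_evaluate_sample)
    raise ValueError(f"unknown method_evaluate: {args.method_evaluate}")

args = SimpleNamespace(method_evaluate='value', n_evaluate_sample=3)
print(evaluate(None, "x", ["a", "b"], args))  # -> [1.0, 1.0]
```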
Code Reference
Source Location
- Repository: tree-of-thought-llm
- File: src/tot/methods/bfs.py
- Lines: 6-26 (get_value, get_values), 28-32 (get_votes)
Signature
```python
def get_value(task, x, y, n_evaluate_sample, cache_value=True):
    """
    Evaluate a single candidate thought.

    Args:
        task: Task object with value_prompt_wrap() and value_outputs_unwrap().
        x (str): Original problem input.
        y (str): Candidate partial solution to evaluate.
        n_evaluate_sample (int): Number of LLM evaluation samples.
        cache_value (bool): Whether to use task.value_cache (default True).

    Returns:
        float: Numeric score for the candidate.
    """

def get_values(task, x, ys, n_evaluate_sample, cache_value=True):
    """
    Evaluate a list of candidate thoughts.

    Args:
        task: Task object.
        x (str): Original problem input.
        ys (list[str]): List of candidate partial solutions.
        n_evaluate_sample (int): Number of evaluation samples per candidate.
        cache_value (bool): Whether to cache values (default True).

    Returns:
        list[float]: Scores for each candidate. Duplicates score 0.
    """

def get_votes(task, x, ys, n_evaluate_sample):
    """
    Evaluate candidates by LLM voting.

    Args:
        task: Task with vote_prompt_wrap() and vote_outputs_unwrap().
        x (str): Original problem input.
        ys (list[str]): Candidate solutions.
        n_evaluate_sample (int): Number of voting rounds.

    Returns:
        list[int]: Vote counts per candidate.
    """
```
Import
```python
from tot.methods.bfs import get_values, get_votes
```
I/O Contract
Inputs (get_values)
| Name | Type | Required | Description |
|---|---|---|---|
| task | Task | Yes | Task object with value_prompt_wrap, value_outputs_unwrap, value_cache |
| x | str | Yes | Original problem input |
| ys | list[str] | Yes | List of candidate partial solutions to evaluate |
| n_evaluate_sample | int | Yes | Number of LLM evaluation calls per candidate |
| cache_value | bool | No | Whether to use caching (default True) |
Inputs (get_votes)
| Name | Type | Required | Description |
|---|---|---|---|
| task | Task | Yes | Task object with vote_prompt_wrap, vote_outputs_unwrap |
| x | str | Yes | Original problem input |
| ys | list[str] | Yes | Candidate solutions to vote on |
| n_evaluate_sample | int | Yes | Number of voting rounds |
Outputs
| Name | Type | Description |
|---|---|---|
| get_values return | list[float] | Numeric scores per candidate (higher is better) |
| get_votes return | list[int] | Vote counts per candidate (higher is better) |
Usage Examples
Value Evaluation (Game of 24)
```python
from tot.tasks import get_task
from tot.methods.bfs import get_values

task = get_task('game24')
x = task.get_input(900)

# Candidates after one step of generation
ys = [
    "1 + 2 = 3 (left: 3 3 4)\n",
    "1 * 4 = 4 (left: 2 3 4)\n",
    "3 - 1 = 2 (left: 2 2 4)\n",
]

# Evaluate each candidate with 3 LLM calls; the per-sample labels are summed
values = get_values(task, x, ys, n_evaluate_sample=3)
# values might be [60.0, 1.002, 0.003]
# (all three samples 'sure'=20 for the first; one 'likely'=1 plus two
# 'impossible'=0.001 for the second; three 'impossible' for the third)
```
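The fractional scores come from summing a per-sample label weight, as the comments above suggest. A sketch of that aggregation step, using the sure/likely/impossible weights from the example (this is an illustration of the scoring scheme, not the exact task code):

```python
def unwrap_game24_values(value_outputs):
    # map each sample's final line to a weight and sum across samples;
    # weights (sure=20, likely=1, impossible=0.001) match the example above
    value_map = {'impossible': 0.001, 'likely': 1, 'sure': 20}
    score = 0.0
    for output in value_outputs:
        label = output.strip().split('\n')[-1].lower()
        score += value_map.get(label, 0)
    return score

print(unwrap_game24_values([
    '...reasoning...\nsure',
    '...reasoning...\nimpossible',
    '...reasoning...\nimpossible',
]))
# 20 + 0.001 + 0.001, i.e. approximately 20.002
```

Because the weights differ by orders of magnitude, a single 'sure' sample dominates any number of 'likely' or 'impossible' samples, which makes the scores easy to rank even when samples disagree.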
Vote Evaluation (Creative Writing)
```python
from tot.tasks import get_task
from tot.methods.bfs import get_votes

task = get_task('text')
x = task.get_input(0)
ys = ["Plan:\nFirst passage...", "Plan:\nSecond passage...", "Plan:\nThird passage..."]

votes = get_votes(task, x, ys, n_evaluate_sample=5)
# votes might be [3, 1, 1]: the first candidate received 3 of the 5 votes
```