Implementation:Iamhankai Forest of Thought Monte Carlo Tree
| Knowledge Sources | |
|---|---|
| Domains | Search_Algorithms, Reasoning |
| Last Updated | 2026-02-14 03:00 GMT |
Overview
Concrete tool for executing single-tree MCTS reasoning provided by the Forest-of-Thought repository.
Description
The monte_carlo_tree function implements the core MCTS loop for a single reasoning tree. It generates an initial answer, then iteratively refines it through UCB-guided selection, hint-based expansion, reward scoring, and backpropagation. The function maintains full tree state including reward banks, UCB values, and activation signals for tracking which answers are "mature" enough for forest-level aggregation.
Usage
Called by Monte_Carlo_Forest.mctsr_run() once per tree in the forest. Each tree operates independently with the same question but may produce different answers due to stochastic LLM generation.
Code Reference
Source Location
- Repository: Forest-of-Thought
- File: run_with_mcf_stop_noearly.py
- Lines: L223-312
Signature
def monte_carlo_tree(DATA_NAME, query, max_iter=16, ans_format=''):
"""
Execute MCTS for iterative answer refinement on a single problem.
Args:
DATA_NAME (str): Dataset identifier for label extraction.
query (str): The input question to solve.
max_iter (int): Maximum MCTS iterations (default 16).
ans_format (str): Expected answer format specification.
Returns:
tuple: (
tree_ans (str): Best selected answer,
is_activate (bool): Whether answer is mature,
activated_answers_list (list[str]): Extracted answer labels,
activated_answer_scores (list[float]): Reward scores,
answers_list (list[str]): All generated raw answers,
to_explore (list): Explorable nodes,
to_explore_reward (dict): Node-to-reward mapping,
ucb_bank (dict): UCB values for all nodes
)
"""
Import
# Defined inline in main script
from run_with_mcf_stop_noearly import monte_carlo_tree
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| DATA_NAME | str | Yes | Dataset name for format-aware label extraction |
| query | str | Yes | Math problem to solve via MCTS |
| max_iter | int | No | Number of MCTS iterations (default 16) |
| ans_format | str | No | Expected answer format hint |
Outputs
| Name | Type | Description |
|---|---|---|
| tree_ans | str | Best answer selected by the tree |
| is_activate | bool | True if the tree produced a mature answer |
| activated_answers_list | list[str] | Extracted answer labels from all iterations |
| activated_answer_scores | list[float] | Reward scores for each answer |
| answers_list | list[str] | All raw generated answer strings |
| to_explore | list | Remaining explorable nodes |
| to_explore_reward | dict | Reward mapping for explorable nodes |
| ucb_bank | dict | UCB values for all explored nodes |
Usage Examples
# Single tree MCTS execution (called internally by Monte_Carlo_Forest)
tree_ans, is_activate, act_answers, act_scores, all_answers, \
to_explore, rewards, ucb = monte_carlo_tree(
DATA_NAME="math",
query="Find all real numbers x such that x^2 - 5x + 6 = 0",
max_iter=8,
ans_format="\\boxed{}"
)
print(f"Best answer: {tree_ans}")
print(f"Activated: {is_activate}")
print(f"Answer labels: {act_answers}")