Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Princeton nlp Tree of thought llm DFS Crosswords Search

From Leeroopedia
Knowledge Sources
Domains Search_Algorithms, LLM_Reasoning, Crosswords
Last Updated 2026-02-14 14:00 GMT

Overview

Concrete tool for depth-first tree search over LLM-generated crossword puzzle actions, producing JSON trajectory logs with board state and confidence tracking.

Description

The dfs() function implements the DFS variant of Tree of Thoughts for the mini crosswords task. Defined in the Jupyter notebook search_crosswords-dfs.ipynb, it recursively explores candidate word placements ranked by LLM-assigned confidence scores. At each board state, it calls get_candidates_to_scores() to obtain GPT-4 proposals with parsed confidence levels (certain, high, medium, low), then tries each candidate in descending score order up to a per-state limit. The function records every step as a structured info dict and appends it to an infos list that is serialized to logs/crosswords/infoss_dfs_no_prune.json (without pruning) or logs/crosswords/infoss_dfs_prune.json (with pruning).

Usage

Use this implementation when running DFS-based Tree of Thoughts experiments on mini crossword puzzles. The notebook executes DFS for puzzle indices 0 to 100 (step 5) with a time limit of 100 steps and max 3 candidates per state. The resulting JSON log is the primary experimental artifact for the crosswords DFS results reported in the paper.

Code Reference

Source Location

Signature

def dfs(env, actions, infos, time_limit, prune, max_per_state):
    """
    Recursive depth-first search over crossword puzzle actions.

    Args:
        env: MiniCrosswordsEnv instance with current board state.
        actions: List[str] accumulator of actions taken on current path.
        infos: List[dict] accumulator of step info dicts (mutated in place).
        time_limit: int maximum total steps across all branches.
        prune: bool whether to prune branches with impossible constraints.
        max_per_state: int maximum candidates to try per board state.
    """
def get_candidates_to_scores(env):
    """
    Query GPT-4 for candidate word placements and aggregate confidence scores.

    Args:
        env: MiniCrosswordsEnv instance.

    Returns:
        dict mapping action strings (e.g., 'h2. motor') to float scores.
    """

Import

# Notebook-only: not importable as a module.
# Functions are defined inline in scripts/crosswords/search_crosswords-dfs.ipynb
from tot.tasks.crosswords import MiniCrosswordsEnv
from tot.prompts.crosswords import propose_prompt, value_prompt
from tot.models import gpt

I/O Contract

Inputs

Name Type Required Description
env MiniCrosswordsEnv Yes Crossword environment with board state, step, and cache
actions List[str] Yes Accumulator list for current DFS path actions
infos List[dict] Yes Accumulator list for all step info records (mutated)
time_limit int Yes Maximum total steps across all DFS branches (default: 100)
prune bool Yes If True, prune branches where any constraint is impossible
max_per_state int Yes Maximum candidates to expand per board state (default: 3)

Outputs

Name Type Description
infos (mutated) List[dict] Each dict contains: total_step, env_step, actions, info (r_letter, r_word), count (sure, maybe, impossible)
JSON log file File Written to logs/crosswords/infoss_dfs_no_prune.json or infoss_dfs_prune.json

Output JSON Schema

[
    [
        {
            "total_step": 0,
            "env_step": 1,
            "actions": ["h2. motor"],
            "info": {"r_letter": 0.2, "r_word": 0.1},
            "count": {"sure": 1, "maybe": 0, "impossible": 0}
        }
    ]
]
  • total_step: Global step counter across all branches for this puzzle
  • env_step: Number of words placed on the current board
  • actions: Ordered list of placements (e.g., "h2. motor" = horizontal slot 2, word "motor")
  • info.r_letter: Fraction of correctly placed letters
  • info.r_word: Fraction of correctly completed words
  • count.sure/maybe/impossible: Constraint satisfaction counts from env.prompt_status()

Usage Examples

Running DFS Without Pruning

import json
from tot.tasks.crosswords import MiniCrosswordsEnv

env = MiniCrosswordsEnv()

# Run DFS on puzzles 0, 5, 10, ..., 95
infoss = []
for i in range(0, 100, 5):
    env.reset(i)
    infos = []
    actions = []
    dfs(env, actions, infos, time_limit=100, prune=False, max_per_state=3)
    infoss.append(infos)

# Save results
with open('logs/crosswords/infoss_dfs_no_prune.json', 'w') as fout:
    json.dump(infoss, fout)

Running DFS With Pruning

# With pruning: branches with impossible constraints are cut
infoss = []
for i in range(0, 100, 5):
    env.reset(i)
    infos = []
    actions = []
    dfs(env, actions, infos, time_limit=100, prune=True, max_per_state=3)
    infoss.append(infos)

with open('logs/crosswords/infoss_dfs_prune.json', 'w') as fout:
    json.dump(infoss, fout)

Analyzing DFS Results

import json

with open('logs/crosswords/infoss_dfs_no_prune.json') as f:
    infoss = json.load(f)

# Each element is a list of DFS step records for one puzzle
for puzzle_idx, infos in enumerate(infoss):
    if infos:
        best = max(infos, key=lambda x: x['info']['r_word'])
        print(f"Puzzle {puzzle_idx * 5}: best r_word={best['info']['r_word']}, "
              f"steps={len(infos)}, words placed={best['env_step']}")

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment