Heuristic: SqueezeAILab ETS Max Depth and Token Guards
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Debugging |
| Last Updated | 2026-02-14 02:30 GMT |
Overview
Guard against infinite tree expansion by enforcing a hard depth limit of 25 and a configurable per-trajectory token budget (`max_tokens: 1024`), pruning nodes that exceed either limit.
Description
The ETS tree search loop has two termination guards that prevent runaway expansion. First, a hard depth limit of 25 levels is enforced in the search loop: once the tree reaches depth 25, the search terminates regardless of remaining budget. Second, a cumulative token count is tracked per trajectory; nodes whose cumulative tokens reach or exceed `max_tokens` are treated as terminal (the remaining width is decremented and the node is not expanded further). Together, these guards ensure the search always terminates within bounded compute, even if the model never produces a terminal answer phrase.
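A minimal sketch of how the two guards might interact along a single trajectory (function and variable names here are hypothetical; the real implementation lives in `rebase.py`):

```python
# Sketch of a two-guard expansion loop (hypothetical names, not the real API).
# Expansion stops when either the hard depth limit or the per-trajectory
# token budget is hit, whichever comes first.

MAX_DEPTH = 25        # hardcoded depth guard
MAX_TOKENS = 1024     # configurable per-trajectory token budget

def expand_trajectory(step_tokens=300):
    """Count how many steps fit before a guard fires."""
    depth = 0
    cum_tokens = 0
    expansions = 0
    while True:
        if depth >= MAX_DEPTH:          # guard 1: hard depth limit
            break
        if cum_tokens >= MAX_TOKENS:    # guard 2: token budget -> terminal
            break
        cum_tokens += step_tokens
        depth += 1
        expansions += 1
    return expansions

print(expand_trajectory())               # 300-token steps: stops after 4 (1200 >= 1024)
print(expand_trajectory(step_tokens=1))  # tiny steps: depth guard fires at 25
```

With long steps the token budget fires first; with many very short steps the depth limit is the binding constraint, matching the "safety net" role described above.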
Usage
These guards are always active and require no configuration for the depth limit (hardcoded at 25). The token limit is configurable via `max_tokens` in the YAML config (default: 1024). Increase `max_tokens` for problems requiring longer reasoning chains; decrease it to save compute on simpler tasks.
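For example, a hypothetical config variant for longer reasoning chains might double the budget (filename and values illustrative, not taken from the source):

```yaml
# hypothetical variant of ets_16_math500.yaml for longer chains
max_step_tokens: 256   # unchanged per-step cap
max_tokens: 2048       # doubled trajectory budget: up to ~8 full-length steps
```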
The Insight (Rule of Thumb)
- Action: Keep the hardcoded depth guard at 25 (no change needed). Set `max_tokens: 1024` in the YAML config for standard math reasoning.
- Value: Depth limit = 25, token limit = 1024, step token limit = 256.
- Trade-off: Higher `max_tokens` allows longer reasoning chains but increases compute cost and KV cache memory. The depth limit of 25 is generous for math problems (most solutions complete in 5-10 steps) and mainly serves as a safety net.
Reasoning
Without these guards, a tree search could expand indefinitely if:
- The model generates steps that never contain terminal phrases ("The answer is", "The final answer is:", "Therefore, the final answer is").
- The reward model assigns high scores to non-terminal nodes, causing the selection algorithm to keep expanding them.
A per-step cap of `max_step_tokens=256`, combined with `max_tokens=1024`, means a trajectory can contain at most ~4 full-length steps before exhausting the token budget. In practice, most math reasoning steps are shorter, so trajectories typically fit 6-10 steps.
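The budget arithmetic can be checked directly from the two config values:

```python
# Values from hype-parameters/ets_16_math500.yaml
max_step_tokens = 256
max_tokens = 1024

# Worst case: every step uses the full per-step cap.
full_length_steps = max_tokens // max_step_tokens
print(full_length_steps)  # prints: 4

# Shorter steps fit more expansions, e.g. ~128-token steps:
print(max_tokens // 128)  # prints: 8
```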
The depth limit of 25 is a separate safety net that catches edge cases where many very short steps accumulate without hitting the token limit.
Leaf detection uses string matching for answer phrases (`rebase.py:60`), which is model-type-specific. The depth and token guards provide model-agnostic termination guarantees.
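The node-level bookkeeping and phrase-based leaf detection can be sketched as follows (a simplified reconstruction for illustration, not the real `TreeNode` class):

```python
class TreeNode:
    """Simplified sketch of depth/token bookkeeping and leaf detection."""

    def __init__(self, text, num_step_tokens, parent=None):
        self.text_ = text
        self.parent = parent
        if parent is not None:
            # Inherit and extend the parent's depth and token counts.
            self.depth = parent.depth + 1
            self.cum_tokens = parent.cum_tokens + num_step_tokens
        else:
            self.depth = 0
            self.cum_tokens = num_step_tokens
        # Mirror rebase.py:60 — the first two phrases are checked against
        # the full text, the third only against the newly generated suffix.
        new_text = text[len(parent.text_):] if parent else text
        self.leaf_ = (
            "The answer is" in text
            or "The final answer is:" in text
            or "Therefore, the final answer is" in new_text
        )

root = TreeNode("Let x = 3.", num_step_tokens=8)
child = TreeNode(root.text_ + " The answer is 3.", num_step_tokens=7, parent=root)
print(child.depth, child.cum_tokens, child.leaf_)  # prints: 1 15 True
```

A terminal check such as `node.leaf_ or node.cum_tokens >= max_tokens` then combines the model-specific phrase matching with the model-agnostic token guard.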
Code Evidence
Hard depth limit from `rebase.py:662-663`:
```python
if depth >= 25:
    break
```
Token budget check in select_and_expand from `rebase.py:608`:
```python
if node.is_leaf() == True or node.get_cum_tokens() >= self.paras["max_tokens"]:
    self.remaining_width -= 1
```
Cumulative token tracking in TreeNode from `rebase.py:62-67`:
```python
if parent is not None:
    self.depth = parent.get_depth() + 1
    self.cum_tokens = parent.get_cum_tokens() + num_step_tokens
else:
    self.depth = 0
    self.cum_tokens = num_step_tokens
```
Leaf detection via answer phrases from `rebase.py:60`:
```python
if "The answer is" in self.text_ or "The final answer is:" in self.text_ or "Therefore, the final answer is" in self.text_[len(parent.get_text()):]:
    self.leaf_ = True
```
YAML config values from `hype-parameters/ets_16_math500.yaml:2-3`:
```yaml
max_step_tokens: 256
max_tokens: 1024
```