Principle: SqueezeAILab ETS Tree Node Expansion
| Knowledge Sources | |
|---|---|
| Domains | Tree_Search, Inference_Time_Compute |
| Last Updated | 2026-02-14 02:00 GMT |
Overview
A tree expansion mechanism that generates multiple child reasoning steps from a parent node by forking the KV cache state and scoring each new step with a Process Reward Model (PRM).
Description
Tree node expansion is the growth operation that produces new candidate reasoning steps. Given a selected parent node and a width allocation, the expansion:
- Forks the parent's KV cache state into multiple copies (one per child)
- Sets the PRM as the score backend for each forked state
- Generates the next reasoning step using the policy model with configured stopping conditions
- Scores each generated step with a PRM forward pass (a zero-token generation that extracts logits at specific positions)
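The zero-token scoring step above can be sketched as follows. This is a minimal, hypothetical illustration of reading a step score out of PRM logits at the delimiter position; the token ids and the two-way softmax are assumptions, not the exact implementation.

```python
import math

# Assumed token ids for the PRM's "correct step" / "incorrect step" labels.
GOOD_TOKEN_ID = 0
BAD_TOKEN_ID = 1

def prm_step_score(logits_at_delimiter):
    """Softmax-normalize the good/bad logits into a step score in [0, 1].

    `logits_at_delimiter` is the logit vector the PRM produced at the
    step-delimiter position; no new tokens are generated to obtain it.
    """
    good = logits_at_delimiter[GOOD_TOKEN_ID]
    bad = logits_at_delimiter[BAD_TOKEN_ID]
    m = max(good, bad)  # subtract the max for numerical stability
    e_good = math.exp(good - m)
    e_bad = math.exp(bad - m)
    return e_good / (e_good + e_bad)

score = prm_step_score([2.0, -1.0, 0.3])  # high score: "good" logit dominates
```

Because the score is read directly from existing logits, scoring adds one forward pass per child but no decoding steps.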
The stopping condition and scoring mechanism are model-type-specific:
- llemma/mistral: Stop at "ки" (Cyrillic step delimiter), score via logits at specific token IDs
- llama: Stop at "\n\n", score via logits at position-indexed token IDs
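The model-type-specific conventions above amount to a small lookup table. A sketch, with hypothetical structure and field names (only the stop strings come from the source):

```python
# Hypothetical per-model-type configuration for stopping and scoring.
# The "scoring" values are descriptive labels, not real API identifiers.
STEP_CONFIG = {
    "llemma":  {"stop": "ки",   "scoring": "fixed_token_ids"},
    "mistral": {"stop": "ки",   "scoring": "fixed_token_ids"},
    "llama":   {"stop": "\n\n", "scoring": "position_indexed_token_ids"},
}

def stop_string(model_type):
    """Return the step-delimiter string used to stop generation."""
    return STEP_CONFIG[model_type]["stop"]
```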
After generation, each forked state is paired with its parent node and added to a running list for subsequent insertion into the tree.
Usage
This expansion occurs at every depth of the tree search, for each node selected for expansion. It is the primary compute-consuming operation, since it invokes both the policy model for text generation and the reward model for scoring.
Theoretical Basis
Tree expansion in the context of KV cache-aware search is designed to maximize cache reuse:
```python
# Abstract expansion procedure
def expand(parent_node, width):
    # Fork the parent's KV cache: all children share the prefix entries
    forked_states = parent_node.kv_state.fork(width)
    for state in forked_states:
        state.score_backend = reward_model        # PRM will score this state
        state += generate(stop="step_delimiter")  # Generate next reasoning step
        state += score(forward_only=True)         # Zero-token PRM forward pass
        add_to_tree(state, parent_node)
```
The KV cache forking operation is efficient because all children share the same prefix computation. Only the newly generated tokens require fresh KV entries, which makes tree search more memory-efficient than independently sampling full trajectories.
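A back-of-the-envelope comparison makes the savings concrete. Assuming a shared prefix of `prefix_len` tokens and `width` children each generating `step_len` new tokens (illustrative numbers, not measurements from the source):

```python
def kv_entries_forked(prefix_len, width, step_len):
    # Prefix KV entries are computed once and shared by all forked children;
    # only each child's newly generated tokens need fresh entries.
    return prefix_len + width * step_len

def kv_entries_independent(prefix_len, width, step_len):
    # Independent sampling stores the full prefix per trajectory.
    return width * (prefix_len + step_len)

forked = kv_entries_forked(1000, 4, 100)            # 1000 + 4*100 = 1400
independent = kv_entries_independent(1000, 4, 100)  # 4 * 1100    = 4400
```

The gap widens with deeper trees and wider expansions, since the shared prefix grows while each new step stays short.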