Principle: SqueezeAILab ETS Result Collection
| Knowledge Sources | |
|---|---|
| Domains | Data_Management, Experiment_Logging |
| Last Updated | 2026-02-14 02:00 GMT |
Overview
A data serialization pattern that collects tree search results and statistics into structured JSON output files for downstream evaluation.
Description
After the tree search completes for all questions, the results must be serialized into a standardized format that the evaluation pipeline can consume. This involves two outputs:
- answers.json: A JSON array where each element contains the question, all candidate answers with their step-by-step PRM scores, the ground truth answer, and total token count
- stats.log: A text log recording aggregate statistics including total KV cache size, number of model calls, number of tokens generated, and wall-clock time
The answer format is designed to support multiple downstream evaluation strategies (best-of-n, majority voting) by preserving all candidates rather than selecting a single answer.
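The two outputs can be sketched as a small serialization step. This is a minimal illustration, not the actual `rebase.py` code: the function name `save_results` and the `stats_path` parameter are assumptions, while the result fields follow the answers.json format described above.

```python
import json

def save_results(results, output_path, stats, stats_path="stats.log"):
    """Illustrative sketch: persist search results and aggregate stats.

    `save_results` and `stats_path` are hypothetical names; the field
    layout of `results` mirrors the answers.json format above.
    """
    # answers.json: a single JSON array, one element per question
    with open(output_path, "w") as f:
        json.dump(results, f, indent=2)
    # stats.log: plain-text key/value lines of aggregate statistics
    with open(stats_path, "w") as f:
        for key, value in stats.items():
            f.write(f"{key}: {value}\n")
```

Keeping the statistics in a separate plain-text log lets the JSON array stay strictly machine-readable while the run metadata remains easy to inspect by eye.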
Usage
This step executes automatically at the end of main() in rebase.py, after all questions have been processed by reward_guided_search.run_batch(). The output file path is specified via --output_path.
Theoretical Basis
Preserving all candidate trajectories (rather than selecting the best one) enables post-hoc evaluation with different strategies:
```python
# Abstract result format: one dict per question
results = [
    {
        "id": question_index,
        "question": problem_text,
        "model_answer": [
            {"text": full_trajectory, "step_scores": [s1, s2, ...]},
            # ... one entry per leaf-node trajectory
        ],
        "ground_truth_answer": reference,
        "total_tokens": token_count,
    }
    # ... one such dict for each question
]
```
This decouples the search phase from the evaluation phase, allowing researchers to compare different aggregation and voting strategies without re-running the expensive tree search.
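As a concrete illustration of post-hoc evaluation over preserved candidates, the sketch below implements two such strategies against the abstract result format. Both functions are hypothetical: scoring a trajectory by its minimum PRM step score is one common aggregation (the actual pipeline may use the last-step score or a product instead), and `extract` stands in for whatever answer-extraction logic the evaluation uses.

```python
from collections import Counter

def best_of_n(candidates):
    # Score each trajectory by its minimum PRM step score (an assumed
    # aggregation) and return the text of the highest-scoring one.
    return max(candidates, key=lambda c: min(c["step_scores"]))["text"]

def majority_vote(candidates, extract=lambda text: text.split()[-1]):
    # `extract` is a placeholder for the real answer-extraction logic
    # (e.g. pulling the boxed final answer out of a math solution).
    votes = Counter(extract(c["text"]) for c in candidates)
    return votes.most_common(1)[0][0]
```

Because every leaf trajectory is kept in `model_answer`, either function can be run (or swapped for a new strategy) over a saved answers.json without touching the search code.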