Principle:SqueezeAILab ETS Result Collection

From Leeroopedia
Knowledge Sources
Domains: Data_Management, Experiment_Logging
Last Updated: 2026-02-14 02:00 GMT

Overview

A data serialization pattern that collects tree search results and statistics into structured JSON output files for downstream evaluation.

Description

After the tree search completes for all questions, the results must be serialized into a standardized format that the evaluation pipeline can consume. This involves two outputs:

  • answers.json: A JSON array where each element contains the question, all candidate answers with their step-by-step PRM scores, the ground truth answer, and total token count
  • stats.log: A text log recording aggregate statistics including total KV cache size, number of model calls, number of tokens generated, and wall-clock time
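The two outputs above can be sketched as follows. This is a minimal illustration, not the code in rebase.py: the helper name write_results, the stats key names, and the sample record are all hypothetical, and the way the stats.log path is chosen is an assumption.

```python
import json
import os
import tempfile


def write_results(results, stats, output_path, stats_path):
    """Write per-question results as a JSON array and aggregate stats as text."""
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2, ensure_ascii=False)
    with open(stats_path, "w", encoding="utf-8") as f:
        # One "name: value" line per aggregate statistic.
        for name, value in stats.items():
            f.write(f"{name}: {value}\n")


# Hypothetical example data, for illustration only.
example_results = [
    {
        "id": 0,
        "question": "What is 2 + 3?",
        "model_answer": [
            {"text": "2 + 3 = 5. The answer is 5", "step_scores": [0.9, 0.95]},
            {"text": "2 + 3 = 6. The answer is 6", "step_scores": [0.4, 0.2]},
        ],
        "ground_truth_answer": "5",
        "total_tokens": 24,
    }
]
# Hypothetical key names; the real log may label these differently.
example_stats = {
    "total_kv_cache_size": 2048,
    "model_calls": 6,
    "tokens_generated": 24,
    "wall_clock_seconds": 1.5,
}

out_dir = tempfile.mkdtemp()
answers_file = os.path.join(out_dir, "answers.json")
stats_file = os.path.join(out_dir, "stats.log")
write_results(example_results, example_stats, answers_file, stats_file)
```

Writing the whole array in one json.dump call keeps the file a single valid JSON document, which is what a consumer calling json.load would expect.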

The answer format is designed to support multiple downstream evaluation strategies (best-of-n, majority voting) by preserving all candidates rather than selecting a single answer.
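To illustrate why keeping every candidate matters, the sketch below applies both strategies to one question's candidate list. It rests on assumptions not stated in the source: extract_answer is a hypothetical helper that reads the text after "The answer is", and best-of-n ranks candidates by their final step score (other reductions of step_scores, such as the minimum or the product, are equally plausible).

```python
from collections import Counter


def extract_answer(text):
    """Hypothetical extractor: take whatever follows the last 'The answer is'."""
    marker = "The answer is"
    if marker not in text:
        return None
    return text.rsplit(marker, 1)[1].strip().rstrip(".")


def best_of_n(candidates):
    """Return the answer of the candidate with the highest final step score."""
    scored = [c for c in candidates if c["step_scores"]]
    if not scored:
        return None
    best = max(scored, key=lambda c: c["step_scores"][-1])
    return extract_answer(best["text"])


def majority_vote(candidates):
    """Return the most frequent extracted answer, ignoring failed extractions."""
    answers = [extract_answer(c["text"]) for c in candidates]
    answers = [a for a in answers if a is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Hypothetical candidates for a single question.
candidates = [
    {"text": "... The answer is 5", "step_scores": [0.9, 0.6]},
    {"text": "... The answer is 6", "step_scores": [0.7, 0.8]},
    {"text": "... The answer is 5", "step_scores": [0.8, 0.5]},
]
print(best_of_n(candidates))      # 6
print(majority_vote(candidates))  # 5
```

The two strategies disagree on this input, which is only observable because all three candidates were stored.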

Usage

This step executes automatically at the end of main() in rebase.py, after all questions have been processed by reward_guided_search.run_batch(). The output file path is specified via --output_path.

Theoretical Basis

Preserving all candidate trajectories, rather than only the best one, enables post-hoc evaluation with different strategies. Each question's record has the following abstract shape:

# Abstract result format: a list with one entry per question
results = [
    {
        "id": question_index,
        "question": problem_text,
        "model_answer": [
            # one entry per leaf-node trajectory
            {"text": full_trajectory, "step_scores": [s1, s2, ...]},
        ],
        "ground_truth_answer": reference,
        "total_tokens": token_count,
    },
    # ... repeated for each question
]

This decouples the search phase from the evaluation phase, allowing researchers to compare different aggregation and voting strategies without re-running the expensive tree search.
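A rough sketch of such a comparison is shown below. The function names and the two placeholder strategies are hypothetical, and for brevity each candidate's "text" is treated as the final answer; a real evaluator would extract the answer from the full trajectory first. The records are parsed from an inline string here, standing in for the contents of answers.json.

```python
import json


def accuracy(records, select):
    """Fraction of records where select(candidates) equals the ground truth."""
    if not records:
        return 0.0
    correct = sum(
        1 for r in records
        if select(r["model_answer"]) == r["ground_truth_answer"]
    )
    return correct / len(records)


def compare_strategies(records, strategies):
    """Score every strategy on the same saved records, without re-running search."""
    return {name: accuracy(records, select) for name, select in strategies.items()}


# Placeholder strategies (assumptions, not the project's evaluators).
strategies = {
    "first_candidate": lambda cands: cands[0]["text"] if cands else None,
    "highest_final_score": lambda cands: max(
        cands, key=lambda c: c["step_scores"][-1]
    )["text"] if cands else None,
}

# Hypothetical records in the result format described above.
records = json.loads("""
[
  {"id": 0, "question": "q0", "ground_truth_answer": "5", "total_tokens": 10,
   "model_answer": [{"text": "5", "step_scores": [0.9]},
                    {"text": "6", "step_scores": [0.2]}]},
  {"id": 1, "question": "q1", "ground_truth_answer": "8", "total_tokens": 12,
   "model_answer": [{"text": "7", "step_scores": [0.3]},
                    {"text": "8", "step_scores": [0.8]}]}
]
""")

scores = compare_strategies(records, strategies)
print(scores)
```

Adding a new strategy only requires another entry in the strategies mapping; the saved records are reused unchanged.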

Related Pages

Implemented By
