Principle:Princeton NLP Tree of Thoughts LLM Usage Tracking
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Experiment_Management |
| Last Updated | 2026-02-14 03:30 GMT |
Overview
A global accumulation mechanism that tracks API token consumption and computes estimated cost across all LLM calls during an experiment.
Description
Usage Tracking addresses the need to understand the computational cost of tree search experiments that make many LLM calls. Since the Tree of Thoughts approach requires multiple LLM calls per search step (generation + evaluation) across multiple depth levels and puzzle instances, the total token usage can be substantial. By tracking prompt and completion tokens globally, the framework enables:
- Cost estimation: Apply model-specific per-token pricing to compute dollar cost.
- Method comparison: Compare token efficiency between ToT BFS and naive baselines.
- Budgeting: Monitor spending across an experiment run in real time.
Usage
Use this principle at the end of an experiment run (or per-puzzle) to report accumulated token usage and cost. It is the final reporting step in both ToT BFS and baseline experiments.
Theoretical Basis
Token usage accumulates across all LLM calls during an experiment:
$$\text{total\_cost} = \frac{\text{prompt\_tokens}}{1000} \times p_{\text{prompt}} + \frac{\text{completion\_tokens}}{1000} \times p_{\text{completion}}$$
where $p_{\text{prompt}}$ and $p_{\text{completion}}$ are model-specific per-1K-token prices:
| Model | Prompt ($/1K) | Completion ($/1K) |
|---|---|---|
| gpt-4 | $0.03 | $0.06 |
| gpt-3.5-turbo | $0.0015 | $0.002 |
| gpt-4o | $0.0025 | $0.01 |
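Applying the formula with these prices can be sketched as follows (the `estimated_cost` helper and its price dictionary are illustrative, not the framework's actual reporting function):

```python
# Per-1K-token prices in USD: (prompt, completion) pairs from the table above.
PRICES = {
    "gpt-4": (0.03, 0.06),
    "gpt-3.5-turbo": (0.0015, 0.002),
}

def estimated_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Dollar cost: tokens / 1000 times the per-1K price, summed over both token types."""
    p_prompt, p_completion = PRICES[model]
    return prompt_tokens / 1000 * p_prompt + completion_tokens / 1000 * p_completion

# Example: 120,000 prompt + 30,000 completion tokens on gpt-4
# -> 120 * 0.03 + 30 * 0.06 = 3.60 + 1.80 = 5.40 dollars
```

The worked example makes the scale concrete: a ToT BFS run that consumes 150K tokens on gpt-4 costs a few dollars, while the same run on gpt-3.5-turbo costs a few cents.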