Principle:Spcl Graph of thoughts Thought Generation
| Knowledge Sources | |
|---|---|
| Domains | Graph_Reasoning, Thought_Operations |
| Implementations | Implementation:Spcl_Graph_of_thoughts_Generate_Operation |
| Last Updated | 2026-02-14 |
Overview
Operation pattern that produces new thought states by prompting a language model with existing states or initial problem parameters.
Generation is the fundamental "thinking" step in the Graph of Thoughts framework: every piece of reasoning, every candidate solution, and every decomposition of a problem originates from a Generate operation. Without generation, the graph has no thought states to score, filter, aggregate, or improve.
Core Concepts
The Role of Generate in the Graph
In the Graph of Thoughts, a thought is a dictionary of state representing a partial or complete solution. The Generate operation creates new thoughts by:
- Taking existing thought states from predecessor operations (or the initial problem parameters if there are no predecessors)
- Constructing a prompt from those states via the Prompter
- Sending the prompt to the language model
- Parsing the LLM's response(s) into new thought state dictionaries via the Parser
- Creating new Thought objects that carry those states forward in the graph
Generate operations can appear at any point in the graph: as root nodes that initiate reasoning from raw problem parameters, as intermediate nodes that refine or decompose existing thoughts, or as leaf-adjacent nodes that produce final candidate solutions.
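The steps above can be sketched as a single function. This is a minimal, illustrative sketch, not the library's actual implementation; the collaborator objects (prompter, parser, lm) and their method names follow the interfaces described in this article:

```python
# Sketch of the generate step: prompt construction, LM calls, parsing,
# and state merging. Names are illustrative, not the library's real API.

def generate(prompter, parser, lm, previous_thoughts,
             num_branches_prompt=1, num_branches_response=1):
    """Produce new thought states from predecessor states."""
    new_thoughts = []
    for base_state in previous_thoughts:  # each thought is a dict of state
        prompt = prompter.generate_prompt(num_branches_prompt, **base_state)
        # One set of LM calls per base state; each call is an independent
        # response, and each response may contain several solutions when
        # num_branches_prompt > 1.
        responses = lm.query(prompt, num_responses=num_branches_response)
        for new_state in parser.parse_generate_answer(base_state, responses):
            # Merge: keep the original context, overlay generated content.
            new_thoughts.append({**base_state, **new_state})
    return new_thoughts
```

With stub collaborators, one predecessor thought and three response branches yield three new thoughts, each still carrying the original state keys.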
Branching: How Multiple Thoughts Are Produced
The Generate operation controls the breadth of exploration through two parameters:
- num_branches_prompt -- the number of solutions requested within a single prompt. This value is passed to the Prompter, which embeds it in the prompt text (e.g., "Generate 5 different sorted lists"). The LLM returns all requested solutions in one response, and the Parser extracts each one as a separate thought.
- num_branches_response -- the number of separate LLM calls made for each prompt. Each call produces an independent response from the model (leveraging temperature-based randomness). This value is passed as num_responses to the language model's query() method.
The total number of new thoughts produced from a single predecessor thought is:
num_branches_prompt x num_branches_response
When applied across all predecessor thoughts, the total new thoughts generated by the operation is:
previous_thoughts x num_branches_prompt x num_branches_response
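A worked example of this arithmetic, with illustrative numbers:

```python
# Branching arithmetic from the formulas above (numbers are illustrative).
previous_thoughts = 4        # thoughts arriving from predecessor operations
num_branches_prompt = 5      # e.g. "Generate 5 different sorted lists" in one prompt
num_branches_response = 2    # two independent LM calls per prompt

per_thought = num_branches_prompt * num_branches_response  # thoughts per predecessor
total = previous_thoughts * per_thought                    # thoughts for the whole operation
```

So each predecessor thought fans out into 10 new thoughts, and the operation as a whole produces 40.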
Root Generation: Using Problem Parameters
When a Generate operation has no predecessors in the graph (i.e., it is a root node), it uses the kwargs passed to the Controller as the base state. These kwargs represent the initial problem parameters, such as the unsorted list in a sorting task or the documents in a merging task.
Specifically:
- If there are no predecessors and no predecessor operations at all, the kwargs are wrapped in a Thought and used as the base state
- If there are predecessor operations but they produced no thoughts (e.g., an upstream filter removed everything), the Generate operation produces nothing -- it does not fall back to kwargs
This distinction ensures that root Generate operations bootstrap the reasoning process, while downstream Generate operations respect the graph's data flow.
Prompt-Parse Pipeline
The Generate operation delegates prompt construction and response parsing to two collaborator objects:
- Prompter.generate_prompt(num_branches_prompt, **base_state) -- constructs the LLM prompt from the base state and the desired number of branches. The prompt format is domain-specific (sorting prompts differ from document merging prompts).
- Parser.parse_generate_answer(base_state, responses) -- extracts new state dictionaries from the LLM's text responses. Returns a list of dictionaries, each representing one new thought's state update.
The new thought's state is the union of the base state and the parsed update: {**base_state, **new_state}. This ensures that downstream operations have access to both the original context and the newly generated content.
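Concretely, the merge behaves like standard Python dict unpacking, with the parsed update overwriting any overlapping keys (example values are illustrative):

```python
# The parsed update overlays the base state; keys only in the base state
# (e.g. the original input) survive into the new thought.
base_state = {"original": "[3, 1, 2]", "current": ""}
new_state = {"current": "[1, 2, 3]"}

merged = {**base_state, **new_state}
```

Here the new thought keeps the original unsorted input while its current solution is replaced by the generated one.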
Interaction with Other Operations
Generate operations typically appear in these patterns:
- Generate alone (IO pattern) -- a single Generate at the root produces one thought, which is directly evaluated
- Generate + Score + KeepBestN (ToT/GoT pattern) -- Generate produces multiple candidates, Score evaluates them, and KeepBestN selects the best
- Generate + Aggregate (GoT merge pattern) -- multiple Generate branches produce partial results, which are merged by Aggregate
- Generate + Improve (refinement pattern) -- a Generate produces an initial solution, and a subsequent Generate (via Improve) refines it based on feedback
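The Generate + Score + KeepBestN pattern can be sketched with plain functions standing in for the library's operation classes (the function names and scoring rule here are illustrative only):

```python
# Sketch of one ToT/GoT step: generate candidates, score them, keep the
# best N. The generate/score callables stand in for operation classes.

def run_tot_step(generate, score, num_candidates, keep_n):
    thoughts = generate(num_candidates)         # Generate: produce candidates
    scored = [(score(t), t) for t in thoughts]  # Score: evaluate each thought
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:keep_n]]      # KeepBestN: prune the graph
```

In the real graph, each stage is a separate operation node; collapsing them into one function just makes the data flow visible.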
Design Rationale
The two-parameter branching model (prompt branches vs. response branches) provides fine-grained control over exploration:
- Prompt branching is efficient when the LLM can produce multiple distinct outputs in a single call (saving API cost)
- Response branching leverages temperature-based stochasticity to get genuinely diverse outputs across multiple calls
The warning mechanism that fires when more thoughts are produced than expected (num_branches_prompt * num_branches_response * len(previous_thoughts)) helps detect Parser bugs that might create spurious thoughts from a single response.
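A sketch of that sanity check, assuming a hypothetical helper name (the real library's warning wording and placement may differ):

```python
# Sketch of the thought-count sanity check: more thoughts than the bound
# usually means the Parser split one response into spurious thoughts.
import logging

def check_thought_count(new_thoughts, previous_thoughts,
                        num_branches_prompt, num_branches_response):
    expected = (num_branches_prompt * num_branches_response
                * len(previous_thoughts))
    if len(new_thoughts) > expected:
        logging.warning("Generate produced %d thoughts, expected at most %d",
                        len(new_thoughts), expected)
    return len(new_thoughts) <= expected
```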