Principle:Spcl Graph of thoughts Thought Generation
| Knowledge Sources | |
|---|---|
| Domains | Graph_Reasoning, Thought_Operations |
| Implementations | Implementation:Spcl_Graph_of_thoughts_Generate_Operation |
| Last Updated | 2026-02-14 |
Overview
Operation pattern that produces new thought states by prompting a language model with existing states or initial problem parameters.
Generation is the fundamental "thinking" step in the Graph of Thoughts framework: every piece of reasoning, every candidate solution, and every decomposition of a problem originates from a Generate operation. Without generation, the graph has no thought states to score, filter, aggregate, or improve.
Core Concepts
The Role of Generate in the Graph
In the Graph of Thoughts, a thought is a dictionary of state representing a partial or complete solution. The Generate operation creates new thoughts by:
- Taking existing thought states from predecessor operations (or the initial problem parameters if there are no predecessors)
- Constructing a prompt from those states via the Prompter
- Sending the prompt to the language model
- Parsing the LLM's response(s) into new thought state dictionaries via the Parser
- Creating new Thought objects that carry those states forward in the graph
Generate operations can appear at any point in the graph: as root nodes that initiate reasoning from raw problem parameters, as intermediate nodes that refine or decompose existing thoughts, or as leaf-adjacent nodes that produce final candidate solutions.
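The steps above can be sketched as a single function. This is a minimal, illustrative sketch, not the library's actual implementation; the collaborator objects (prompter, parser, lm) and their method names follow the interfaces described in this article:

```python
# Sketch of the generate step: prompt construction, LM calls, parsing,
# and state merging. Names are illustrative, not the library's real API.

def generate(prompter, parser, lm, previous_thoughts,
             num_branches_prompt=1, num_branches_response=1):
    """Produce new thought states from predecessor states."""
    new_thoughts = []
    for base_state in previous_thoughts:  # each thought is a dict of state
        prompt = prompter.generate_prompt(num_branches_prompt, **base_state)
        # One set of LM calls per base state; each call is an independent
        # response, and each response may contain several solutions when
        # num_branches_prompt > 1.
        responses = lm.query(prompt, num_responses=num_branches_response)
        for new_state in parser.parse_generate_answer(base_state, responses):
            # Merge: keep the original context, overlay generated content.
            new_thoughts.append({**base_state, **new_state})
    return new_thoughts
```

With stub collaborators, one predecessor thought and three response branches yield three new thoughts, each still carrying the original state keys.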
Branching: How Multiple Thoughts Are Produced
The Generate operation controls the breadth of exploration through two parameters:
- num_branches_prompt -- the number of solutions requested within a single prompt. This value is passed to the Prompter, which embeds it in the prompt text (e.g., "Generate 5 different sorted lists"). The LLM returns all requested solutions in one response, and the Parser extracts each one as a separate thought.
- num_branches_response -- the number of separate LLM calls made for each prompt. Each call produces an independent response from the model (leveraging temperature-based randomness). This value is passed as num_responses to the language model's query() method.
The total number of new thoughts produced from a single predecessor thought is:
num_branches_prompt x num_branches_response
When applied across all predecessor thoughts, the total new thoughts generated by the operation is:
previous_thoughts x num_branches_prompt x num_branches_response
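A worked example of this arithmetic, with illustrative numbers:

```python
# Branching arithmetic from the formulas above (numbers are illustrative).
previous_thoughts = 4        # thoughts arriving from predecessor operations
num_branches_prompt = 5      # e.g. "Generate 5 different sorted lists" in one prompt
num_branches_response = 2    # two independent LM calls per prompt

per_thought = num_branches_prompt * num_branches_response  # thoughts per predecessor
total = previous_thoughts * per_thought                    # thoughts for the whole operation
```

So each predecessor thought fans out into 10 new thoughts, and the operation as a whole produces 40.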
Root Generation: Using Problem Parameters
When a Generate operation has no predecessors in the graph (i.e., it is a root node), it uses the kwargs passed to the Controller as the base state. These kwargs represent the initial problem parameters, such as the unsorted list in a sorting task or the documents in a merging task.
Specifically:
- If there are no predecessors and no predecessor operations at all, the kwargs are wrapped in a Thought and used as the base state
- If there are predecessor operations but they produced no thoughts (e.g., an upstream filter removed everything), the Generate operation produces nothing -- it does not fall back to kwargs
This distinction ensures that root Generate operations bootstrap the reasoning process, while downstream Generate operations respect the graph's data flow.
Prompt-Parse Pipeline
The Generate operation delegates prompt construction and response parsing to two collaborator objects:
- Prompter.generate_prompt(num_branches_prompt, **base_state) -- constructs the LLM prompt from the base state and the desired number of branches. The prompt format is domain-specific (sorting prompts differ from document merging prompts).
- Parser.parse_generate_answer(base_state, responses) -- extracts new state dictionaries from the LLM's text responses. Returns a list of dictionaries, each representing one new thought's state update.
The new thought's state is the union of the base state and the parsed update: {**base_state, **new_state}. This ensures that downstream operations have access to both the original context and the newly generated content.
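Concretely, the merge behaves like standard Python dict unpacking, with the parsed update overwriting any overlapping keys (example values are illustrative):

```python
# The parsed update overlays the base state; keys only in the base state
# (e.g. the original input) survive into the new thought.
base_state = {"original": "[3, 1, 2]", "current": ""}
new_state = {"current": "[1, 2, 3]"}

merged = {**base_state, **new_state}
```

Here the new thought keeps the original unsorted input while its current solution is replaced by the generated one.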
Interaction with Other Operations
Generate operations typically appear in these patterns:
- Generate alone (IO pattern) -- a single Generate at the root produces one thought, which is directly evaluated
- Generate + Score + KeepBestN (ToT/GoT pattern) -- Generate produces multiple candidates, Score evaluates them, and KeepBestN selects the best
- Generate + Aggregate (GoT merge pattern) -- multiple Generate branches produce partial results, which are merged by Aggregate
- Generate + Improve (refinement pattern) -- a Generate produces an initial solution, and a subsequent Generate (via Improve) refines it based on feedback
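The Generate + Score + KeepBestN pattern can be sketched with plain functions standing in for the library's operation classes (the function names and scoring rule here are illustrative only):

```python
# Sketch of one ToT/GoT step: generate candidates, score them, keep the
# best N. The generate/score callables stand in for operation classes.

def run_tot_step(generate, score, num_candidates, keep_n):
    thoughts = generate(num_candidates)         # Generate: produce candidates
    scored = [(score(t), t) for t in thoughts]  # Score: evaluate each thought
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:keep_n]]      # KeepBestN: prune the graph
```

In the real graph, each stage is a separate operation node; collapsing them into one function just makes the data flow visible.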
Design Rationale
The two-parameter branching model (prompt branches vs. response branches) provides fine-grained control over exploration:
- Prompt branching is efficient when the LLM can produce multiple distinct outputs in a single call (saving API cost)
- Response branching leverages temperature-based stochasticity to get genuinely diverse outputs across multiple calls
The warning mechanism that fires when more thoughts are produced than expected (num_branches_prompt * num_branches_response * len(previous_thoughts)) helps detect Parser bugs that might create spurious thoughts from a single response.
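A sketch of that sanity check, assuming a hypothetical helper name (the real library's warning wording and placement may differ):

```python
# Sketch of the thought-count sanity check: more thoughts than the bound
# usually means the Parser split one response into spurious thoughts.
import logging

def check_thought_count(new_thoughts, previous_thoughts,
                        num_branches_prompt, num_branches_response):
    expected = (num_branches_prompt * num_branches_response
                * len(previous_thoughts))
    if len(new_thoughts) > expected:
        logging.warning("Generate produced %d thoughts, expected at most %d",
                        len(new_thoughts), expected)
    return len(new_thoughts) <= expected
```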