
Implementation:Spcl Graph of thoughts DocMergePrompter

From Leeroopedia
Knowledge Sources
Domains: Prompt_Engineering, Document_Merging
Source File: examples/doc_merge/doc_merge.py, Lines 20-299
Superclass: graph_of_thoughts.prompter.Prompter (ABC)
Implements Principle: Principle:Spcl_Graph_of_thoughts_Document_Merging_Prompt_Design
Last Updated: 2026-02-14

Overview

DocMergePrompter is a domain-specific prompter that builds the prompt strings used to merge NDA documents with an LLM and to score the merged result. It is the only prompter in the GoT framework that implements LLM-based scoring via the score_prompt method. The class subclasses the abstract Prompter base class and defines 15 class-level prompt template strings (listed under Class Attribute Prompt Templates). It is defined in the document merge example file, examples/doc_merge/doc_merge.py.

Description

The class manages prompt templates for four tasks: merging documents, improving a merged draft, aggregating partial merges, and LLM-based scoring. The sorting and keyword counting prompters build their prompts from the typical original/current pattern; here, generate_prompt instead accepts documents (a list of strings), parts (a set of document indices), and current (the intermediate merged document).

Class Attribute Prompt Templates

Attribute | Purpose
merge_doc_prompt_start | Header for direct merge prompt (IO/ToT initial/GoT)
merge_doc_prompt_block | Per-document block with <DocN> tags
merge_doc_prompt_cot_start | CoT header with step-by-step approach
improve_summary_prompt_start | Header for improvement prompt
improve_summary_prompt_block | Per-document block for improvement context
improve_summary_prompt_end | Summary NDA block within <S> tags
score_prompt_base | Scoring instructions with Redundancy/Retained scale definitions
score_prompt_block | Per-document block for scoring context
score_prompt_end | Summary NDA to be scored, within <S> tags
aggregate_full_prompt_base | Header for full aggregation (originals + summaries visible)
aggregate_full_prompt_block1 | Per-original-document block
aggregate_full_prompt_mid | Transition between originals and summaries
aggregate_full_prompt_block2 | Per-summary block within <SN> tags
aggregate_sub_prompt_base | Header for subpart aggregation (summaries only)
aggregate_sub_prompt_generate | Per-summary NDA block
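
The start/block naming suggests that each prompt is built by formatting one header template and then appending one formatted block per document. The following is a minimal sketch of that pattern, not the class's actual code: MERGE_START, MERGE_BLOCK, and build_merge_prompt are hypothetical names, and the template wording is invented for illustration.

```python
from typing import List

# Hypothetical stand-in templates; the real strings in doc_merge.py differ.
MERGE_START = "Merge the following {num} NDA documents <Doc1> - <Doc{num}> into a single NDA.\n"
MERGE_BLOCK = "<Doc{num}>\n{document}\n</Doc{num}>\n"


def build_merge_prompt(documents: List[str]) -> str:
    """Concatenate one header with one tagged block per document."""
    prompt = MERGE_START.format(num=len(documents))
    # Document tags are numbered from 1, matching the <DocN> convention.
    for i, document in enumerate(documents, start=1):
        prompt += MERGE_BLOCK.format(num=i, document=document)
    return prompt


prompt = build_merge_prompt(["NDA text 1...", "NDA text 2..."])
```

Keeping header and block as separate templates lets the same block template serve any number of documents.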

Code Reference

Key Methods

from typing import Dict, List, Set

from graph_of_thoughts import prompter


class DocMergePrompter(prompter.Prompter):
    def generate_prompt(
        self,
        num_branches: int,
        documents: List[str],
        method: str,
        parts: Set[str],
        current: str,
        **kwargs,
    ) -> str:
        """
        Routes to merge or improve prompt templates based on method and current state.
        - IO: merge_doc_prompt_start + merge_doc_prompt_block for all documents
        - CoT: merge_doc_prompt_cot_start + merge_doc_prompt_block for all documents
        - ToT: merge prompt (if current is empty) or improve prompt (if current exists)
        - GoT: merge prompt for documents selected by 'parts' subset (if current is empty)
               or improve prompt for the 'parts' subset (if current exists)
        The 'parts' parameter selects which document indices to include.
        """

    def aggregation_prompt(self, state_dicts: List[Dict], **kwargs) -> str:
        """
        Two aggregation modes:
        1. Subpart aggregation (parts is non-empty and smaller than total documents):
           Uses aggregate_sub_prompt_base + aggregate_sub_prompt_generate.
           Shows only the summary NDAs, not the originals.
        2. Full aggregation (parts covers all documents):
           Uses aggregate_full_prompt_base + blocks showing originals + mid + summary blocks.
        """

    def score_prompt(self, state_dicts: List[Dict], **kwargs) -> str:
        """
        The only LLM-based scoring prompt in the framework. Asks the LLM to evaluate:
        - Redundancy (1-10) in <Redundancy> tags
        - Retained Information (1-10) in <Retained> tags
        Presents the relevant original documents and the merged NDA <S>.
        Asserts that state_dicts holds exactly one state, since states are scored individually.
        Uses 'parts' to select which original documents to show.
        """

    def improve_prompt(self, **kwargs) -> str:
        """Not implemented (returns None)."""

    def validation_prompt(self, **kwargs) -> str:
        """Not implemented (returns None)."""

Instantiation

# From examples/doc_merge/doc_merge.py, line 725
executor = controller.Controller(
    lm,
    operations_graph,
    DocMergePrompter(),
    DocMergeParser(),
    {
        "documents": [data[2], data[3], data[4], data[5]],
        "parts": set(),
        "current": "",
        "method": method.__name__,
    },
)

I/O Contract

Input

Parameter | Type | Description
num_branches | int | Number of LLM responses to request
documents | List[str] | The NDA document texts to merge
method | str | Reasoning method: "io", "cot", "tot", "got", "got2"
parts | Set[str] | Indices of the documents to include; an empty set means all documents. The signature annotates this as Set[str], but the usage examples pass integer indices
current | str | Intermediate merged NDA text; empty initially
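
The method and current parameters together decide which template family a call uses, as described in the generate_prompt docstring. The sketch below restates that routing as a standalone function. It is an illustration under assumptions: select_template_family and its return labels are hypothetical, the exact-match comparison on method is a simplification, and treating "got2" like "got" is assumed rather than confirmed by this page.

```python
def select_template_family(method: str, current: str) -> str:
    """Return a label for the template family a call would use (hypothetical labels)."""
    has_draft = bool(current)  # a non-empty 'current' means a merged draft exists
    if method == "io":
        return "merge"
    if method == "cot":
        return "merge_cot"
    # Assumption: "got2" follows the same branch as "got".
    if method in ("tot", "got", "got2"):
        return "improve" if has_draft else "merge"
    raise ValueError(f"unknown method: {method!r}")


first_call = select_template_family("tot", "")
later_call = select_template_family("tot", "Merged NDA text...")
```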

Output

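generate_prompt, aggregation_prompt, and score_prompt each return a str containing the formatted prompt; improve_prompt and validation_prompt are not implemented and return None. Merge and improve prompts instruct the LLM to place its output between <Merged> and </Merged> tags. Score prompts instruct it to report the two scores inside <Redundancy> and <Retained> tags.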
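
Because score prompts ask for the ratings inside tags, the caller has to extract the numbers from the LLM response. In the framework that job would fall to the parser passed to the Controller, not to this prompter. The helper below is only a hypothetical sketch of such extraction: extract_tag_score and the sample response are assumptions, not part of doc_merge.py.

```python
import re
from typing import Optional


def extract_tag_score(response: str, tag: str) -> Optional[float]:
    """Return the number inside <tag>...</tag>, or None if absent or malformed."""
    match = re.search(rf"<{tag}>\s*(\d+(?:\.\d+)?)\s*</{tag}>", response)
    return float(match.group(1)) if match else None


# Invented response text, shaped like the output the score prompt requests.
response = "Reasoning...\n<Redundancy>4</Redundancy>\n<Retained>8</Retained>"
redundancy = extract_tag_score(response, "Redundancy")
retained = extract_tag_score(response, "Retained")
```

Returning None for a missing tag makes malformed responses visible to the caller, which can then retry or discard them.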

State Dictionary Keys

Key | Type | Description
documents | List[str] | All original NDA document texts
parts | Set[int] | Indices of documents included in this merge (empty = all)
current | str | Current merged NDA text
method | str | Reasoning approach identifier
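
The "empty = all" convention for parts can be sketched as a small lookup over a state dictionary. select_documents is a hypothetical helper, and the ascending ordering of the selected documents is an assumption; the 0-based indexing follows the usage examples on this page.

```python
from typing import Dict, List


def select_documents(state: Dict) -> List[str]:
    """Return the documents a state refers to (hypothetical helper)."""
    documents = state["documents"]
    parts = state["parts"]
    if not parts:
        # An empty set is the "all documents" convention.
        return list(documents)
    # Assumption: indices are 0-based and used in ascending order.
    return [documents[i] for i in sorted(parts)]


state = {
    "documents": ["NDA 1", "NDA 2", "NDA 3", "NDA 4"],
    "parts": {2, 3},
    "current": "",
    "method": "got",
}
selected = select_documents(state)
```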

Usage Examples

IO Merge Prompt

prompter = DocMergePrompter()
prompt = prompter.generate_prompt(
    num_branches=1,
    documents=["NDA text 1...", "NDA text 2...", "NDA text 3...", "NDA text 4..."],
    method="io",
    parts=set(),
    current="",
)
# Returns merge prompt with all 4 documents tagged as <Doc1>-<Doc4>

LLM-Based Score Prompt

prompter = DocMergePrompter()
prompt = prompter.score_prompt([
    {
        "documents": ["NDA 1...", "NDA 2...", "NDA 3...", "NDA 4..."],
        "parts": set(),
        "current": "Merged NDA text...",
    }
])
# Returns scoring prompt asking LLM to rate Redundancy and Retained Information (1-10)
# with scores in <Redundancy> and <Retained> XML tags

GoT2 Subpart Aggregation Prompt

prompter = DocMergePrompter()
prompt = prompter.aggregation_prompt([
    {"documents": ["NDA 1", "NDA 2", "NDA 3", "NDA 4"], "parts": {0, 1}, "current": "Merged NDA of docs 1-2..."},
    {"documents": ["NDA 1", "NDA 2", "NDA 3", "NDA 4"], "parts": {2, 3}, "current": "Merged NDA of docs 3-4..."},
])
# Returns aggregate_sub_prompt showing only the two summary NDAs for combination
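
The choice between the two aggregation modes follows from the size of parts relative to the number of documents, as the aggregation_prompt docstring describes. A minimal sketch of that decision follows. aggregation_mode and its return labels are hypothetical, and inspecting only one state is an assumption about how the check is made.

```python
from typing import Dict


def aggregation_mode(state: Dict) -> str:
    """Return "sub" for subpart aggregation, otherwise "full" (hypothetical labels)."""
    num_parts = len(state["parts"])
    num_documents = len(state["documents"])
    # Subpart mode: parts is non-empty and smaller than the document count.
    if 0 < num_parts < num_documents:
        return "sub"
    # An empty set or a set covering every document means full aggregation.
    return "full"


docs = ["NDA 1", "NDA 2", "NDA 3", "NDA 4"]
sub_mode = aggregation_mode({"documents": docs, "parts": {0, 1}})
full_mode = aggregation_mode({"documents": docs, "parts": set()})
```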
