Overview
DocMergePrompter is a domain-specific prompter that generates prompt strings for LLM-based merging of NDA documents, including prompts for quality scoring. It is the only prompter in the GoT framework that implements LLM-based scoring via the score_prompt method. The class subclasses the abstract Prompter base class and defines 15 class-level prompt template strings, listed under Class Attribute Prompt Templates. It is defined in the document merge example file, examples/doc_merge/doc_merge.py.
Description
The class manages prompt templates for four tasks: document merging, improvement, aggregation, and LLM-based scoring. The sorting and keyword counting prompters follow the typical original/current pattern in generate_prompt. This class instead accepts documents (a list of strings), parts (a set of document indices), and current (the intermediate merged document).
Class Attribute Prompt Templates
| Attribute | Purpose |
|---|---|
| merge_doc_prompt_start | Header for direct merge prompt (IO/ToT initial/GoT) |
| merge_doc_prompt_block | Per-document block with <DocN> tags |
| merge_doc_prompt_cot_start | CoT header with step-by-step approach |
| improve_summary_prompt_start | Header for improvement prompt |
| improve_summary_prompt_block | Per-document block for improvement context |
| improve_summary_prompt_end | Summary NDA block within <S> tags |
| score_prompt_base | Scoring instructions with Redundancy/Retained scale definitions |
| score_prompt_block | Per-document block for scoring context |
| score_prompt_end | Summary NDA to be scored within <S> tags |
| aggregate_full_prompt_base | Header for full aggregation (originals + summaries visible) |
| aggregate_full_prompt_block1 | Per-original-document block |
| aggregate_full_prompt_mid | Transition between originals and summaries |
| aggregate_full_prompt_block2 | Per-summary block within <SN> tags |
| aggregate_sub_prompt_base | Header for subpart aggregation (summaries only) |
| aggregate_sub_prompt_generate | Per-summary NDA block |
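The start/block naming suggests that each prompt is assembled by concatenating one header with one formatted block per document. A minimal sketch of that pattern follows, assuming str.format-style placeholders. The template wording and the build_merge_prompt helper are hypothetical stand-ins, not the actual strings or methods of the class.

```python
from typing import List

# Hypothetical stand-ins for merge_doc_prompt_start / merge_doc_prompt_block.
# The real template wording in DocMergePrompter is different.
MERGE_START = "Merge the following {num} NDA documents <Doc1> - <Doc{num}> into a single NDA.\n"
MERGE_BLOCK = "<Doc{num}>\n{document}\n</Doc{num}>\n"


def build_merge_prompt(documents: List[str]) -> str:
    """Concatenate one header with one tagged block per document."""
    prompt = MERGE_START.format(num=len(documents))
    # Documents are numbered from 1 so the tags read <Doc1>, <Doc2>, ...
    for i, doc in enumerate(documents, start=1):
        prompt += MERGE_BLOCK.format(num=i, document=doc)
    return prompt


print(build_merge_prompt(["NDA text 1...", "NDA text 2..."]))
```

Keeping the header and the per-document block as separate templates lets the same block be reused for any number of documents.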
Code Reference
Key Methods
from typing import Dict, List, Set

from graph_of_thoughts import prompter


class DocMergePrompter(prompter.Prompter):
    def generate_prompt(
        self,
        num_branches: int,
        documents: List[str],
        method: str,
        parts: Set[str],
        current: str,
        **kwargs,
    ) -> str:
        """
        Routes to merge or improve prompt templates based on method and current state.
        - IO: merge_doc_prompt_start + merge_doc_prompt_block for all documents
        - CoT: merge_doc_prompt_cot_start + merge_doc_prompt_block for all documents
        - ToT: merge prompt (if current is empty) or improve prompt (if current exists)
        - GoT: merge prompt for documents selected by 'parts' subset (if current is empty)
          or improve prompt for the 'parts' subset (if current exists)
        The 'parts' parameter selects which document indices to include.
        """

    def aggregation_prompt(self, state_dicts: List[Dict], **kwargs) -> str:
        """
        Two aggregation modes:
        1. Subpart aggregation (parts is non-empty and smaller than total documents):
           Uses aggregate_sub_prompt_base + aggregate_sub_prompt_generate.
           Shows only the summary NDAs, not the originals.
        2. Full aggregation (parts covers all documents):
           Uses aggregate_full_prompt_base + blocks showing originals + mid + summary blocks.
        """

    def score_prompt(self, state_dicts: List[Dict], **kwargs) -> str:
        """
        The LLM-based scoring prompt. Asks the LLM to evaluate:
        - Redundancy (1-10) in <Redundancy> tags
        - Retained Information (1-10) in <Retained> tags
        Presents the relevant original documents and the merged NDA <S>.
        Asserts exactly 1 state for individual scoring.
        Uses 'parts' to select which original documents to show.
        """

    def improve_prompt(self, **kwargs) -> str:
        """Not implemented (returns None)."""

    def validation_prompt(self, **kwargs) -> str:
        """Not implemented (returns None)."""
Instantiation
# From examples/doc_merge/doc_merge.py, line 725
executor = controller.Controller(
    lm,
    operations_graph,
    DocMergePrompter(),
    DocMergeParser(),
    {
        "documents": [data[2], data[3], data[4], data[5]],
        "parts": set(),
        "current": "",
        "method": method.__name__,
    },
)
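The method value placed in the initial state above determines which template family generate_prompt uses, as described in its docstring under Key Methods. The decision can be sketched as a small function. This is a sketch under assumptions: select_prompt_kind and its return labels are hypothetical, and prefix matching on the method name (so that "got2" routes like "got") is assumed, not confirmed behavior.

```python
def select_prompt_kind(method: str, current: str) -> str:
    """Return the template family the generate_prompt docstring describes.

    Hypothetical helper: the labels "merge", "merge_cot" and "improve"
    are illustrative and do not exist in the framework.
    """
    if method.startswith("io"):
        return "merge"
    if method.startswith("cot"):
        return "merge_cot"
    if method.startswith("tot") or method.startswith("got"):
        # An empty intermediate document means nothing has been merged yet.
        return "merge" if not current else "improve"
    raise ValueError(f"unknown method: {method}")


for method, current in [("io", ""), ("cot", ""), ("tot", ""), ("got", "Merged NDA...")]:
    print(method, "->", select_prompt_kind(method, current))
```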
I/O Contract
Input
| Parameter | Type | Description |
|---|---|---|
| num_branches | int | Number of LLM responses to request |
| documents | List[str] | The list of NDA document texts to merge |
| method | str | Reasoning method: "io", "cot", "tot", "got", "got2" |
| parts | Set[str] | Indices of the documents to include; an empty set means all documents. Annotated as Set[str] in the signature, although the state dictionary and the examples on this page use integer indices |
| current | str | Intermediate merged NDA text; empty initially |
Output
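All implemented methods return a str containing the formatted prompt; improve_prompt and validation_prompt are not implemented and return None. Merge and improve prompts instruct the LLM to place its output between <Merged> and </Merged> tags. Score prompts instruct it to place the two scores inside <Redundancy> and <Retained> tags.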
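To make the tag contract concrete, the sketch below pulls the two scores out of a sample response with a regular expression. In the framework this job belongs to the parser (DocMergeParser). The extract_tag_score helper and the sample response are hypothetical, written only to illustrate the tag format.

```python
import re
from typing import Optional


def extract_tag_score(response: str, tag: str) -> Optional[float]:
    """Return the number found inside <tag>...</tag>, or None if it is absent."""
    match = re.search(rf"<{tag}>\s*(\d+(?:\.\d+)?)\s*</{tag}>", response)
    return float(match.group(1)) if match else None


# Hypothetical LLM response following the score prompt's output format.
response = "Some reasoning...\n<Redundancy>3</Redundancy>\n<Retained>8</Retained>"
redundancy = extract_tag_score(response, "Redundancy")
retained = extract_tag_score(response, "Retained")
print(redundancy, retained)
```

Returning None for a missing tag leaves it to the caller to decide how to handle a malformed response.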
State Dictionary Keys
| Key | Type | Description |
|---|---|---|
| documents | List[str] | All original NDA document texts |
| parts | Set[int] | Indices of documents included in this merge (empty = all) |
| current | str | Current merged NDA text |
| method | str | Reasoning approach identifier |
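The "empty = all" convention for parts can be sketched as a small selection helper. This is a sketch assuming zero-based integer indices, as in the usage examples on this page; select_documents is a hypothetical name and not part of the class.

```python
from typing import List, Set


def select_documents(documents: List[str], parts: Set[int]) -> List[str]:
    """Return the documents selected by `parts`; an empty set selects all."""
    # Sort the indices so the selected documents keep their original order.
    indices = sorted(parts) if parts else range(len(documents))
    return [documents[i] for i in indices]


docs = ["NDA 1", "NDA 2", "NDA 3", "NDA 4"]
print(select_documents(docs, set()))
print(select_documents(docs, {2, 3}))
```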
Usage Examples
IO Merge Prompt
prompter = DocMergePrompter()
prompt = prompter.generate_prompt(
    num_branches=1,
    documents=["NDA text 1...", "NDA text 2...", "NDA text 3...", "NDA text 4..."],
    method="io",
    parts=set(),
    current="",
)
# Returns merge prompt with all 4 documents tagged as <Doc1>-<Doc4>
LLM-Based Score Prompt
prompter = DocMergePrompter()
prompt = prompter.score_prompt([
    {
        "documents": ["NDA 1...", "NDA 2...", "NDA 3...", "NDA 4..."],
        "parts": set(),
        "current": "Merged NDA text...",
    }
])
# Returns scoring prompt asking LLM to rate Redundancy and Retained Information (1-10)
# with scores in <Redundancy> and <Retained> XML tags
GoT2 Subpart Aggregation Prompt
prompter = DocMergePrompter()
prompt = prompter.aggregation_prompt([
    {"documents": ["NDA 1", "NDA 2", "NDA 3", "NDA 4"], "parts": {0, 1}, "current": "Merged NDA of docs 1-2..."},
    {"documents": ["NDA 1", "NDA 2", "NDA 3", "NDA 4"], "parts": {2, 3}, "current": "Merged NDA of docs 3-4..."},
])
# Returns aggregate_sub_prompt showing only the two summary NDAs for combination
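The choice between the two aggregation modes described under Key Methods can be sketched as follows. This is a sketch under assumptions: it inspects only the first state dictionary, and aggregation_mode is a hypothetical helper, not part of DocMergePrompter.

```python
from typing import Dict, List


def aggregation_mode(state_dicts: List[Dict]) -> str:
    """Return "subpart" or "full" following the two documented modes."""
    first = state_dicts[0]  # assumption: the first state decides the mode
    parts = first["parts"]
    # Subpart mode: parts is non-empty and covers fewer than all documents.
    if 0 < len(parts) < len(first["documents"]):
        return "subpart"
    # An empty set means all documents, so it falls through to full mode.
    return "full"


docs = ["NDA 1", "NDA 2", "NDA 3", "NDA 4"]
states = [
    {"documents": docs, "parts": {0, 1}, "current": "Merged NDA of docs 1-2..."},
    {"documents": docs, "parts": {2, 3}, "current": "Merged NDA of docs 3-4..."},
]
print(aggregation_mode(states))
```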