Principle:Spcl Graph of thoughts Keyword Counting Response Parsing
| Knowledge Sources | |
|---|---|
| Domains | Response_Parsing, Keyword_Counting |
| Related Implementations | Implementation:Spcl_Graph_of_thoughts_KeywordCountingParser |
| Last Updated | 2026-02-14 |
Overview
Domain-specific parsing pattern for extracting keyword frequency dictionaries from LLM text responses.
Description
The Keyword Counting Response Parsing principle defines how raw LLM text output is transformed into structured thought state dictionaries containing JSON frequency counts of country names. The parser must handle multiple output formats -- paragraph-split JSON for decomposition, frequency dictionary JSON for counting, and combined dictionaries for aggregation -- while gracefully recovering from malformed responses.
Core Parsing Strategy: strip_answer_json
A central helper method underpins all parsing operations. It applies a consistent extraction pipeline to any LLM response:
- If the text contains "Output:", strip everything before it.
- Locate the last occurrence of { and } in the remaining text.
- Extract the substring between those positions (inclusive).
- Attempt to parse as JSON. If parsing fails, return "{}" (empty dictionary).
This "last JSON object" strategy ensures robustness when the LLM includes intermediate reasoning, multiple dictionary attempts, or extraneous text before the final answer.
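The pipeline above can be sketched as a standalone function. This is a minimal illustration, not the project's actual code. It assumes the helper returns the JSON text as a string (suggested by the quoted "{}" fallback), that the "Output:" marker itself is dropped along with the text before it, and that the first occurrence of the marker is the one used.

```python
import json


def strip_answer_json(text: str) -> str:
    """Return the last {...} span of text if it is valid JSON, else "{}"."""
    text = text.strip()
    # Step 1: drop everything up to the "Output:" marker, if present.
    # (Dropping the marker itself is an assumption; it does not affect the result.)
    if "Output:" in text:
        text = text[text.index("Output:") + len("Output:"):].strip()
    # Step 2: find the last opening brace and the last closing brace.
    start = text.rfind("{")
    end = text.rfind("}")
    if start == -1 or end == -1 or end < start:
        return "{}"
    # Step 3: keep the substring between them, braces included.
    candidate = text[start:end + 1]
    # Step 4: validate, falling back to an empty dictionary on failure.
    try:
        json.loads(candidate)
    except json.JSONDecodeError:
        return "{}"
    return candidate


response = 'Reasoning: {"Peru": 1}\nOutput: {"Peru": 2, "Chile": 1}'
print(strip_answer_json(response))        # {"Peru": 2, "Chile": 1}
print(strip_answer_json("no json here"))  # {}
```

Because the search starts at the last opening brace, this sketch only suits flat dictionaries: a nested object such as {"a": {"b": 1}} would yield an invalid span and fall back to "{}".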
Parsing by Operation Type
Generate Parsing (parse_generate_answer): Two distinct paths:
- GoT Phase 0 (Split): The LLM returns a JSON object with keys like "Paragraph 1"..."Paragraph 4" (or "Sentence 1"..."Sentence N"). The parser extracts the JSON, iterates over its keys, and creates a new thought state for each paragraph/sentence. Each state gets phase=1, a part identifier (the key), and sub_text (the paragraph/sentence text). The current field is set to an empty string since counting has not yet occurred.
- All other phases: The response is a frequency dictionary. The parser applies strip_answer_json and sets the result as current with phase=2.
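A rough sketch of the two generate paths follows. The state is modelled as a plain dictionary, and the `method` field used to detect a GoT run is a hypothetical name introduced for illustration. Skipping keys that contain neither "Paragraph" nor "Sentence" is also an assumption. The compact `strip_answer_json` is a stand-in for the helper described earlier.

```python
import json
import logging
from typing import Dict, List


def strip_answer_json(text: str) -> str:
    # Compact stand-in: last {...} span if it parses as JSON, else "{}".
    if "Output:" in text:
        text = text[text.index("Output:") + len("Output:"):]
    start, end = text.rfind("{"), text.rfind("}")
    candidate = text[start:end + 1] if 0 <= start < end else ""
    try:
        json.loads(candidate)
        return candidate
    except json.JSONDecodeError:
        return "{}"


def parse_generate_answer(state: Dict, texts: List[str]) -> List[Dict]:
    new_states = []
    for text in texts:
        try:
            answer = strip_answer_json(text)
            # "method" is a hypothetical field marking a GoT run.
            if state.get("method") == "got" and state.get("phase") == 0:
                # Split path: one new state per paragraph/sentence key.
                for key, value in json.loads(answer).items():
                    if "Paragraph" not in key and "Sentence" not in key:
                        logging.warning("Unexpected key in split response: %s", key)
                        continue  # assumption: unexpected keys are skipped
                    new_states.append(
                        {**state, "phase": 1, "part": key, "sub_text": value, "current": ""}
                    )
            else:
                # Count path: the response is the frequency dictionary itself.
                new_states.append({**state, "current": answer, "phase": 2})
        except Exception as exc:
            # A failed response is logged and contributes no new states.
            logging.error("Could not parse generate answer %r: %s", text, exc)
    return new_states
```

Under these assumptions, a split response with two "Paragraph" keys yields two phase-1 states, while a counting response yields a single phase-2 state whose `current` holds the extracted JSON string.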
Aggregation Parsing (parse_aggregation_answer): Processes the result of merging two frequency dictionaries:
- Extracts the JSON dictionary from the response.
- Concatenates sub_text fields from both input states (for sub-passage tracking).
- Preserves the pre-aggregation dictionaries in aggr1 and aggr2 fields, which the improve prompt needs for validation.
- Handles edge cases where 0 or 1 input states exist by substituting empty dictionaries.
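The aggregation steps above might look roughly like the following. This is a sketch under stated assumptions: states are plain dictionaries, sub_text values are joined without a separator, and the empty stand-in state used for missing inputs is hypothetical. The compact `strip_answer_json` again stands in for the helper described earlier.

```python
import json
from typing import Dict, List


def strip_answer_json(text: str) -> str:
    # Compact stand-in: last {...} span if it parses as JSON, else "{}".
    if "Output:" in text:
        text = text[text.index("Output:") + len("Output:"):]
    start, end = text.rfind("{"), text.rfind("}")
    candidate = text[start:end + 1] if 0 <= start < end else ""
    try:
        json.loads(candidate)
        return candidate
    except json.JSONDecodeError:
        return "{}"


def parse_aggregation_answer(states: List[Dict], texts: List[str]) -> List[Dict]:
    # Edge case: with 0 or 1 input states, pad with empty stand-ins.
    empty = {"current": "{}", "sub_text": ""}
    first, second = (states + [empty, empty])[:2]
    new_states = []
    for text in texts:
        new_state = dict(first)  # phase and other fields come from the first input
        new_state["sub_text"] = first.get("sub_text", "") + second.get("sub_text", "")
        new_state["current"] = strip_answer_json(text)
        # Keep the pre-aggregation dictionaries for the improve prompt.
        new_state["aggr1"] = first.get("current", "{}")
        new_state["aggr2"] = second.get("current", "{}")
        new_states.append(new_state)
    return new_states
```

Copying the first input state rather than building a fresh one is what keeps the phase unchanged across aggregation in this sketch.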
Improve Parsing (parse_improve_answer): Parses the corrected dictionary from a validation-and-improve response. Asserts exactly one response text, extracts JSON, and returns an updated state.
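As a minimal sketch of the improve path, assuming the state is a plain dictionary and the corrected dictionary is stored in `current` (the text says only that an updated state is returned):

```python
import json
from typing import Dict, List


def strip_answer_json(text: str) -> str:
    # Compact stand-in: last {...} span if it parses as JSON, else "{}".
    if "Output:" in text:
        text = text[text.index("Output:") + len("Output:"):]
    start, end = text.rfind("{"), text.rfind("}")
    candidate = text[start:end + 1] if 0 <= start < end else ""
    try:
        json.loads(candidate)
        return candidate
    except json.JSONDecodeError:
        return "{}"


def parse_improve_answer(state: Dict, texts: List[str]) -> Dict:
    # The improve step is expected to produce exactly one response.
    assert len(texts) == 1, "expected exactly one response text"
    new_state = dict(state)  # leave the input state untouched
    new_state["current"] = strip_answer_json(texts[0])
    return new_state
```

Unlike the generate and aggregation sketches, this one returns a single state rather than a list, since only one response is accepted.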
Error Handling
- All JSON extraction uses the "last braces" strategy to skip intermediate text.
- Failed JSON parsing falls back to "{}" (empty dictionary) rather than raising exceptions.
- Exceptions during generate answer parsing are caught and logged, resulting in no new states for that response.
- Warnings: split responses whose keys lack "Paragraph" or "Sentence" are logged at warning level, but processing continues.
Phase Transitions
- Split responses (phase 0) produce states with phase=1 and populated sub_text.
- Count responses (phase 1 and non-GoT) produce states with phase=2 and populated current.
- Aggregation does not change phase -- the aggregated state inherits from the first input state.
Related Pages
- Implementation:Spcl_Graph_of_thoughts_KeywordCountingParser -- Concrete Python class implementing this principle
- Principle:Spcl_Graph_of_thoughts_Keyword_Counting_Prompt_Design -- Companion prompt design principle
- Workflow:Spcl_Graph_of_thoughts_GoT_Keyword_Counting_Pipeline -- End-to-end workflow using this parsing