Implementation:Diagram of thought Diagram of thought Node Edge Status Regex Extraction

From Leeroopedia
Knowledge Sources
Domains: Parsing, Trace_Extraction, Protocol_Design
Last Updated: 2026-02-14

Overview

Concrete pattern for extracting @node, @edge, and @status typed records from Diagram of Thought (DoT) output using regular expressions. This is a pattern doc -- there is no library to install. Users implement this parser themselves in their own codebase.

Description

Three regular expression patterns extract structured records from the raw DoT text stream:

  • Node pattern (@node id=(\d+) role=(\w+)): Captures the integer node identifier and the role string.
  • Edge pattern (@edge src=(\d+) dst=(\d+) kind=(\w+)): Captures the source node identifier, destination node identifier, and the relationship kind.
  • Status pattern (@status target=(\d+) mark=(\w+)): Captures the target node identifier and the validation mark.

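Each pattern matches exactly one record type. The patterns rely on the fixed field ordering, the single-space separators, and the @-prefixed syntax of the DoT serialization protocol, so ordinary natural language text in the same stream is unlikely to produce false matches. The patterns are not anchored to line boundaries, however, so prose that quotes a record verbatim will still match.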
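To see how far the false-match guarantee goes, the sketch below runs the node pattern over a short stream that mixes prose with one real record and one record quoted inside a sentence. The line-anchored variant `NODE_LINE_PATTERN` is an assumption of this sketch, not something the DoT README defines; it simply requires the record to occupy a whole line.

```python
import re

NODE_PATTERN = r"@node id=(\d+) role=(\w+)"
# Hypothetical stricter variant (not from the source): the record must fill a whole line.
NODE_LINE_PATTERN = re.compile(r"^@node id=(\d+) role=(\w+)\s*$", re.MULTILINE)

text = (
    "I will add a node for the problem statement.\n"
    "@node id=1 role=problem\n"
    "Prose that quotes a record, e.g. @node id=9 role=critic, also matches.\n"
)

loose = re.findall(NODE_PATTERN, text)    # unanchored: matches anywhere in a line
strict = NODE_LINE_PATTERN.findall(text)  # anchored: whole-line records only

print(loose)   # [('1', 'problem'), ('9', 'critic')]
print(strict)  # [('1', 'problem')]
```

Whether the stricter form is appropriate depends on whether the model reliably emits each record on its own line, which should be checked against real traces before adopting it.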

Usage

Apply this extraction pattern after capturing raw DoT output from the LLM and before DAG reconstruction. The parsed records are the input to any downstream step that operates on the reasoning graph structure -- building an adjacency list, checking acyclicity, filtering validated nodes, or rendering a visualization.
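Two of the downstream steps named above, building an adjacency list and checking acyclicity, can be sketched as follows. The helper names `build_adjacency` and `is_acyclic` are hypothetical, and Kahn's algorithm is one common choice for the cycle check rather than anything the DoT README prescribes. Edge tuples are taken in the `(src, dst, kind)` order the patterns produce.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Map each source id to the list of destination ids it points to."""
    adj = defaultdict(list)
    for src, dst, _kind in edges:
        # int() accepts both the digit strings from re.findall and real ints.
        adj[int(src)].append(int(dst))
    return dict(adj)


def is_acyclic(node_ids, adj):
    """Kahn's algorithm: a graph is a DAG iff every node can be removed in topological order."""
    indegree = {n: 0 for n in node_ids}
    for src, dsts in adj.items():
        indegree.setdefault(src, 0)
        for dst in dsts:
            indegree[dst] = indegree.get(dst, 0) + 1
    queue = deque(n for n, d in indegree.items() if d == 0)
    removed = 0
    while queue:
        n = queue.popleft()
        removed += 1
        for dst in adj.get(n, []):
            indegree[dst] -= 1
            if indegree[dst] == 0:
                queue.append(dst)
    return removed == len(indegree)


edges = [("1", "2", "use"), ("2", "3", "critique")]
adj = build_adjacency(edges)
print(adj)                                   # {1: [2], 2: [3]}
print(is_acyclic([1, 2, 3], adj))            # True
print(is_acyclic([1, 2], {1: [2], 2: [1]}))  # False
```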

Code Reference

Source Location

  • Repository: Diagram of Thought
  • File: README.md
  • Lines: L68-72 (typed record format), L116-124 (structural view with parsing)

Signature

import re

NODE_PATTERN   = r"@node id=(\d+) role=(\w+)"
EDGE_PATTERN   = r"@edge src=(\d+) dst=(\d+) kind=(\w+)"
STATUS_PATTERN = r"@status target=(\d+) mark=(\w+)"

def parse_dot_records(raw_output: str):
    """Extract all typed records from raw DoT output text."""
    nodes    = re.findall(NODE_PATTERN, raw_output)
    edges    = re.findall(EDGE_PATTERN, raw_output)
    statuses = re.findall(STATUS_PATTERN, raw_output)
    return nodes, edges, statuses

Import

import re

I/O Contract

Inputs

Name       | Type | Required | Description
raw_output | str  | Yes      | Raw DoT output text containing interleaved natural language and typed records

Outputs

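Name     | Type                       | Description
nodes    | list[tuple[str, str]]      | List of extracted nodes, each as (id, role). Roles: problem, proposer, critic, summarizer
edges    | list[tuple[str, str, str]] | List of extracted edges, each as (src, dst, kind). Kinds: use, critique, refine
statuses | list[tuple[str, str]]      | List of extracted statuses, each as (target, mark). Marks: validated, invalidated

Note: re.findall returns every captured group as a string, so the identifiers in these tuples are digit strings such as "1". Convert them with int() where integer ids are needed, as the complete parser under Usage Examples does.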
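Because the patterns capture role, kind, and mark with `\w+`, any word is accepted at parse time. A minimal sketch of a follow-up check against the values listed above is shown here; it assumes those lists are exhaustive, which the source does not state explicitly, and the function name `find_vocabulary_errors` is hypothetical.

```python
# Vocabularies copied from the Outputs table; treating them as closed sets is an assumption.
ALLOWED_ROLES = {"problem", "proposer", "critic", "summarizer"}
ALLOWED_KINDS = {"use", "critique", "refine"}
ALLOWED_MARKS = {"validated", "invalidated"}


def find_vocabulary_errors(nodes, edges, statuses):
    """Return one message per record whose role, kind, or mark is not in the listed vocabulary."""
    errors = []
    for node_id, role in nodes:
        if role not in ALLOWED_ROLES:
            errors.append(f"node {node_id}: unknown role {role!r}")
    for src, dst, kind in edges:
        if kind not in ALLOWED_KINDS:
            errors.append(f"edge {src}->{dst}: unknown kind {kind!r}")
    for target, mark in statuses:
        if mark not in ALLOWED_MARKS:
            errors.append(f"status {target}: unknown mark {mark!r}")
    return errors


errors = find_vocabulary_errors(
    nodes=[("1", "problem"), ("2", "planner")],
    edges=[("1", "2", "use")],
    statuses=[("2", "valid")],
)
print(errors)  # ["node 2: unknown role 'planner'", "status 2: unknown mark 'valid'"]
```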

Usage Examples

Complete Python Parser

import re
from dataclasses import dataclass


@dataclass
class Node:
    id: int
    role: str  # problem | proposer | critic | summarizer


@dataclass
class Edge:
    src: int
    dst: int
    kind: str  # use | critique | refine


@dataclass
class Status:
    target: int
    mark: str  # validated | invalidated


def parse_dot_records(raw_output: str):
    """Extract all typed records from raw DoT output and return structured objects."""
    nodes = [
        Node(int(m[0]), m[1])
        for m in re.findall(r"@node id=(\d+) role=(\w+)", raw_output)
    ]
    edges = [
        Edge(int(m[0]), int(m[1]), m[2])
        for m in re.findall(r"@edge src=(\d+) dst=(\d+) kind=(\w+)", raw_output)
    ]
    statuses = [
        Status(int(m[0]), m[1])
        for m in re.findall(r"@status target=(\d+) mark=(\w+)", raw_output)
    ]
    return nodes, edges, statuses

Example Invocation

raw = """
<problem>
How many times does the letter "r" appear in the word "strawberry"?
@node id=1 role=problem

<proposer>
Let me step through each letter: s-t-r-a-w-b-e-r-r-y. I count 3 occurrences of "r".
@node id=2 role=proposer
@edge src=1 dst=2 kind=use

<critic>
Checking the count: position 3 is "r", position 8 is "r", position 9 is "r". Confirmed 3.
@node id=3 role=critic
@edge src=2 dst=3 kind=critique
@status target=2 mark=validated
"""

nodes, edges, statuses = parse_dot_records(raw)

# nodes:    [Node(id=1, role='problem'), Node(id=2, role='proposer'), Node(id=3, role='critic')]
# edges:    [Edge(src=1, dst=2, kind='use'), Edge(src=2, dst=3, kind='critique')]
# statuses: [Status(target=2, mark='validated')]
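Once records are parsed, two quick sanity checks are often useful before DAG reconstruction: collecting the validated node ids, and spotting ids that an edge or status refers to without a matching @node record. The sketch below works on plain tuples so that it is self-contained; with the dataclass parser above one would pass `(s.target, s.mark)` and `(e.src, e.dst, e.kind)` tuples instead. Both helper names are hypothetical, and letting a later status override an earlier one for the same target is an assumption, not a rule taken from the source.

```python
def validated_ids(statuses):
    """Ids whose most recent mark is 'validated' (assumption: later records override earlier ones)."""
    latest = {}
    for target, mark in statuses:
        latest[int(target)] = mark
    return {t for t, m in latest.items() if m == "validated"}


def dangling_ids(node_ids, edges, statuses):
    """Sorted ids referenced by an edge or status that no @node record declared."""
    known = {int(n) for n in node_ids}
    referenced = set()
    for src, dst, _kind in edges:
        referenced.update((int(src), int(dst)))
    for target, _mark in statuses:
        referenced.add(int(target))
    return sorted(referenced - known)


# Values mirror the strawberry example above.
edges = [(1, 2, "use"), (2, 3, "critique")]
statuses = [(2, "validated")]

print(validated_ids(statuses))                   # {2}
print(dangling_ids([1, 2, 3], edges, statuses))  # []
print(dangling_ids([1, 2], edges, statuses))     # [3]
```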
