Principle:Langgenius Dify Workflow Execution Monitoring

Knowledge Sources	Dify, Dify Documentation
Domains	Workflow, Observability, Execution Monitoring
Last Updated	2026-02-08 00:00 GMT

Overview

Workflow execution monitoring is the real-time observability system that tracks running workflows with per-node status updates delivered through streaming events, enabling developers to observe execution progress, diagnose failures, and review historical run data.

Description

When a workflow executes, the system provides comprehensive observability through two complementary mechanisms: real-time streaming events during execution and queryable run history after completion.

SSE Event Streams: During workflow execution, the backend emits Server-Sent Events (SSE) that report the status of the workflow as a whole and each individual node. These events arrive in real time, allowing the visual builder to animate the canvas -- highlighting nodes as they begin execution, showing progress indicators, and marking nodes as succeeded or failed as results come in. This streaming approach is essential for workflows that may take minutes to complete due to LLM calls, external API requests, or iteration over large datasets.
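
As a rough illustration of consuming such a stream, the sketch below splits buffered SSE text into JSON payloads. It assumes each frame carries a JSON object on its `data:` lines, that frames end with a blank line, and that line endings are LF only. `StreamEvent`, `ParseResult`, and `parseSseChunk` are hypothetical names chosen for this example, not part of Dify's API.

```typescript
// Hypothetical payload shape; the field names are assumptions for illustration.
interface StreamEvent {
  event: string;
  data?: Record<string, unknown>;
}

interface ParseResult {
  parsed: StreamEvent[]; // complete frames decoded from the buffer
  rest: string;          // trailing partial frame to prepend to the next chunk
}

// Splits buffered SSE text into JSON payloads (assumes LF line endings).
function parseSseChunk(buffer: string): ParseResult {
  const frames = buffer.split("\n\n");
  // The last element is "" when the buffer ends on a frame boundary,
  // otherwise it is an incomplete frame that must wait for more data.
  const rest = frames.pop() ?? "";
  const parsed: StreamEvent[] = [];
  for (const frame of frames) {
    const dataLines = frame
      .split("\n")
      .filter((line) => line.slice(0, 5) === "data:")
      .map((line) => line.slice(5).replace(/^ /, ""));
    if (dataLines.length === 0) continue; // comment or keep-alive frame
    parsed.push(JSON.parse(dataLines.join("\n")) as StreamEvent);
  }
  return { parsed, rest };
}

const sampleChunk =
  'data: {"event":"workflow_started"}\n\n' +
  'data: {"event":"node_started","data":{"node_id":"a"}}\n\n' +
  'data: {"event":"node_fin';
const parseResult = parseSseChunk(sampleChunk);
// parseResult.parsed holds two events; parseResult.rest holds the partial third frame.
```

Returning the unparsed remainder lets a caller append the next network chunk to it, so a frame split across two reads is never lost or parsed half-way.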

Workflow Status State Machine: The overall workflow tracks its execution through a well-defined set of states:

Waiting -- The workflow has been submitted but has not begun processing
Running -- At least one node is actively executing
Succeeded -- All nodes completed successfully and the final output has been produced
Failed -- A node encountered an unrecoverable error
Stopped -- The execution was manually cancelled by the user

Node Status State Machine: Each node within the workflow tracks its own status independently, with a richer set of states:

NotStart -- The node has not yet been reached in the execution order
Waiting -- The node is queued and waiting for its dependencies to complete
Listening -- The node is waiting for an external event (e.g., a webhook or plugin trigger)
Running -- The node is actively processing
Succeeded -- The node completed without error
Failed -- The node encountered an error
Exception -- The node encountered an unexpected system-level error
Retry -- The node failed and is being retried according to its retry policy
Stopped -- The node was stopped due to workflow cancellation

Execution Tracing: Each node execution produces a NodeTracing record that captures the node's unique execution ID, the node definition ID, final status, wall-clock elapsed time, resolved inputs, produced outputs, and any error information. These traces persist beyond the execution and can be queried from the run history.

Run History: Completed workflow executions are stored and accessible through a run history API. This allows developers to review past executions, compare results across runs, and identify patterns in failures or performance degradation.

Usage

Execution monitoring supports the following activities:

Live debugging: Watching a workflow execute in real time to observe the flow of data through nodes
Failure diagnosis: Examining which node failed, what inputs it received, and what error occurred
Performance analysis: Reviewing elapsed times per node to identify bottlenecks
Regression detection: Comparing run history across workflow versions to spot behavioral changes
Operational monitoring: Tracking the health and success rate of production workflows
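
Two of these activities, operational monitoring and regression detection, reduce to simple aggregations over run data. The sketch below is one possible way to compute them. `RunSummary`, `successRate`, and `slowerNodes` are hypothetical names, and the decision to exclude stopped or still-running executions from the success rate is an assumption of this example.

```typescript
// Minimal hypothetical run summary: only the fields this sketch needs.
interface RunSummary {
  status: string;                      // e.g. "succeeded", "failed", "stopped"
  nodeElapsed: Record<string, number>; // node ID -> elapsed time
}

// Share of succeeded runs among runs that ended in success or failure.
// Stopped and in-progress runs are ignored (an assumption of this sketch).
function successRate(runs: RunSummary[]): number {
  const finished = runs.filter(
    (r) => r.status === "succeeded" || r.status === "failed",
  );
  if (finished.length === 0) return 0;
  const ok = finished.filter((r) => r.status === "succeeded").length;
  return ok / finished.length;
}

// IDs of nodes whose elapsed time grew by more than `factor` between two runs.
// Nodes that exist only in the newer run are skipped.
function slowerNodes(
  before: RunSummary,
  after: RunSummary,
  factor: number,
): string[] {
  return Object.keys(after.nodeElapsed).filter((nodeId) => {
    const old = before.nodeElapsed[nodeId];
    return old !== undefined && after.nodeElapsed[nodeId] > old * factor;
  });
}
```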

Theoretical Basis

Workflow Status State Machine

[Waiting] --> [Running] --> [Succeeded]
                  |
                  +--> [Failed]
                  |
                  +--> [Stopped]  (on stop)

The workflow begins in Waiting when submitted, transitions to Running when the first node begins execution, and ends in one of three terminal states: Succeeded, Failed, or Stopped.
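
The transitions described here can be written down as a lookup table. The following is a minimal sketch based only on the description above. The enum name matches the `WorkflowRunningStatus` type used in the data model, but the lowercase string values and the helper names `canWorkflowTransition` and `isTerminalWorkflowStatus` are assumptions.

```typescript
// Member names follow the states listed in this page;
// the lowercase string values are an assumption.
enum WorkflowRunningStatus {
  Waiting = "waiting",
  Running = "running",
  Succeeded = "succeeded",
  Failed = "failed",
  Stopped = "stopped",
}

// Allowed next states for each state; terminal states map to an empty list.
const WORKFLOW_TRANSITIONS: Record<WorkflowRunningStatus, WorkflowRunningStatus[]> = {
  [WorkflowRunningStatus.Waiting]: [WorkflowRunningStatus.Running],
  [WorkflowRunningStatus.Running]: [
    WorkflowRunningStatus.Succeeded,
    WorkflowRunningStatus.Failed,
    WorkflowRunningStatus.Stopped,
  ],
  [WorkflowRunningStatus.Succeeded]: [],
  [WorkflowRunningStatus.Failed]: [],
  [WorkflowRunningStatus.Stopped]: [],
};

// True when the table allows moving directly from `from` to `to`.
function canWorkflowTransition(
  from: WorkflowRunningStatus,
  to: WorkflowRunningStatus,
): boolean {
  return WORKFLOW_TRANSITIONS[from].indexOf(to) !== -1;
}

// A state is terminal when no transition leaves it.
function isTerminalWorkflowStatus(status: WorkflowRunningStatus): boolean {
  return WORKFLOW_TRANSITIONS[status].length === 0;
}
```

A client could use `isTerminalWorkflowStatus` to decide when to stop listening for further events on a run.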

Node Status State Machine

[NotStart] --> [Waiting] --> [Running] --> [Succeeded]
                   |             |
                   v             +--> [Failed] --> [Retry] --> [Running]
              [Listening]        |                                 |
                   |             +--> [Exception]                  +--> [Failed]
                   v             |
              [Running]     [Stopped]

The node state machine is more complex because nodes can enter a Listening state while waiting for external events and can transition through Retry cycles before ultimately succeeding or failing.
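
To make the retry and listening paths concrete, the sketch below encodes the node diagram as a transition table and checks a recorded status history against it. The enum name matches the `NodeRunningStatus` type used in the data model. The string values, the table itself (read off the diagram above), and the helper `firstIllegalStep` are assumptions for illustration.

```typescript
// Member names follow the node states listed in this page;
// the string values are an assumption.
enum NodeRunningStatus {
  NotStart = "not-start",
  Waiting = "waiting",
  Listening = "listening",
  Running = "running",
  Succeeded = "succeeded",
  Failed = "failed",
  Exception = "exception",
  Retry = "retry",
  Stopped = "stopped",
}

// Transitions read off the diagram above.
const NODE_TRANSITIONS: Record<NodeRunningStatus, NodeRunningStatus[]> = {
  [NodeRunningStatus.NotStart]: [NodeRunningStatus.Waiting],
  [NodeRunningStatus.Waiting]: [NodeRunningStatus.Running, NodeRunningStatus.Listening],
  [NodeRunningStatus.Listening]: [NodeRunningStatus.Running],
  [NodeRunningStatus.Running]: [
    NodeRunningStatus.Succeeded,
    NodeRunningStatus.Failed,
    NodeRunningStatus.Exception,
    NodeRunningStatus.Stopped,
  ],
  [NodeRunningStatus.Failed]: [NodeRunningStatus.Retry], // only when a retry policy applies
  [NodeRunningStatus.Retry]: [NodeRunningStatus.Running],
  [NodeRunningStatus.Succeeded]: [],
  [NodeRunningStatus.Exception]: [],
  [NodeRunningStatus.Stopped]: [],
};

// Returns the index of the first step the table does not allow,
// or -1 when every consecutive pair in the history is a legal transition.
function firstIllegalStep(path: NodeRunningStatus[]): number {
  for (let i = 1; i < path.length; i++) {
    if (NODE_TRANSITIONS[path[i - 1]].indexOf(path[i]) === -1) return i;
  }
  return -1;
}

// A node that fails once, is retried, and then succeeds.
const retryHistory: NodeRunningStatus[] = [
  NodeRunningStatus.NotStart,
  NodeRunningStatus.Waiting,
  NodeRunningStatus.Running,
  NodeRunningStatus.Failed,
  NodeRunningStatus.Retry,
  NodeRunningStatus.Running,
  NodeRunningStatus.Succeeded,
];
const retryHistoryCheck = firstIllegalStep(retryHistory); // -1: every step is legal
```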

SSE Event Flow

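The streaming event protocol delivers events in a predictable sequence: workflow_started opens the run, each node_started is followed by a matching node_finished carrying that node's final status, and workflow_finished closes the run with the overall status: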

workflow_started
    |
    +--> node_started (node A)
    |        |
    |        +--> node_finished (node A, status: succeeded)
    |
    +--> node_started (node B)
    |        |
    |        +--> node_finished (node B, status: succeeded)
    |
    +--> node_started (node C)
    |        |
    |        +--> node_finished (node C, status: failed)
    |
workflow_finished (status: failed)

For iteration and loop nodes, additional events such as iteration_started, iteration_next, loop_started, and loop_next are emitted to track the progress of each cycle within these compound nodes.
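
One way a client might turn this event sequence into canvas state is a small reducer, sketched below. The event shapes are deliberately simplified and hypothetical (real payloads carry more fields, and the iteration and loop events are omitted), and `MonitorEvent`, `CanvasState`, and `applyMonitorEvent` are names invented for this example.

```typescript
// Hypothetical, simplified event shapes; real payloads carry more fields.
type MonitorEvent =
  | { event: "workflow_started" }
  | { event: "node_started"; nodeId: string }
  | { event: "node_finished"; nodeId: string; status: string }
  | { event: "workflow_finished"; status: string };

interface CanvasState {
  workflowStatus: string;
  nodeStatus: Record<string, string>; // node ID -> latest known status
}

// Pure reducer: returns a new state for each incoming event.
function applyMonitorEvent(state: CanvasState, ev: MonitorEvent): CanvasState {
  switch (ev.event) {
    case "workflow_started":
      return { workflowStatus: "running", nodeStatus: {} };
    case "node_started":
      return { ...state, nodeStatus: { ...state.nodeStatus, [ev.nodeId]: "running" } };
    case "node_finished":
      return { ...state, nodeStatus: { ...state.nodeStatus, [ev.nodeId]: ev.status } };
    case "workflow_finished":
      return { ...state, workflowStatus: ev.status };
  }
}

// Replays the sequence from the diagram above: A and B succeed, C fails.
const replayedEvents: MonitorEvent[] = [
  { event: "workflow_started" },
  { event: "node_started", nodeId: "A" },
  { event: "node_finished", nodeId: "A", status: "succeeded" },
  { event: "node_started", nodeId: "B" },
  { event: "node_finished", nodeId: "B", status: "succeeded" },
  { event: "node_started", nodeId: "C" },
  { event: "node_finished", nodeId: "C", status: "failed" },
  { event: "workflow_finished", status: "failed" },
];
const initialCanvas: CanvasState = { workflowStatus: "waiting", nodeStatus: {} };
const finalCanvas = replayedEvents.reduce(applyMonitorEvent, initialCanvas);
```

Keeping the reducer pure means the same function can drive a live canvas and replay a stored event log for debugging.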

Observability Data Model

WorkflowRun
  |-- id: string
  |-- status: WorkflowRunningStatus
  |-- elapsed_time: number
  |-- total_tokens: number
  |-- created_at: number
  |
  +-- NodeTracing[] (one per executed node)
        |-- id: string (unique execution ID)
        |-- node_id: string (definition ID)
        |-- status: NodeRunningStatus
        |-- elapsed_time: number
        |-- inputs: Record<string, any>
        |-- outputs: Record<string, any>
        |-- error: string | null

This hierarchical data model allows monitoring tools to present both the high-level workflow status and drill down into individual node executions for detailed inspection.
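
The drill-down described above might look like the following in client code. This is a sketch under stated assumptions: the interface fields mirror the data model, but the `tracing` property name, the plain-string status types, the helper names `failedTraces` and `slowestTrace`, and every sample value are made up for illustration.

```typescript
// Fields mirror the data model above. The `tracing` property name is an
// assumption; the model only shows NodeTracing[] as a child of WorkflowRun.
interface NodeTracing {
  id: string;       // unique execution ID
  node_id: string;  // definition ID
  status: string;
  elapsed_time: number;
  inputs: Record<string, any>;
  outputs: Record<string, any>;
  error: string | null;
}

interface WorkflowRun {
  id: string;
  status: string;
  elapsed_time: number;
  total_tokens: number;
  created_at: number;
  tracing: NodeTracing[];
}

// Failure diagnosis: traces that ended in "failed" or "exception".
function failedTraces(run: WorkflowRun): NodeTracing[] {
  return run.tracing.filter(
    (t) => t.status === "failed" || t.status === "exception",
  );
}

// Performance analysis: the trace with the largest elapsed time, or null.
function slowestTrace(run: WorkflowRun): NodeTracing | null {
  let slowest: NodeTracing | null = null;
  for (const t of run.tracing) {
    if (slowest === null || t.elapsed_time > slowest.elapsed_time) slowest = t;
  }
  return slowest;
}

// Made-up sample data, for illustration only.
const sampleRun: WorkflowRun = {
  id: "run-1",
  status: "failed",
  elapsed_time: 5.6,
  total_tokens: 812,
  created_at: 1700000000,
  tracing: [
    { id: "e1", node_id: "start", status: "succeeded", elapsed_time: 0.01, inputs: {}, outputs: {}, error: null },
    { id: "e2", node_id: "llm", status: "succeeded", elapsed_time: 4.2, inputs: {}, outputs: {}, error: null },
    { id: "e3", node_id: "http", status: "failed", elapsed_time: 1.3, inputs: {}, outputs: {}, error: "timeout" },
  ],
};
```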

Related Pages

Implemented By

Implementation:Langgenius_Dify_UseWorkflowRunHistory
