

Principle:Apache Dolphinscheduler Workflow DAG Definition

From Leeroopedia


Knowledge Sources
Domains: Workflow_Orchestration, Data_Modeling
Last Updated: 2026-02-10 00:00 GMT

Overview

A directed acyclic graph (DAG) data model that represents workflows as a set of task definitions connected by dependency edges, enabling visual orchestration and dependency-based execution ordering.

Description

The Workflow DAG Definition principle defines how DolphinScheduler models workflows as directed acyclic graphs persisted in a relational database. A workflow is composed of three entity types: WorkflowDefinition (workflow metadata such as name, version, and project code), TaskDefinition (individual task nodes with type-specific parameters), and WorkflowTaskRelation (edges defining dependencies between tasks). This three-entity model separates workflow structure from task logic, which enables task reuse across workflows and independent versioning of workflows and tasks.

The DAG structure ensures that tasks execute in topological order: a task starts only after all of its upstream tasks have completed, and tasks with no dependency between them may run concurrently. This approach supports complex orchestration patterns including parallel execution, conditional branching, sub-workflows, and task groups.
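To make the ordering concrete, the sketch below groups tasks into "waves" using Kahn's algorithm: every task in a wave has all of its upstream tasks in earlier waves, so the tasks inside one wave could run in parallel. This is an illustration of the idea only, not DolphinScheduler's scheduler code; the function name `execution_waves` and the small integer task codes are hypothetical, and every edge is assumed to reference a known task.

```python
from collections import defaultdict


def execution_waves(task_codes, edges):
    """Group tasks into waves; each wave depends only on earlier waves.

    task_codes: iterable of task identifiers (hypothetical integer codes).
    edges: iterable of (upstream, downstream) pairs.
    """
    downstream = defaultdict(list)
    indegree = {code: 0 for code in task_codes}
    for pre, post in edges:
        downstream[pre].append(post)
        indegree[post] += 1

    # Start with the tasks that have no upstream dependency.
    wave = sorted(code for code, deg in indegree.items() if deg == 0)
    waves = []
    done = 0
    while wave:
        waves.append(wave)
        done += len(wave)
        next_wave = []
        for code in wave:
            for child in downstream[code]:
                indegree[child] -= 1
                if indegree[child] == 0:  # all upstream tasks finished
                    next_wave.append(child)
        wave = sorted(next_wave)

    # Tasks left unscheduled can only be part of a cycle.
    if done != len(indegree):
        raise ValueError("dependency cycle detected; not a DAG")
    return waves


# Diamond shape: 1 fans out to 2 and 3, which both feed 4.
print(execution_waves([1, 2, 3, 4], [(1, 2), (1, 3), (2, 4), (3, 4)]))
# prints [[1], [2, 3], [4]]
```

Flattening the waves in order gives one valid topological order; the grouping additionally shows where parallel execution is possible.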

Usage

Use this principle when defining or modifying workflows through the DolphinScheduler API or UI. Every workflow consists of one WorkflowDefinition and one or more TaskDefinition entities, connected by WorkflowTaskRelation edges that define the execution order.
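The structural requirements above can be checked before a workflow is saved. The following is a minimal sketch of such a check, not DolphinScheduler's own validation logic: `validate_workflow` is a hypothetical helper, relations are modeled as plain `(pre, post)` tuples, and an upstream code of 0 is assumed to mark a task with no upstream dependency. Cycle detection relies on the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter

ROOT = 0  # assumed marker meaning "no upstream task"


def validate_workflow(task_codes, relations):
    """Return a list of problems; an empty list means the shape is valid."""
    problems = []
    known = set(task_codes)
    if not known:
        problems.append("workflow has no tasks")

    # graph maps each task to the set of tasks it must wait for.
    graph = {code: set() for code in known}
    for pre, post in relations:
        if post not in known:
            problems.append(f"unknown downstream task {post}")
            continue
        if pre == ROOT:
            continue  # entry edge, no real upstream task
        if pre not in known:
            problems.append(f"unknown upstream task {pre}")
            continue
        graph[post].add(pre)

    try:
        TopologicalSorter(graph).prepare()  # raises CycleError on a cycle
    except CycleError:
        problems.append("relations contain a cycle")
    return problems


print(validate_workflow([1, 2], [(0, 1), (1, 2)]))  # prints []
print(validate_workflow([1, 2], [(1, 2), (2, 1)]))  # prints ['relations contain a cycle']
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.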

Theoretical Basis

The DAG model applies graph theory to workflow orchestration:

  • Vertices: TaskDefinition entities represent computation units
  • Edges: WorkflowTaskRelation entities define "must run before" dependencies
  • Topological Sort: The execution engine processes tasks in topological order
  • Versioning: Both workflows and tasks are versioned, enabling rollback and audit

// DAG structure (abstract)
WorkflowDefinition:
    code: Long          // globally unique identifier
    name: String        // human-readable name
    version: Integer    // for versioning/rollback

TaskDefinition:
    code: Long          // globally unique identifier
    name: String        // task name
    taskType: String    // SHELL, SQL, PYTHON, SUB_PROCESS, etc.
    taskParams: String  // JSON-encoded type-specific parameters

WorkflowTaskRelation:
    preTaskCode: Long   // upstream task (0 = root)
    postTaskCode: Long  // downstream task
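
As a hedged illustration of how the abstract schema above might be populated, the sketch below mirrors two of the entities as Python dataclasses and derives each task's upstream set from the relation rows. The class and field names are snake_case adaptations of the schema, while `upstream_map`, the task codes, and the JSON parameter keys are hypothetical examples and not taken from DolphinScheduler's code.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskDefinition:
    code: int
    name: str
    task_type: str
    task_params: str  # JSON-encoded string, as in the schema above


@dataclass(frozen=True)
class WorkflowTaskRelation:
    pre_task_code: int   # 0 marks a root (no upstream task)
    post_task_code: int


def upstream_map(tasks, relations):
    """Map each task code to the set of task codes it waits on."""
    upstream = {t.code: set() for t in tasks}
    for rel in relations:
        if rel.pre_task_code != 0:
            upstream[rel.post_task_code].add(rel.pre_task_code)
    return upstream


# Illustrative data only: codes and parameter keys are made up.
tasks = [
    TaskDefinition(101, "extract", "SHELL", json.dumps({"script": "echo extract"})),
    TaskDefinition(102, "load", "SQL", json.dumps({"sql": "SELECT 1"})),
]
relations = [
    WorkflowTaskRelation(0, 101),    # extract is a root task
    WorkflowTaskRelation(101, 102),  # load runs after extract
]

deps = upstream_map(tasks, relations)
roots = sorted(code for code, pre in deps.items() if not pre)
print(deps)   # prints {101: set(), 102: {101}}
print(roots)  # prints [101]
```

Because edges refer to tasks only by code, the same TaskDefinition can appear in the relations of more than one workflow, which is what makes task reuse possible in this model.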

