
Principle:Langgenius Dify Processing Node Configuration

From Leeroopedia
Knowledge Sources: Dify
Domains: RAG, Pipeline, Frontend
Last Updated: 2026-02-12 00:00 GMT

Overview

Description

Processing Node Configuration is the principle that governs how document transformation and chunking nodes within a Dify RAG pipeline expose their configurable parameters. Processing nodes sit between the datasource ingestion stage and the embedding/indexing stage in the pipeline graph. They are responsible for transforming raw document content into structured chunks suitable for vector storage and retrieval.

The Dify pipeline architecture distinguishes between two categories of processing parameters:

  • Pre-processing parameters -- Configuration for operations that occur before the main document processing, such as content extraction, format conversion, or metadata enrichment. These are fetched from the /pre-processing/parameters endpoint.
  • Processing parameters -- Configuration for the core chunking and transformation logic, such as chunk size, overlap settings, separator rules, and model-specific options. These are fetched from the /processing/parameters endpoint.

Both parameter categories are scoped to a specific node within the pipeline graph, identified by a node_id. This per-node scoping enables different processing nodes in the same pipeline to have independent configurations.
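The per-node endpoint scoping can be sketched as a small path builder. This is a minimal illustration, not the actual Dify client code: the `/pre-processing/parameters` and `/processing/parameters` suffixes come from the description above, while the base path shape and the function name are assumptions.

```typescript
// The two parameter categories described above, as endpoint suffixes.
type ParamStage = "pre-processing" | "processing";

// Hypothetical helper: builds the parameters endpoint path for one
// processing node. Each request is scoped by node_id, so two processing
// nodes in the same pipeline keep independent configurations.
function buildParametersPath(
  pipelineId: string,
  nodeId: string,
  stage: ParamStage,
): string {
  return `/rag/pipelines/${pipelineId}/workflows/draft/nodes/${nodeId}/${stage}/parameters`;
}
```

A caller would invoke it once per category, e.g. `buildParametersPath("p1", "n1", "processing")`, and issue a GET against the result.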

Usage

Processing Node Configuration applies whenever a user:

  • Opens a processing node in the pipeline editor to view or modify its settings.
  • Configures chunking strategies, chunk sizes, overlap rules, or separator patterns.
  • Sets up pre-processing steps such as content cleaning, language detection, or format normalization.
  • Needs to understand what configurable variables a processing node exposes before executing the pipeline.
  • Compares processing parameters between the draft and published versions of a pipeline.

The parameters are returned as typed RAGPipelineVariable arrays, where each variable specifies its type (text input, number, select, file, etc.), label, default value, validation constraints, and the node it belongs to.
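A possible shape for such a variable array is sketched below. Only `belong_to_node_id` is named in this article; every other field name, and the helper that filters variables by owning node, is an assumption made for illustration.

```typescript
// Sketched shape of a RAGPipelineVariable, inferred from the description:
// a type tag, a label, a default, validation constraints, and the owning node.
interface RAGPipelineVariable {
  variable: string;
  label: string;
  type: "text-input" | "number" | "select" | "file";
  default_value?: string | number;
  required: boolean;
  belong_to_node_id: string;   // named in the article; scopes the variable
  options?: string[];          // choices for select variables
  max_length?: number;         // example validation constraint
}

// Keep only the variables owned by one node in the pipeline graph.
function variablesForNode(
  vars: RAGPipelineVariable[],
  nodeId: string,
): RAGPipelineVariable[] {
  return vars.filter((v) => v.belong_to_node_id === nodeId);
}
```

Filtering by `belong_to_node_id` is what lets a single parameters response serve an editor that renders one node's form at a time.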

Theoretical Basis

Processing node configuration follows the Dynamic Form Generation pattern, where the backend defines the schema of configurable parameters and the frontend renders appropriate input controls based on the variable types. This approach provides several benefits:

  • Backend-Driven UI -- The processing node's configurable surface area is defined by the backend, ensuring consistency between what the API accepts and what the UI presents.
  • Type Safety -- Each variable carries a PipelineInputVarType that maps to a specific form field type via VAR_TYPE_MAP, preventing type mismatches.
  • Node Isolation -- The belong_to_node_id field on each variable ensures parameters are correctly scoped to their owning node, supporting complex pipelines with multiple processing stages.
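The type-safety point can be made concrete with a `VAR_TYPE_MAP`-style lookup. The enum members and control names below are placeholders, not the actual Dify frontend values; the point is that an exhaustive `Record` forces every backend variable type to resolve to exactly one form control.

```typescript
// Illustrative backend variable types (placeholder values).
enum PipelineInputVarType {
  TextInput = "text-input",
  Number = "number",
  Select = "select",
  File = "file",
}

// A VAR_TYPE_MAP-style lookup: Record<...> makes the mapping exhaustive,
// so adding a new enum member without a control is a compile-time error.
const VAR_TYPE_MAP: Record<PipelineInputVarType, string> = {
  [PipelineInputVarType.TextInput]: "TextField",
  [PipelineInputVarType.Number]: "NumberField",
  [PipelineInputVarType.Select]: "SelectField",
  [PipelineInputVarType.File]: "FileUploadField",
};

function controlFor(type: PipelineInputVarType): string {
  return VAR_TYPE_MAP[type];
}
```

This is the mechanism behind the backend-driven UI: the frontend never guesses a control, it derives one from the declared variable type.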

The draft/published distinction in parameter endpoints supports the Environment Promotion pattern: users configure parameters against a draft workflow and promote to published only after validation. The staleTime: 0 configuration ensures that parameter queries always reflect the latest node configuration, which is critical during iterative editing sessions.
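Both ideas can be combined in a query-options sketch. The key shape and option names follow common TanStack Query conventions; only `staleTime: 0` is stated in this article, and the rest is an assumed illustration rather than the Dify frontend's real code.

```typescript
// Draft configurations are edited; published ones are promoted after validation.
type WorkflowVersion = "draft" | "published";

// Hypothetical query-options factory for a node's processing parameters.
function parametersQueryOptions(nodeId: string, version: WorkflowVersion) {
  return {
    // Keying on version keeps draft and published caches separate.
    queryKey: ["processing-parameters", version, nodeId] as const,
    // staleTime: 0 marks data stale immediately, so re-observing the query
    // refetches and the editor always shows the latest node configuration.
    staleTime: 0,
  };
}
```

With `staleTime: 0`, reopening a node's settings panel during an editing session always triggers a refetch instead of serving a possibly outdated cache entry.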
