Principle:Langgenius Dify Processing Node Configuration
| Knowledge Sources | Dify |
|---|---|
| Domains | RAG, Pipeline, Frontend |
| Last Updated | 2026-02-12 00:00 GMT |
Overview
Description
Processing Node Configuration is the principle that governs how document transformation and chunking nodes within a Dify RAG pipeline expose their configurable parameters. Processing nodes sit between the datasource ingestion stage and the embedding/indexing stage in the pipeline graph. They are responsible for transforming raw document content into structured chunks suitable for vector storage and retrieval.
The Dify pipeline architecture distinguishes between two categories of processing parameters:
- Pre-processing parameters -- Configuration for operations that occur before the main document processing, such as content extraction, format conversion, or metadata enrichment. These are fetched from the /pre-processing/parameters endpoint.
- Processing parameters -- Configuration for the core chunking and transformation logic, such as chunk size, overlap settings, separator rules, and model-specific options. These are fetched from the /processing/parameters endpoint.
Both parameter categories are scoped to a specific node within the pipeline graph, identified by a node_id. This per-node scoping enables different processing nodes in the same pipeline to have independent configurations.
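As a minimal sketch of this per-node scoping, the helper below builds a request URL for either parameter category. Only the two endpoint suffixes and the node_id name come from the text above; the base path, the pipeline identifier, and passing node_id as a query parameter are assumptions made for illustration.

```typescript
// Hypothetical sketch: only the endpoint suffixes and `node_id` come from the text.
type ParameterStage = "pre-processing" | "processing";

// Assumed URL prefix; the real route layout is not specified in this document.
const HYPOTHETICAL_BASE = "/rag/pipelines";

function buildParametersUrl(
  pipelineId: string,
  stage: ParameterStage,
  nodeId: string,
): string {
  // Assumption: the node is selected through a `node_id` query parameter.
  const query = `node_id=${encodeURIComponent(nodeId)}`;
  return `${HYPOTHETICAL_BASE}/${encodeURIComponent(pipelineId)}/${stage}/parameters?${query}`;
}

// Two processing nodes in the same pipeline resolve to independent requests.
const preUrl = buildParametersUrl("pipeline-1", "pre-processing", "chunker-a");
const procUrl = buildParametersUrl("pipeline-1", "processing", "chunker-b");
```

Because the node identifier is part of every request, two nodes in one pipeline never share a parameter set.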
Usage
Processing Node Configuration applies whenever a user:
- Opens a processing node in the pipeline editor to view or modify its settings.
- Configures chunking strategies, chunk sizes, overlap rules, or separator patterns.
- Sets up pre-processing steps such as content cleaning, language detection, or format normalization.
- Needs to understand what configurable variables a processing node exposes before executing the pipeline.
- Compares processing parameters between the draft and published versions of a pipeline.
The parameters are returned as arrays of typed RAGPipelineVariable objects, where each variable specifies its type (text input, number, select, file, etc.), label, default value, validation constraints, and the node it belongs to.
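The shape of such a variable might look like the sketch below. Apart from belong_to_node_id, which the text names later, every field name and type literal here is a hypothetical stand-in for the properties listed above (type, label, default value, validation constraints), not the actual Dify definition.

```typescript
// Hypothetical type literals; the real PipelineInputVarType values may differ.
type SketchVarType = "text-input" | "number" | "select" | "file";

// Illustrative shape only: field names other than belong_to_node_id are assumed.
interface RAGPipelineVariableSketch {
  belong_to_node_id: string;
  type: SketchVarType;
  label: string;
  variable: string;
  default_value?: string | number;
  required: boolean;
  max_length?: number; // assumed constraint for text inputs
  options?: string[]; // assumed choices for select inputs
}

// Returns an error message, or null when the value satisfies the variable.
function validateValue(v: RAGPipelineVariableSketch, value: unknown): string | null {
  if (value === undefined || value === null || value === "") {
    return v.required ? `${v.label} is required` : null;
  }
  switch (v.type) {
    case "number":
      return typeof value === "number" && isFinite(value)
        ? null
        : `${v.label} must be a number`;
    case "text-input":
      if (typeof value !== "string") return `${v.label} must be text`;
      if (v.max_length !== undefined && value.length > v.max_length) {
        return `${v.label} is too long`;
      }
      return null;
    case "select":
      return typeof value === "string" && (v.options ?? []).indexOf(value) !== -1
        ? null
        : `${v.label} must be one of the listed options`;
    case "file":
      return null; // file validation is out of scope for this sketch
  }
}

const chunkSize: RAGPipelineVariableSketch = {
  belong_to_node_id: "chunker-a",
  type: "number",
  label: "Chunk size",
  variable: "chunk_size",
  default_value: 512,
  required: true,
};
```

Keeping the constraints on the variable itself lets one generic validator serve every processing node.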
Theoretical Basis
Processing node configuration follows the Dynamic Form Generation pattern, where the backend defines the schema of configurable parameters and the frontend renders appropriate input controls based on the variable types. This approach provides several benefits:
- Backend-Driven UI -- The processing node's configurable surface area is defined by the backend, ensuring consistency between what the API accepts and what the UI presents.
- Type Safety -- Each variable carries a PipelineInputVarType that maps to a specific form field type via VAR_TYPE_MAP, preventing type mismatches.
- Node Isolation -- The belong_to_node_id field on each variable ensures parameters are correctly scoped to their owning node, supporting complex pipelines with multiple processing stages.
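The three benefits above can be sketched together: a lookup table turns each variable type into a form field kind, and a filter on belong_to_node_id keeps only the variables owned by the node being edited. The map contents, the type literals, and the field kind names below are hypothetical; only the names VAR_TYPE_MAP, PipelineInputVarType, and belong_to_node_id appear in the text.

```typescript
// Hypothetical stand-ins for PipelineInputVarType and the form field kinds.
type InputVarTypeSketch = "text-input" | "number" | "select" | "file";
type FormFieldKind = "textInput" | "numberInput" | "dropdown" | "fileUploader";

// Sketch of a VAR_TYPE_MAP-style table; Record forces an entry for every type.
const VAR_TYPE_MAP_SKETCH: Record<InputVarTypeSketch, FormFieldKind> = {
  "text-input": "textInput",
  number: "numberInput",
  select: "dropdown",
  file: "fileUploader",
};

interface BackendVariable {
  belong_to_node_id: string;
  type: InputVarTypeSketch;
  label: string;
  variable: string; // assumed field name for the form key
}

interface FormFieldDescriptor {
  name: string;
  label: string;
  kind: FormFieldKind;
}

// Builds the form schema for one node from the backend-provided variables.
function toFormFields(vars: BackendVariable[], nodeId: string): FormFieldDescriptor[] {
  return vars
    .filter((v) => v.belong_to_node_id === nodeId)
    .map((v) => ({
      name: v.variable,
      label: v.label,
      kind: VAR_TYPE_MAP_SKETCH[v.type],
    }));
}

const backendVars: BackendVariable[] = [
  { belong_to_node_id: "chunker-a", type: "number", label: "Chunk size", variable: "chunk_size" },
  { belong_to_node_id: "chunker-a", type: "select", label: "Separator", variable: "separator" },
  { belong_to_node_id: "cleaner-b", type: "text-input", label: "Strip pattern", variable: "strip_pattern" },
];

const chunkerFields = toFormFields(backendVars, "chunker-a");
```

Typing the map as a Record over the variable type means a newly added type fails compilation until it has a matching form field, which is one way to get the type safety described above.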
The draft/published distinction in parameter endpoints supports the Environment Promotion pattern: users configure parameters against a draft workflow and only promote to published after validation. The staleTime: 0 configuration treats cached parameter results as immediately stale, so parameter queries are refetched rather than served from cache and reflect the latest node configuration. This matters during iterative editing sessions, when node settings change frequently.
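The effect of a zero stale time can be illustrated with a simplified freshness check. This is a toy model that assumes staleTime carries its usual meaning in query-caching libraries such as TanStack Query (milliseconds during which cached data counts as fresh); it is not the library's implementation, and all names in it are hypothetical.

```typescript
// Toy model of a stale-time check; not the code of any real query library.
interface CachedEntry<T> {
  data: T;
  fetchedAt: number; // timestamp in milliseconds
}

// An entry is stale when it is missing or at least staleTimeMs old.
function isStale<T>(
  entry: CachedEntry<T> | undefined,
  staleTimeMs: number,
  now: number,
): boolean {
  if (entry === undefined) return true;
  return now - entry.fetchedAt >= staleTimeMs;
}

const cached: CachedEntry<string[]> = { data: ["chunk_size"], fetchedAt: 1_000 };

// With a stale time of 0, even a result fetched this instant is stale.
const staleImmediately = isStale(cached, 0, 1_000);

// With a 5-second stale time, the same entry is still fresh one second later.
const staleAfterOneSecond = isStale(cached, 5_000, 2_000);
```

Under this model a stale time of 0 trades extra requests for the guarantee that an edited node never shows parameters from an earlier configuration.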