Principle:Apache Druid Tuning Parameters

Knowledge Sources	Apache Druid, Druid Tuning Config
Domains	Data_Ingestion, Performance_Tuning
Last Updated	2026-02-10 00:00 GMT

Overview

A performance-optimization principle: set the resource limits and operational parameters used by the engine that executes Druid ingestion tasks.

Description

Tuning Parameters control how the Druid ingestion engine allocates and uses computational resources during data loading. These parameters directly impact ingestion speed, memory usage, and cluster stability.

Key tuning dimensions include:

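Memory limits: maxRowsInMemory and maxBytesInMemory set the thresholds at which in-memory data is persisted to intermediate segments; a persist is triggered when either limit is reached
Concurrency: maxNumConcurrentSubTasks caps how many sub-tasks of a parallel indexing task run at the same time
Output sizing: maxTotalRows caps the total rows across all segments waiting to be pushed; rows per output segment are governed by the partitioning settings (for example maxRowsPerSegment or targetRowsPerSegment)
Task behavior: forceGuaranteedRollup, buildV9Directly, chatHandlerTimeout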

The tuning configuration is treated separately from partitioning because it controls execution behavior rather than data layout, even though partitionsSpec is nested inside tuningConfig in the ingestion spec.
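As a rough illustration, several of the parameters above could be assembled into the tuningConfig of a native parallel batch (index_parallel) spec along the following lines. This is a minimal sketch, not a recommended configuration: the helper name build_tuning_config and all numeric values are hypothetical, and only a subset of the listed parameters is shown. Placing maxTotalRows inside a dynamic partitionsSpec is an assumption based on the native batch layout; check the documentation for the Druid version in use.

```python
import json


def build_tuning_config(max_rows_in_memory, max_bytes_in_memory,
                        max_num_concurrent_sub_tasks, max_total_rows):
    """Assemble an illustrative tuningConfig dict (hypothetical helper)."""
    return {
        "type": "index_parallel",
        # Memory limits: a persist happens when either threshold is reached.
        "maxRowsInMemory": max_rows_in_memory,
        "maxBytesInMemory": max_bytes_in_memory,
        # Concurrency: upper bound on sub-tasks running at the same time.
        "maxNumConcurrentSubTasks": max_num_concurrent_sub_tasks,
        # Task behavior (placeholder values).
        "forceGuaranteedRollup": False,
        "chatHandlerTimeout": "PT10S",
        # Partitioning is nested here but configured as a separate concern.
        "partitionsSpec": {
            "type": "dynamic",
            "maxTotalRows": max_total_rows,
        },
    }


# Placeholder values chosen only for the example.
tuning_config = build_tuning_config(
    max_rows_in_memory=500_000,
    max_bytes_in_memory=256 * 1024 * 1024,
    max_num_concurrent_sub_tasks=4,
    max_total_rows=20_000_000,
)
print(json.dumps(tuning_config, indent=2))
```

The printed JSON can be pasted under the tuningConfig key of an ingestion spec and then adjusted; starting from the defaults and changing one value at a time makes the effect of each parameter easier to observe.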

Usage

Apply this principle after configuring partitioning, to optimize ingestion performance for your specific data volume and cluster resources. Default values work for most cases; tune only when dealing with very large datasets, memory pressure, or specific performance requirements.

Theoretical Basis

Tuning follows a resource budgeting model:

Ingestion Resource Model:
  Memory: maxRowsInMemory × avgRowSize ≤ maxBytesInMemory ≤ JVM heap
  Parallelism: maxNumConcurrentSubTasks ≤ available worker capacity
  Output: segments ≈ totalRows / targetRowsPerSegment
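The three constraints above can be checked with a few lines of arithmetic before submitting a task. The sketch below is only a planning aid under stated assumptions: avg_row_size_bytes, jvm_heap_bytes and worker_capacity are values the operator must estimate or look up, and the function names are hypothetical rather than part of Druid.

```python
import math

MIB = 1024 * 1024


def check_budget(max_rows_in_memory, avg_row_size_bytes, max_bytes_in_memory,
                 jvm_heap_bytes, max_num_concurrent_sub_tasks, worker_capacity):
    """Return the list of resource-model constraints that are violated."""
    problems = []
    # Memory: maxRowsInMemory x avgRowSize <= maxBytesInMemory <= JVM heap
    if max_rows_in_memory * avg_row_size_bytes > max_bytes_in_memory:
        problems.append("row budget exceeds maxBytesInMemory")
    if max_bytes_in_memory > jvm_heap_bytes:
        problems.append("maxBytesInMemory exceeds JVM heap")
    # Parallelism: maxNumConcurrentSubTasks <= available worker capacity
    if max_num_concurrent_sub_tasks > worker_capacity:
        problems.append("maxNumConcurrentSubTasks exceeds worker capacity")
    return problems


def estimate_segment_count(total_rows, target_rows_per_segment):
    """Output: segments ~ totalRows / targetRowsPerSegment, rounded up."""
    return math.ceil(total_rows / target_rows_per_segment)


# Example inputs (all hypothetical): 200-byte rows, 2 GiB heap, 8 worker slots.
problems = check_budget(
    max_rows_in_memory=1_000_000,
    avg_row_size_bytes=200,
    max_bytes_in_memory=256 * MIB,
    jvm_heap_bytes=2048 * MIB,
    max_num_concurrent_sub_tasks=4,
    worker_capacity=8,
)
segments = estimate_segment_count(120_000_000, 5_000_000)
print(problems, segments)
```

With these inputs the row budget is 200,000,000 bytes, which fits under the 256 MiB byte limit, so no constraint is reported as violated, and the estimate comes to 24 segments.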

Trade-offs:
  Higher maxRowsInMemory → fewer intermediate flushes → faster ingestion but more memory
  Higher maxNumConcurrentSubTasks → more parallelism → faster but more cluster resources
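The first trade-off can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a persist happens each time maxRowsInMemory rows have been buffered and ignores rollup, maxBytesInMemory, and any additional memory used while persists are in flight, so the numbers are indicative only. The function name and the inputs are hypothetical.

```python
import math


def flush_tradeoff(rows_per_task, avg_row_size_bytes, max_rows_in_memory):
    """Estimate (intermediate flushes, bytes buffered before a flush)."""
    flushes = math.ceil(rows_per_task / max_rows_in_memory)
    buffered_bytes = min(rows_per_task, max_rows_in_memory) * avg_row_size_bytes
    return flushes, buffered_bytes


# Hypothetical task: 10 million rows of roughly 200 bytes each.
rows_per_task = 10_000_000
avg_row_size_bytes = 200
results = {}
for limit in (250_000, 1_000_000, 4_000_000):
    results[limit] = flush_tradeoff(rows_per_task, avg_row_size_bytes, limit)
    flushes, buffered = results[limit]
    print(f"maxRowsInMemory={limit}: {flushes} flushes, {buffered} bytes buffered")
```

Raising the limit from 250,000 to 4,000,000 rows cuts the estimated flush count from 40 to 3 while raising the buffered data from about 50 MB to about 800 MB, which is the memory-for-speed exchange described above.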

Related Pages

Implemented By

Implementation:Apache_Druid_Tuning_Config_Form
