
Principle:Apache Druid Tuning Parameters

From Leeroopedia


Knowledge Sources
Domains Data_Ingestion, Performance_Tuning
Last Updated 2026-02-10 00:00 GMT

Overview

A performance optimization principle that configures resource limits and operational parameters for the ingestion task execution engine.

Description

Tuning Parameters control how the Druid ingestion engine allocates and uses computational resources during data loading. These parameters directly impact ingestion speed, memory usage, and cluster stability.

Key tuning dimensions include:

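  • Memory limits: maxRowsInMemory, maxBytesInMemory. These control when in-memory rows are persisted to disk as intermediate segments; a persist is triggered when either limit is reached.
  • Concurrency: maxNumConcurrentSubTasks. The maximum number of parallel indexing sub-tasks that run at the same time.
  • Output sizing: maxTotalRows. The total number of rows across all segments waiting to be pushed; the per-segment row limit is set separately (for example by maxRowsPerSegment in the partitioning configuration).
  • Task behavior: forceGuaranteedRollup, buildV9Directly (a legacy flag from older Druid releases), chatHandlerTimeout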

The tuning configuration is separate from partitioning because it controls execution behavior rather than data layout.
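
As a rough illustration of how the parameters listed above fit together, the following Python sketch assembles a tuningConfig fragment and prints it as JSON. The task type `index_parallel` and every numeric value are assumptions chosen for the example, not recommendations from this page; `render_tuning_config` is a hypothetical helper, not part of Druid.

```python
import json

# Illustrative values only; none of these numbers come from the source text.
tuning_config = {
    "type": "index_parallel",               # assumed: native parallel batch ingestion
    "maxRowsInMemory": 500_000,             # persist after this many buffered rows
    "maxBytesInMemory": 512 * 1024 * 1024,  # or after this many buffered bytes
    "maxNumConcurrentSubTasks": 4,          # parallel sub-tasks running at once
    "forceGuaranteedRollup": False,         # best-effort rollup
    "chatHandlerTimeout": "PT10S",          # ISO 8601 period
}


def render_tuning_config(config: dict) -> str:
    """Hypothetical helper: wrap the fragment the way it appears in a spec."""
    return json.dumps({"tuningConfig": config}, indent=2)


print(render_tuning_config(tuning_config))
```

Keeping the fragment as a plain dictionary makes it easy to vary one parameter at a time when comparing ingestion runs.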

Usage

Apply this principle after the partitioning configuration is in place, to match ingestion performance to your data volume and cluster resources. The default values work for most cases; tune only when dealing with very large datasets, memory pressure, or specific performance requirements.

Theoretical Basis

Tuning follows a resource budgeting model:

Ingestion Resource Model:
  Memory: maxRowsInMemory × avgRowSize ≤ maxBytesInMemory ≤ JVM heap
  Parallelism: maxNumConcurrentSubTasks ≤ available worker capacity
  Output: segments ≈ totalRows / targetRowsPerSegment
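
The three relations above can be checked mechanically before a task is submitted. The sketch below is one possible encoding, assuming the average row size, JVM heap size, and worker capacity are known or estimated by the operator; the `IngestionBudget` class and both functions are hypothetical names introduced here, and the numbers are made up for the example.

```python
import math
from dataclasses import dataclass, replace
from typing import List


@dataclass
class IngestionBudget:
    max_rows_in_memory: int
    avg_row_size_bytes: int          # assumed to be estimated by the operator
    max_bytes_in_memory: int
    jvm_heap_bytes: int
    max_num_concurrent_sub_tasks: int
    worker_capacity: int             # assumed: free task slots in the cluster
    total_rows: int
    target_rows_per_segment: int


def check_budget(b: IngestionBudget) -> List[str]:
    """Return the relations of the resource model that the budget violates."""
    violations = []
    if b.max_rows_in_memory * b.avg_row_size_bytes > b.max_bytes_in_memory:
        violations.append("row buffer exceeds maxBytesInMemory")
    if b.max_bytes_in_memory > b.jvm_heap_bytes:
        violations.append("maxBytesInMemory exceeds JVM heap")
    if b.max_num_concurrent_sub_tasks > b.worker_capacity:
        violations.append("maxNumConcurrentSubTasks exceeds worker capacity")
    return violations


def estimated_segments(b: IngestionBudget) -> int:
    """segments ~ totalRows / targetRowsPerSegment, rounded up."""
    return math.ceil(b.total_rows / b.target_rows_per_segment)


budget = IngestionBudget(
    max_rows_in_memory=1_000_000, avg_row_size_bytes=200,
    max_bytes_in_memory=256 * 1024 * 1024, jvm_heap_bytes=2 * 1024 ** 3,
    max_num_concurrent_sub_tasks=4, worker_capacity=8,
    total_rows=50_000_000, target_rows_per_segment=5_000_000,
)
print(check_budget(budget), estimated_segments(budget))

# Shrinking the byte limit below the row buffer size breaks the first relation.
tight = replace(budget, max_bytes_in_memory=100_000_000)
print(check_budget(tight))
```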

Trade-offs:
  Higher maxRowsInMemory → fewer intermediate flushes → faster ingestion but more memory
  Higher maxNumConcurrentSubTasks → more parallelism → faster but more cluster resources
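
A back-of-the-envelope sketch can put numbers on these trade-offs. It is deliberately simplified: it assumes the row-count limit is the only persist trigger and that sub-tasks are scheduled in equal-sized waves, neither of which is guaranteed in a real cluster. All function names and input values are hypothetical.

```python
import math


def intermediate_flushes(total_rows: int, max_rows_in_memory: int) -> int:
    """Approximate number of persists if only the row limit triggers them."""
    return math.ceil(total_rows / max_rows_in_memory)


def buffer_bytes(max_rows_in_memory: int, avg_row_size_bytes: int) -> int:
    """Approximate memory held by the row buffer just before a persist."""
    return max_rows_in_memory * avg_row_size_bytes


def scheduling_waves(num_sub_tasks: int, max_concurrent: int) -> int:
    """Approximate number of rounds needed to run all sub-tasks."""
    return math.ceil(num_sub_tasks / max_concurrent)


TOTAL_ROWS = 10_000_000   # illustrative dataset size
AVG_ROW_BYTES = 200       # illustrative average row size

for limit in (250_000, 500_000, 1_000_000):
    flushes = intermediate_flushes(TOTAL_ROWS, limit)
    memory_mb = buffer_bytes(limit, AVG_ROW_BYTES) / 1_000_000
    print(f"maxRowsInMemory={limit}: {flushes} flushes, ~{memory_mb:.0f} MB buffer")

for concurrency in (2, 4, 8):
    print(f"maxNumConcurrentSubTasks={concurrency}: "
          f"{scheduling_waves(16, concurrency)} waves for 16 sub-tasks")
```

Under these assumptions, doubling maxRowsInMemory halves the flush count and doubles the buffer size, which is the memory-for-speed exchange described above.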

Related Pages

Implemented By
