Heuristic:Apache Druid Auto Granularity Selection

Knowledge Sources	Apache Druid
Domains	Time Series, Query Optimization, Web Console, Data Visualization
Last Updated	2026-02-10 10:00 GMT

Overview

The Druid web console automatically selects the smallest time granularity that keeps the number of time buckets below a target (maxEntries) for the time span in a query's WHERE clause, falling back to P1D when no constraint on the time column can be determined.

Description

When the explore view or other visualization components need to display time-bucketed data, the getAutoGranularity() function in auto-granularity.ts analyzes the query's WHERE clause to extract the time span being queried, then selects an appropriate granularity.

The algorithm works in two steps:

Time span extraction: The WHERE clause is analyzed using getTimeSpanInExpression(), which looks for timeInterval patterns (absolute start/end) or timeRelative patterns (duration-based like "last 7 days") on the time column. The first matching pattern determines the span, which is expressed in milliseconds.
Granularity selection: The pickSmallestGranularityThatFits() function iterates through a sorted list of 18 granularity options (from PT1S to P1Y) and returns the first one where span / granularity < maxEntries. The comparison is strict, so a granularity that yields exactly maxEntries buckets is skipped in favor of the next coarser one. If no granularity fits, the largest available granularity (P1Y) is returned as a last resort.

The full granularity progression is:

PT1S, PT2S, PT5S, PT15S, PT30S, PT1M, PT2M, PT5M, PT15M, PT30M, PT1H, PT3H, PT6H, P1D, P1W, P1M, P3M, P1Y

If the WHERE clause contains no parseable constraint on the time column (or the extracted span is zero), the function returns P1D as a safe default.
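The selection step and the fallback can be sketched without the console's Duration class. This is a minimal illustration, not the console's code: autoGranularityForSpan is a hypothetical stand-in that takes an already extracted span, and the millisecond lengths for P1M, P3M, and P1Y assume 30-day months and 365-day years.

```typescript
const SECOND = 1000;
const MINUTE = 60 * SECOND;
const HOUR = 60 * MINUTE;
const DAY = 24 * HOUR;

// Granularity names paired with a length in milliseconds, finest first.
// Assumption of this sketch: months are 30 days and years are 365 days.
const GRANULARITIES: [string, number][] = [
  ['PT1S', SECOND], ['PT2S', 2 * SECOND], ['PT5S', 5 * SECOND],
  ['PT15S', 15 * SECOND], ['PT30S', 30 * SECOND],
  ['PT1M', MINUTE], ['PT2M', 2 * MINUTE], ['PT5M', 5 * MINUTE],
  ['PT15M', 15 * MINUTE], ['PT30M', 30 * MINUTE],
  ['PT1H', HOUR], ['PT3H', 3 * HOUR], ['PT6H', 6 * HOUR],
  ['P1D', DAY], ['P1W', 7 * DAY],
  ['P1M', 30 * DAY], ['P3M', 90 * DAY], ['P1Y', 365 * DAY],
];

// Hypothetical stand-in for getAutoGranularity() that skips WHERE clause parsing.
function autoGranularityForSpan(spanMs: number | undefined, maxEntries: number): string {
  // A missing (or zero) span falls back to daily buckets.
  if (!spanMs) return 'P1D';
  for (const [granularity, lengthMs] of GRANULARITIES) {
    // Strict comparison: exactly maxEntries buckets does not count as fitting.
    if (spanMs / lengthMs < maxEntries) return granularity;
  }
  // Nothing fits, so use the coarsest option.
  return GRANULARITIES[GRANULARITIES.length - 1][0];
}

console.log(autoGranularityForSpan(HOUR, 100)); // PT1M
console.log(autoGranularityForSpan(7 * DAY, 7)); // P1W (P1D would give exactly 7 buckets)
console.log(autoGranularityForSpan(undefined, 100)); // P1D
```

The second call shows the effect of the strict comparison: seven daily buckets do not satisfy 7 < 7, so the next coarser granularity is chosen.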

Usage

Apply this heuristic when:

Understanding why the explore view chose a particular time granularity for a chart
Adjusting the maxEntries parameter to control chart density
Adding new granularity levels to the progression
Debugging cases where the auto-selected granularity produces too many or too few data points

The Insight (Rule of Thumb)

Action: Given a time span and a maximum bucket count, iterate from finest to coarsest granularity and pick the first granularity that keeps the number of buckets under the maximum. Default to P1D if no time span is available.
Value: Produces a readable chart without user intervention. With maxEntries = 100, a 30-day query gets daily buckets (30 points) rather than 6-hour (120 points), hourly (720 points), or per-second (2,592,000 points) buckets. As long as the data is spread across the queried span, the user sees a well-populated but not overcrowded visualization.
Trade-off: The algorithm does not consider the actual data density; it only looks at the time span. A 1-year span with data only in the last week will still use P1W (or P1M for a smaller maxEntries), leaving most buckets empty. Also, when no time filter is present, the P1D fallback may be too coarse or too fine for the actual data range.

Reasoning

The granularity progression uses human-friendly intervals (steps of 1, 2, 5, 15, and 30 for seconds and minutes; 1, 3, and 6 for hours) rather than arbitrary divisions. This ensures that time bucket boundaries align with natural clock positions (e.g., 5-minute marks, quarter-hours, midnight boundaries), making the resulting charts easier to read and interpret.
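The alignment argument can be illustrated with plain epoch arithmetic. This sketch only covers fixed-length, sub-day buckets counted from the Unix epoch in UTC; floorToBucket is a hypothetical helper, not how Druid itself buckets rows, and the 7-minute interval is an invented counterexample that is not part of the progression.

```typescript
// Floor a UTC epoch timestamp (ms) to the start of its fixed-length bucket.
function floorToBucket(timestampMs: number, bucketMs: number): number {
  return Math.floor(timestampMs / bucketMs) * bucketMs;
}

const FIFTEEN_MINUTES = 15 * 60 * 1000; // divides an hour evenly
const SEVEN_MINUTES = 7 * 60 * 1000; // does not divide an hour evenly

const sample = Date.parse('2024-03-05T10:37:42Z');

// Quarter-hour buckets start at :00, :15, :30, :45 of every hour.
console.log(new Date(floorToBucket(sample, FIFTEEN_MINUTES)).toISOString());
// 2024-03-05T10:30:00.000Z

// Seven-minute buckets drift relative to the clock (10:33, 10:40, 10:47, ...).
console.log(new Date(floorToBucket(sample, SEVEN_MINUTES)).toISOString());
// 2024-03-05T10:33:00.000Z
```

Every sub-day interval in the progression divides a day evenly, which is why its bucket starts repeat at the same clock positions each day.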

The algorithm is intentionally simple and deterministic: given the same time span and max entries, it always returns the same granularity. This predictability is important because the auto-granularity result feeds into SQL query generation, and inconsistent results would confuse users who re-run the same query.

Example calculations:

Time Span	maxEntries	Selected Granularity	Buckets
1 hour (3,600,000ms)	100	PT1M (60,000ms)	60
24 hours (86,400,000ms)	100	PT15M (900,000ms)	96
30 days (2,592,000,000ms)	100	P1D (86,400,000ms)	30
1 year (31,536,000,000ms)	100	P1W (604,800,000ms)	~52
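The 24-hour row can be traced candidate by candidate to see why PT5M is rejected and PT15M is accepted. This is a sketch for debugging purposes: traceSelection is a hypothetical helper, and the candidate list is only the subset of the progression relevant to this span.

```typescript
const MINUTE_MS = 60 * 1000;
const HOUR_MS = 60 * MINUTE_MS;

// Subset of the progression around the expected answer, finest first.
const CANDIDATES: [string, number][] = [
  ['PT1M', MINUTE_MS], ['PT2M', 2 * MINUTE_MS], ['PT5M', 5 * MINUTE_MS],
  ['PT15M', 15 * MINUTE_MS], ['PT30M', 30 * MINUTE_MS], ['PT1H', HOUR_MS],
];

interface TraceRow {
  granularity: string;
  buckets: number; // span / granularity length
  fits: boolean; // buckets < maxEntries (strict)
}

// Hypothetical helper: evaluates every candidate instead of stopping at the first fit.
function traceSelection(spanMs: number, maxEntries: number): TraceRow[] {
  return CANDIDATES.map(([granularity, lengthMs]) => {
    const buckets = spanMs / lengthMs;
    return { granularity, buckets, fits: buckets < maxEntries };
  });
}

for (const row of traceSelection(24 * HOUR_MS, 100)) {
  console.log(`${row.granularity}: ${row.buckets} buckets, fits=${row.fits}`);
}
// PT1M: 1440, PT2M: 720, PT5M: 288 (all rejected); PT15M: 96 is the first that fits.
```

The first row with fits set to true corresponds to the granularity the selection loop would return.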

Code Evidence

Granularity options list (auto-granularity.ts:39-58):

export const FINE_GRANULARITY_OPTIONS = [
  'PT1S',
  'PT2S',
  'PT5S',
  'PT15S',
  'PT30S',
  'PT1M',
  'PT2M',
  'PT5M',
  'PT15M',
  'PT30M',
  'PT1H',
  'PT3H',
  'PT6H',
  'P1D',
  'P1W',
  'P1M',
  'P3M',
  'P1Y',
];

Main entry point with P1D fallback (auto-granularity.ts:62-70):

export function getAutoGranularity(
  where: SqlExpression,
  timeColumnName: string,
  maxEntries: number,
): string {
  const timeSpan = getTimeSpanInExpression(where, timeColumnName);
  if (!timeSpan) return 'P1D';
  return pickSmallestGranularityThatFits(AUTO_GRANULARITY_OPTIONS, timeSpan, maxEntries).toString();
}

Note: the definition of AUTO_GRANULARITY_OPTIONS is not included in these excerpts. Because pickSmallestGranularityThatFits() takes a Duration[], it is presumably the FINE_GRANULARITY_OPTIONS strings parsed into Duration objects.

Core selection algorithm (auto-granularity.ts:78-87):

/**
 * Picks the first granularity that will produce no more than maxEntities to fill the given span
 * @param granularities - granularities to try in sorted from small to large
 * @param span - the span to fit in ms
 * @param maxEntities - the number of entities not to exceed
 */
export function pickSmallestGranularityThatFits(
  granularities: Duration[],
  span: number,
  maxEntities: number,
): Duration {
  for (const granularity of granularities) {
    if (span / granularity.getCanonicalLength() < maxEntities) return granularity;
  }
  return granularities[granularities.length - 1];
}

Time span extraction from WHERE clause (auto-granularity.ts:23-37):

export function getTimeSpanInExpression(
  expression: SqlExpression,
  timeColumnName: string,
): number | undefined {
  const patterns = fitFilterPatterns(expression);
  for (const pattern of patterns) {
    if (pattern.type === 'timeInterval' && pattern.column === timeColumnName) {
      return pattern.end.valueOf() - pattern.start.valueOf();
    } else if (pattern.type === 'timeRelative' && pattern.column === timeColumnName) {
      return new Duration(pattern.rangeDuration).getCanonicalLength();
    }
  }

  return;
}
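
The extraction logic can be exercised in isolation with simplified inputs. The sketch below makes several assumptions: the FilterPattern type is a cut-down, hypothetical stand-in for the console's pattern objects, SQL parsing (fitFilterPatterns) is skipped entirely, and durationToMs is a small hand-written parser that treats months as 30 days and years as 365 days in place of the Duration class.

```typescript
// Hypothetical, simplified pattern shapes for illustration only.
type FilterPattern =
  | { type: 'timeInterval'; column: string; start: Date; end: Date }
  | { type: 'timeRelative'; column: string; rangeDuration: string }
  | { type: 'values'; column: string; values: string[] };

const DAY_MS = 24 * 60 * 60 * 1000;

// Converts a simple ISO 8601 duration (e.g. P7D, PT6H, P1M) to milliseconds.
// Assumption: 30-day months and 365-day years.
function durationToMs(iso: string): number {
  const m = /^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)W)?(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$/.exec(iso);
  if (!m) throw new Error(`unsupported duration: ${iso}`);
  const part = (i: number): number => Number(m[i] || 0); // absent groups count as 0
  return (
    part(1) * 365 * DAY_MS + part(2) * 30 * DAY_MS + part(3) * 7 * DAY_MS +
    part(4) * DAY_MS + part(5) * 3600000 + part(6) * 60000 + part(7) * 1000
  );
}

// Returns the span of the first time pattern on the given column, if any.
function timeSpanOf(patterns: FilterPattern[], timeColumn: string): number | undefined {
  for (const pattern of patterns) {
    if (pattern.column !== timeColumn) continue;
    if (pattern.type === 'timeInterval') {
      return pattern.end.valueOf() - pattern.start.valueOf();
    }
    if (pattern.type === 'timeRelative') {
      return durationToMs(pattern.rangeDuration);
    }
  }
  return undefined; // caller falls back to P1D
}

console.log(timeSpanOf([{ type: 'timeRelative', column: '__time', rangeDuration: 'P7D' }], '__time'));
// 604800000
console.log(timeSpanOf([{ type: 'values', column: 'country', values: ['US'] }], '__time'));
// undefined
```

As in the original function, only the first matching pattern is used, so a WHERE clause with several time constraints on the same column is sized by whichever one appears first.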
