Heuristic:Apache Druid Auto Granularity Selection
| Knowledge Sources | |
|---|---|
| Domains | Time Series, Query Optimization, Web Console, Data Visualization |
| Last Updated | 2026-02-10 10:00 GMT |
Overview
The Druid web console automatically selects the smallest time granularity that produces fewer than a target number of time buckets (maxEntries) for the time span in a query's WHERE clause. It falls back to P1D when no time constraint can be determined.
Description
When the explore view or other visualization components need to display time-bucketed data, the getAutoGranularity() function in auto-granularity.ts analyzes the query's WHERE clause to extract the time span being queried, then selects an appropriate granularity.
The algorithm works in two steps:
- Time span extraction: The WHERE clause is analyzed using getTimeSpanInExpression(), which looks for timeInterval patterns (absolute start/end) or timeRelative patterns (duration-based, like "last 7 days"). The extracted span is expressed in milliseconds.
- Granularity selection: The pickSmallestGranularityThatFits() function iterates through a sorted list of 18 granularity options (from PT1S to P1Y) and returns the first one where span / granularity < maxEntries. If no granularity fits, the largest available granularity (P1Y) is returned as a last resort.
The full granularity progression is:
PT1S, PT2S, PT5S, PT15S, PT30S, PT1M, PT2M, PT5M, PT15M, PT30M, PT1H, PT3H, PT6H, P1D, P1W, P1M, P3M, P1Y
If the WHERE clause contains no parseable time constraint, the function returns P1D as a safe default.
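The two steps above can be sketched without any console dependencies. This is a minimal illustration, not the console's implementation: sketchAutoGranularity and GRANULARITY_MS are hypothetical names, and the month, quarter, and year lengths (30, 90, and 365 days) are assumptions made only so the example is self-contained.

```typescript
// Fixed lengths in milliseconds. The P1M / P3M / P1Y values (30, 90, 365 days)
// are assumptions for this sketch only.
const SECOND = 1000;
const MINUTE = 60 * SECOND;
const HOUR = 60 * MINUTE;
const DAY = 24 * HOUR;

const GRANULARITY_MS: [string, number][] = [
  ['PT1S', SECOND], ['PT2S', 2 * SECOND], ['PT5S', 5 * SECOND],
  ['PT15S', 15 * SECOND], ['PT30S', 30 * SECOND],
  ['PT1M', MINUTE], ['PT2M', 2 * MINUTE], ['PT5M', 5 * MINUTE],
  ['PT15M', 15 * MINUTE], ['PT30M', 30 * MINUTE],
  ['PT1H', HOUR], ['PT3H', 3 * HOUR], ['PT6H', 6 * HOUR],
  ['P1D', DAY], ['P1W', 7 * DAY], ['P1M', 30 * DAY], ['P3M', 90 * DAY], ['P1Y', 365 * DAY],
];

// Hypothetical helper: fall back to P1D when there is no span, otherwise return
// the first (finest) granularity that yields fewer than maxEntries buckets.
function sketchAutoGranularity(spanMs: number | undefined, maxEntries: number): string {
  if (!spanMs) return 'P1D';
  for (const [name, lengthMs] of GRANULARITY_MS) {
    if (spanMs / lengthMs < maxEntries) return name;
  }
  return 'P1Y'; // nothing fits: use the coarsest option as a last resort
}

console.log(sketchAutoGranularity(24 * HOUR, 100)); // PT15M (96 buckets)
console.log(sketchAutoGranularity(undefined, 100)); // P1D (fallback)
```

Because the comparison is a strict less-than, a span that yields exactly maxEntries buckets at some granularity moves on to the next coarser one.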
Usage
Apply this heuristic when:
- Understanding why the explore view chose a particular time granularity for a chart
- Adjusting the maxEntries parameter to control chart density
- Adding new granularity levels to the progression
- Debugging cases where the auto-selected granularity produces too many or too few data points
The Insight (Rule of Thumb)
- Action: Given a time span and a maximum bucket count, iterate from finest to coarsest granularity and pick the first granularity that keeps the number of buckets under the maximum. Default to P1D if no time span is available.
- Value: Produces a readable chart without user intervention. With maxEntries set to 100, a 30-day query gets daily buckets (30 points), not hourly (720 points) or per-second (2,592,000 points), so the visualization is populated but not overcrowded.
- Trade-off: The algorithm considers only the time span, not the actual data density. A 1-year span with data only in the last week will still use P1W or P1M granularity, depending on maxEntries. Also, the P1D fallback may be too coarse or too fine for the actual data range when no time filter is present.
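The density trade-off can be made concrete with a small sketch. The data set here is invented for illustration (hourly events confined to the last 7 days of a 365-day window), countNonEmptyBuckets is a hypothetical helper, and buckets are measured from the start of the window rather than aligned to calendar weeks as a real P1W granularity would be.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const WEEK_MS = 7 * DAY_MS;

// Counts how many fixed-width buckets (measured from spanStart) hold at least one event.
function countNonEmptyBuckets(eventTimes: number[], spanStart: number, bucketMs: number): number {
  const seen: { [bucket: number]: boolean } = {};
  for (const t of eventTimes) {
    seen[Math.floor((t - spanStart) / bucketMs)] = true;
  }
  return Object.keys(seen).length;
}

// Hypothetical data: a 365-day query window whose events all fall in the final 7 days.
const spanStart = Date.UTC(2025, 0, 1);
const spanEnd = spanStart + 365 * DAY_MS;
const events: number[] = [];
for (let t = spanEnd - 7 * DAY_MS; t < spanEnd; t += 60 * 60 * 1000) {
  events.push(t);
}

const totalBuckets = Math.ceil((spanEnd - spanStart) / WEEK_MS);
const populated = countNonEmptyBuckets(events, spanStart, WEEK_MS);
console.log(`${populated} of ${totalBuckets} weekly buckets contain data`);
// 2 of 53 weekly buckets contain data
```

In a case like this, narrowing the WHERE clause to the populated range lets the heuristic choose a finer granularity.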
Reasoning
The granularity progression uses human-friendly intervals (1, 2, 5, 15, and 30 for seconds and minutes; 1, 3, and 6 for hours) rather than arbitrary divisions. Each step divides the next larger clock unit evenly, so time bucket boundaries align with natural clock positions (e.g., 5-minute marks, quarter-hours, midnight boundaries), making the resulting charts easier to read and interpret.
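The alignment property can be checked with a short sketch. It assumes fixed-length buckets counted from the Unix epoch in UTC, which ignores time zones and calendar-aware periods; floorToBucket and tilesUnit are hypothetical helper names.

```typescript
const SECOND = 1000;
const MINUTE = 60 * SECOND;
const HOUR = 60 * MINUTE;
const DAY = 24 * HOUR;

// Floors a UTC epoch-millisecond timestamp to the start of its fixed-length bucket.
function floorToBucket(timestampMs: number, bucketMs: number): number {
  return Math.floor(timestampMs / bucketMs) * bucketMs;
}

// True when buckets of bucketMs tile the enclosing clock unit exactly, so the
// boundaries fall on the same clock positions in every minute, hour, or day.
function tilesUnit(bucketMs: number, unitMs: number): boolean {
  return unitMs % bucketMs === 0;
}

// Sub-day steps of the progression, each paired with its enclosing clock unit.
const steps: [number, number][] = [
  [1 * SECOND, MINUTE], [2 * SECOND, MINUTE], [5 * SECOND, MINUTE],
  [15 * SECOND, MINUTE], [30 * SECOND, MINUTE],
  [1 * MINUTE, HOUR], [2 * MINUTE, HOUR], [5 * MINUTE, HOUR],
  [15 * MINUTE, HOUR], [30 * MINUTE, HOUR],
  [1 * HOUR, DAY], [3 * HOUR, DAY], [6 * HOUR, DAY],
];
const allAligned = steps.every(([bucket, unit]) => tilesUnit(bucket, unit));

const sample = Date.UTC(2026, 1, 10, 10, 7, 42); // 2026-02-10T10:07:42Z
console.log(allAligned); // true
console.log(tilesUnit(7 * MINUTE, HOUR)); // false: 7-minute buckets drift from hour to hour
console.log(new Date(floorToBucket(sample, 15 * MINUTE)).toISOString()); // 2026-02-10T10:00:00.000Z
```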
The algorithm is intentionally simple and deterministic: given the same time span and max entries, it always returns the same granularity. This predictability is important because the auto-granularity result feeds into SQL query generation, and inconsistent results would confuse users who re-run the same query.
Example calculations:
| Time Span | maxEntries | Selected Granularity | Buckets |
|---|---|---|---|
| 1 hour (3,600,000ms) | 100 | PT1M (60,000ms) | 60 |
| 24 hours (86,400,000ms) | 100 | PT15M (900,000ms) | 96 |
| 30 days (2,592,000,000ms) | 100 | P1D (86,400,000ms) | 30 |
| 1 year (31,536,000,000ms) | 100 | P1W (604,800,000ms) | ~52 |
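Each row can be verified by comparing the selected granularity with the next finer step in the progression: the finer step must produce at least maxEntries buckets, and the selected step fewer. The sketch below hard-codes the four rows with the millisecond lengths shown in the table; WorkedRow is a name introduced only for this example.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;
const MAX_ENTRIES = 100;

interface WorkedRow {
  span: string;
  spanMs: number;
  finer: string; // the step just before the selected one in the progression
  finerMs: number;
  selected: string;
  selectedMs: number;
}

const rows: WorkedRow[] = [
  { span: '1 hour', spanMs: HOUR_MS, finer: 'PT30S', finerMs: 30000, selected: 'PT1M', selectedMs: 60000 },
  { span: '24 hours', spanMs: DAY_MS, finer: 'PT5M', finerMs: 300000, selected: 'PT15M', selectedMs: 900000 },
  { span: '30 days', spanMs: 30 * DAY_MS, finer: 'PT6H', finerMs: 6 * HOUR_MS, selected: 'P1D', selectedMs: DAY_MS },
  { span: '1 year', spanMs: 365 * DAY_MS, finer: 'P1D', finerMs: DAY_MS, selected: 'P1W', selectedMs: 7 * DAY_MS },
];

for (const row of rows) {
  const finerBuckets = row.spanMs / row.finerMs; // 120, 288, 120, 365: all rejected
  const selectedBuckets = row.spanMs / row.selectedMs; // 60, 96, 30, ~52.1: all accepted
  console.log(
    `${row.span}: ${row.finer} gives ${finerBuckets} buckets (rejected), ` +
      `${row.selected} gives ${selectedBuckets.toFixed(1)} buckets (accepted)`,
  );
}
```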
Code Evidence
Granularity options list (auto-granularity.ts:39-58):
export const FINE_GRANULARITY_OPTIONS = [
  'PT1S',
  'PT2S',
  'PT5S',
  'PT15S',
  'PT30S',
  'PT1M',
  'PT2M',
  'PT5M',
  'PT15M',
  'PT30M',
  'PT1H',
  'PT3H',
  'PT6H',
  'P1D',
  'P1W',
  'P1M',
  'P3M',
  'P1Y',
];
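Main entry point with P1D fallback (auto-granularity.ts:62-70). AUTO_GRANULARITY_OPTIONS is not defined in the excerpts shown here; it is presumably the FINE_GRANULARITY_OPTIONS strings converted to Duration objects: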
export function getAutoGranularity(
  where: SqlExpression,
  timeColumnName: string,
  maxEntries: number,
): string {
  const timeSpan = getTimeSpanInExpression(where, timeColumnName);
  if (!timeSpan) return 'P1D';
  return pickSmallestGranularityThatFits(AUTO_GRANULARITY_OPTIONS, timeSpan, maxEntries).toString();
}
Core selection algorithm (auto-granularity.ts:78-87):
/**
 * Picks the first granularity that will produce no more than maxEntities to fill the given span
 * @param granularities - granularities to try in sorted from small to large
 * @param span - the span to fit in ms
 * @param maxEntities - the number of entities not to exceed
 */
export function pickSmallestGranularityThatFits(
  granularities: Duration[],
  span: number,
  maxEntities: number,
): Duration {
  for (const granularity of granularities) {
    if (span / granularity.getCanonicalLength() < maxEntities) return granularity;
  }
  return granularities[granularities.length - 1];
}
Time span extraction from WHERE clause (auto-granularity.ts:23-37):
export function getTimeSpanInExpression(
  expression: SqlExpression,
  timeColumnName: string,
): number | undefined {
  const patterns = fitFilterPatterns(expression);
  for (const pattern of patterns) {
    if (pattern.type === 'timeInterval' && pattern.column === timeColumnName) {
      return pattern.end.valueOf() - pattern.start.valueOf();
    } else if (pattern.type === 'timeRelative' && pattern.column === timeColumnName) {
      return new Duration(pattern.rangeDuration).getCanonicalLength();
    }
  }
  return;
}
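The extraction logic can be exercised in isolation with simplified stand-ins for the filter patterns. The interfaces below are hypothetical and cover only the fields the function reads. In particular, rangeMs replaces the ISO 8601 rangeDuration string that the excerpt above converts through Duration, so this sketch does not depend on the console's libraries.

```typescript
// Hypothetical, simplified stand-ins for the filter patterns; not the real types.
interface TimeIntervalPattern {
  type: 'timeInterval';
  column: string;
  start: Date;
  end: Date;
}
interface TimeRelativePattern {
  type: 'timeRelative';
  column: string;
  rangeMs: number; // assumption: pre-computed length instead of an ISO 8601 string
}
interface ValuesPattern {
  type: 'values';
  column: string;
  values: string[];
}
type FilterPattern = TimeIntervalPattern | TimeRelativePattern | ValuesPattern;

// Returns the span in ms of the first time pattern on timeColumnName, if any.
function sketchTimeSpan(patterns: FilterPattern[], timeColumnName: string): number | undefined {
  for (const pattern of patterns) {
    if (pattern.column !== timeColumnName) continue;
    if (pattern.type === 'timeInterval') return pattern.end.valueOf() - pattern.start.valueOf();
    if (pattern.type === 'timeRelative') return pattern.rangeMs;
  }
  return undefined;
}

const patterns: FilterPattern[] = [
  { type: 'values', column: 'country', values: ['US'] },
  {
    type: 'timeInterval',
    column: '__time',
    start: new Date('2026-02-01T00:00:00Z'),
    end: new Date('2026-02-08T00:00:00Z'),
  },
  { type: 'timeRelative', column: '__time', rangeMs: 24 * 60 * 60 * 1000 },
];

console.log(sketchTimeSpan(patterns, '__time')); // 604800000 (7 days; the first time pattern wins)
console.log(sketchTimeSpan([], '__time')); // undefined, which leads to the P1D fallback
```

As in the excerpt, the first matching pattern determines the span, so a WHERE clause with several time filters on the same column is sized by whichever one appears first.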