Heuristic:Apache Druid Query Error Suggestion Patterns
| Knowledge Sources | |
|---|---|
| Domains | SQL Query, Error Handling, Web Console, Developer Experience |
| Last Updated | 2026-02-10 10:00 GMT |
Overview
The Druid SQL editor implements a pattern-matching system that detects common SQL syntax mistakes from error messages returned by the server and offers one-click automated corrections.
Description
When a SQL query fails, the DruidError.getSuggestion() method in druid-query.ts parses the error message using a series of regular expressions to identify well-known mistake patterns. Each pattern returns a QuerySuggestion object with a human-readable label and a transform function that can fix the query string. The six detected patterns are:
- Double equals (
==) instead of single equals (=): Users coming from programming languages often write==in WHERE clauses. The error message patternReceived an unexpected token [= =]with line/column information is detected and the fix replaces==with=. - Fancy/Unicode quotes from copy-paste: When users copy SQL from documents or chat, Unicode quotes (
\u2018-\u201f) replace ASCII quotes. The lexical error patternEncountered: "\\u201\w"is detected and all fancy quotes are replaced with their ASCII equivalents. - Double-quoted string literals instead of single-quoted: In Druid SQL, double quotes delimit identifiers (column names) while single quotes delimit string literals. When a user writes
WHERE channel = "#en.wikipedia", the server reportsColumn '#en.wikipedia' not found. The fix swaps the quotes. - Single-quoted AS aliases: SQL aliases should use double quotes (
AS "alias"), not single quotes (AS 'alias'). The unexpected token pattern forAS \'...'is detected. - Trailing commas before keywords: A comma before
FROM,GROUP,ORDER, orLIMIT(e.g.,SELECT channel, FROM) is detected and removed. - Missing semicolons after SET statements: When multi-statement queries omit the semicolon between SET and SELECT, the parser detects this via the acceptable options list containing
";".
Additionally, MSQ (Multi-Stage Query) error responses nest their error objects inside an error key, requiring special unwrapping logic (annotated with the comment "Yo dawg, I heard you like errors...").
Usage
Apply these heuristics when:
- Extending the query editor with new auto-correction patterns
- Debugging why a user's query fails and whether a suggestion should have appeared
- Building client-side tooling that consumes Druid error responses
- Understanding how error position information (line/column) is extracted and used
The Insight (Rule of Thumb)
- Action: Parse SQL error messages with targeted regular expressions to detect common user mistakes, and offer specific one-click corrections positioned at the exact error location.
- Value: Dramatically reduces the frustration of syntax errors for users unfamiliar with Druid SQL conventions (identifier quoting, no
==, semicolons between statements). Converts a cryptic Calcite parser error into an actionable fix. - Trade-off: The regex patterns are tightly coupled to the exact error message format from Apache Calcite. If the upstream parser changes its error wording, these suggestions will silently stop matching. Each pattern must be maintained independently.
Reasoning
Druid SQL is parsed by Apache Calcite, whose error messages follow predictable patterns with line numbers and column positions. By exploiting this structure, the web console can extract the precise location of the error and apply a surgical string replacement. This is far more user-friendly than requiring the user to decode a Calcite parser error and manually locate the issue.
The six patterns chosen represent the most frequently observed mistakes based on real-world usage. The double-equals pattern catches programmers; the Unicode quotes pattern catches copy-paste from rich text; the quoting patterns catch users coming from MySQL/PostgreSQL where quoting conventions differ.
The MSQ error unwrapping is necessary because MSQ tasks return their errors wrapped in an additional layer: the HTTP response body has an error field that itself contains the actual ErrorResponse object, creating a nested structure.
Code Evidence
Pattern 1 -- Double equals detection (druid-query.ts:143-160):
static getSuggestion(errorMessage: string): QuerySuggestion | undefined {
// == is used instead of =
// ex: SELECT * FROM wikipedia WHERE channel == '#en.wikipedia'
// er: Received an unexpected token [= =] (line [1], column [39]), acceptable options:
const matchEquals =
/Received an unexpected token \[= =] \(line \[(\d+)], column \[(\d+)]\),/.exec(errorMessage);
if (matchEquals) {
const line = Number(matchEquals[1]);
const column = Number(matchEquals[2]);
return {
label: `Replace == with =`,
fn: str => {
const index = DruidError.positionToIndex(str, line, column);
if (!str.slice(index).startsWith('==')) return;
return `${str.slice(0, index)}=${str.slice(index + 2)}`;
},
};
}
Pattern 2 -- Fancy quote replacement (druid-query.ts:162-178):
// Mangled quotes from copy/paste
// ex: SELECT * FROM wikipedia WHERE channel = '#en.wikipedia'
// er: Lexical error at line 1, column 41. Encountered: "\u2018"
const matchLexical =
/Lexical error at line (\d+), column (\d+).\s+Encountered: "\\u201\w"/.exec(errorMessage);
if (matchLexical) {
return {
label: 'Replace fancy quotes with ASCII quotes',
fn: str => {
const newQuery = str
.replace(/[\u2018-\u201b]/gim, `'`)
.replace(/[\u201c-\u201f]/gim, `"`);
if (newQuery === str) return;
return newQuery;
},
};
}
Pattern 3 -- Double-quoted strings to single-quoted (druid-query.ts:180-201):
// Incorrect quoting on table column
// ex: SELECT * FROM wikipedia WHERE channel = "#en.wikipedia"
// er: Column '#en.wikipedia' not found in any table (line [1], column [41])
const matchQuotes =
/Column '([^']+)' not found in any table \(line \[(\d+)], column \[(\d+)]\)/.exec(
errorMessage,
);
if (matchQuotes) {
const literalString = matchQuotes[1];
const line = Number(matchQuotes[2]);
const column = Number(matchQuotes[3]);
return {
label: `Replace "${literalString}" with '${literalString}'`,
fn: str => {
const index = DruidError.positionToIndex(str, line, column);
if (!str.slice(index).startsWith(`"${literalString}"`)) return;
return `${str.slice(0, index)}'${literalString}'${str.slice(
index + literalString.length + 2,
)}`;
},
};
}
Pattern 5 -- Trailing commas (druid-query.ts:221-237):
// Comma (,) before FROM, GROUP, ORDER, or LIMIT
// ex: SELECT channel, FROM wikipedia
// er: Received an unexpected token [, FROM] (line [1], column [15]), acceptable options:
const matchComma = /Received an unexpected token \[, (FROM|GROUP|ORDER|LIMIT)]/i.exec(
errorMessage,
);
if (matchComma) {
const keyword = matchComma[1];
return {
label: `Remove comma (,) before ${keyword}`,
fn: str => {
const newQuery = str.replace(new RegExp(`,(\\s+${keyword})`, 'gim'), '$1');
if (newQuery === str) return;
return newQuery;
},
};
}
MSQ nested error unwrapping (druid-query.ts:76-86):
function errorResponseFromWhatever(e: any): ErrorResponse | string {
if (e.response) {
// This is a direct axios response error
let data = e.response.data || {};
// MSQ errors nest their error objects inside the error key. Yo dawg, I heard you like errors...
if (typeof data.error?.error === 'string') data = data.error;
return data;
} else {
return e; // Assume the error was passed in directly
}
}