Workflow:Apache Druid Visual Data Exploration

From Leeroopedia


Knowledge Sources
Domains Data_Analysis, Visualization, Real_Time_Analytics
Last Updated 2026-02-10 10:00 GMT

Overview

End-to-end process for visually exploring and analyzing Druid datasources using the web console's Explore view, which provides drag-and-drop visualization building with interactive charts, tables, and filters.

Description

This workflow covers the visual data exploration path in the Druid web console. The Explore view is a no-code/low-code analytics interface that generates SQL queries behind the scenes based on user interactions. It provides a module-based visualization system supporting bar charts, pie charts, time-series charts, multi-axis charts, grouping tables, and record tables. Users select a datasource, define filters, choose dimensions and measures via drag-and-drop, and view results as interactive visualizations. The generated SQL is visible and can be copied to the Workbench for further refinement.

Key capabilities:

  • Module-based visualization system (bar chart, pie chart, time chart, multi-axis chart, grouping table, record table)
  • Drag-and-drop column and measure management from the resource pane
  • Rich filter system (time-relative, time-interval, values, contains, regexp, number range)
  • Custom SQL expression columns and aggregate measures
  • Automatic SQL query generation with macro expansion
  • URL hash and localStorage-synced state for bookmarkable views
  • Timezone-aware time axis rendering
  • Nested/complex column exploration

Usage

Execute this workflow when you want to visually explore Druid data without writing SQL, build quick dashboards or charts for analysis, or prototype visualizations before building production dashboards. This is ideal for business analysts, data scientists, and anyone who prefers a point-and-click interface over raw SQL.

Execution Steps

Step 1: Datasource Selection

Select the target datasource or table to explore. The Source Pane queries INFORMATION_SCHEMA to list all available Druid datasources and system tables. On first load, the Explore view automatically selects the first available datasource and introspects its schema.

Key considerations:

  • Datasource metadata is fetched via SQL queries against INFORMATION_SCHEMA
  • Column metadata includes names, data types, and nested column paths
  • Custom source queries allow exploring derived tables or subqueries
  • The selected datasource determines available columns and measures
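The introspection described above can be sketched as a pair of SQL statements against Druid's INFORMATION_SCHEMA. The helper below is illustrative only (it is not console source code), but the schema and column names are standard Druid SQL metadata tables; the datasource name "wikipedia" is an example.

```python
# Sketch of the introspection queries issued when a datasource is selected.
# INFORMATION_SCHEMA.TABLES / COLUMNS are real Druid metadata tables;
# the helper function itself is hypothetical.

def introspection_queries(datasource: str) -> dict:
    """Build the SQL used to list datasources and describe one of them."""
    return {
        # All queryable tables, including system tables
        "tables": (
            "SELECT TABLE_SCHEMA, TABLE_NAME "
            "FROM INFORMATION_SCHEMA.TABLES"
        ),
        # Column names and data types for the selected datasource
        "columns": (
            "SELECT COLUMN_NAME, DATA_TYPE "
            "FROM INFORMATION_SCHEMA.COLUMNS "
            f"WHERE TABLE_NAME = '{datasource}'"
        ),
    }

queries = introspection_queries("wikipedia")
```

The results of the "columns" query drive the Resource Pane listing used in later steps.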

Step 2: Filter Configuration

Define data filters using the Filter Pane pill bar. Filters restrict the queried data before aggregation. The filter menu supports multiple filter types depending on column data type: time-relative ranges, absolute time intervals, specific value selection, string contains, regex matching, and numeric range bounds.

Key considerations:

  • Time filters are critical for performance since Druid partitions data by time
  • Multiple filters combine with AND logic by default
  • The MAX_DATA_TIME() macro resolves to the latest data timestamp for relative ranges
  • Filter pills display a human-readable summary and can be edited or removed inline
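One way to picture the pill bar is as a list of small filter objects compiled into a WHERE clause. The filter shapes and field names below are assumptions for illustration, not the console's internal representation; MAX_DATA_TIME() is the real Explore-view macro, left unexpanded here.

```python
# Hypothetical filter objects compiled to SQL predicates; pills combine
# with AND by default, as described above.

def filter_to_sql(f: dict) -> str:
    kind = f["type"]
    col = f'"{f["column"]}"'
    if kind == "values":
        vals = ", ".join(f"'{v}'" for v in f["values"])
        return f"{col} IN ({vals})"
    if kind == "contains":
        return f"{col} LIKE '%{f['value']}%'"
    if kind == "numberRange":
        return f"{col} >= {f['min']} AND {col} <= {f['max']}"
    if kind == "timeRelative":
        # The macro is expanded to real SQL later, at query-build time
        return f"{col} >= MAX_DATA_TIME() - INTERVAL {f['age']}"
    raise ValueError(f"unknown filter type: {kind}")

def where_clause(filters: list) -> str:
    # Multiple filter pills combine with AND logic
    return " AND ".join(f"({filter_to_sql(f)})" for f in filters)

clause = where_clause([
    {"type": "values", "column": "channel", "values": ["#en.wikipedia"]},
    {"type": "numberRange", "column": "delta", "min": 0, "max": 100},
])
```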

Step 3: Visualization Module Selection

Choose a visualization module from the Module Picker dropdown. Each module type defines its own parameter schema (required columns, measures, options). Available modules include time charts for temporal analysis, bar and pie charts for categorical breakdowns, grouping tables for pivot-style aggregation, and record tables for raw data inspection.

Key considerations:

  • Each module type has different parameter requirements (e.g., time charts need a time column)
  • Module state persists in localStorage for session continuity
  • Multiple modules can be configured in sequence by switching module types
  • The selected module determines the SQL query structure (GROUP BY, ORDER BY, LIMIT)
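The per-module parameter requirements can be modeled as a small schema registry. The module names and field names below are illustrative assumptions; the real console defines its own module registry.

```python
# Hypothetical per-module parameter schemas: each module type declares
# which inputs it needs before a query can be generated.

MODULE_SCHEMAS = {
    "time-chart":     {"requires": ["timeColumn", "measure"]},
    "bar-chart":      {"requires": ["dimension", "measure"]},
    "pie-chart":      {"requires": ["dimension", "measure"]},
    "grouping-table": {"requires": ["dimensions", "measures"]},
    "record-table":   {"requires": []},  # raw rows, no aggregation needed
}

def missing_params(module: str, params: dict) -> list:
    """Return required parameters the user has not yet supplied."""
    schema = MODULE_SCHEMAS[module]
    return [p for p in schema["requires"] if p not in params]

gaps = missing_params("time-chart", {"measure": "COUNT(*)"})
```

A UI built this way can prompt only for what the chosen module still lacks, which matches the behavior described above.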

Step 4: Dimension and Measure Configuration

Configure the dimensions (GROUP BY columns) and measures (aggregate expressions) for the selected visualization. The Resource Pane on the left lists all available columns with their types, supporting drag-and-drop into the Control Pane. Custom SQL expressions and aggregate measures can be defined via dialog editors.

Key considerations:

  • Dimensions are columns that define grouping (e.g., country, category, time floor)
  • Measures are aggregate functions (COUNT, SUM, AVG, MIN, MAX, APPROX_COUNT_DISTINCT)
  • Custom measures support arbitrary Druid SQL aggregate expressions
  • The AGGREGATE() macro allows referencing pre-defined measures in filter expressions
  • Named expressions support aliasing for readability

Step 5: Query Execution and Visualization

Execute the generated SQL query and render the results in the selected visualization module. The Explore view builds a SQL query from the configured filters, dimensions, and measures, expands any macros, and posts it to the Druid SQL API. Results are rendered as interactive charts or tables with tooltips, legends, and axis labels.

What happens:

  • The table-query builder constructs a SQL SELECT with GROUP BY, WHERE, ORDER BY, and LIMIT clauses
  • Query macros (AGGREGATE, MAX_DATA_TIME) are expanded into standard SQL
  • The query executes against the Druid SQL API endpoint for real-time results
  • Time charts support continuous rendering with D3-based SVG axes and area/line plots
  • Bar and pie charts use ECharts for interactive rendering with hover highlights

Step 6: Iterative Refinement

Refine the visualization by adjusting filters, changing dimensions or measures, switching module types, or modifying the source query. Each change triggers a new query execution and visualization update. The generated SQL is accessible for copying to the Workbench for further analysis.

Key considerations:

  • State is synced to the URL hash, making visualizations bookmarkable and shareable
  • The Source Query Pane shows the generated SQL for transparency and learning
  • Query log maintains a circular buffer of recent queries for debugging
  • Column interactions (click, right-click) in chart visualizations add filters dynamically
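The URL-hash state sync can be pictured as a serialize/deserialize round trip over the explore state. The encoding below (JSON plus percent-encoding) is a conceptual stand-in; the console's actual scheme differs, and the state fields shown are examples.

```python
# Conceptual round trip: explore state <-> URL hash fragment.
import json
import urllib.parse

def state_to_hash(state: dict) -> str:
    """Serialize state into a bookmarkable URL hash fragment."""
    return "#" + urllib.parse.quote(json.dumps(state, sort_keys=True))

def hash_to_state(h: str) -> dict:
    """Restore explore state from a shared or bookmarked URL."""
    return json.loads(urllib.parse.unquote(h.lstrip("#")))

state = {"table": "wikipedia", "module": "bar-chart",
         "filters": [{"type": "values", "column": "channel"}]}
restored = hash_to_state(state_to_hash(state))
```

Because every interaction writes state back to the hash (and to localStorage), reloading or sharing the URL reproduces the same visualization.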

Execution Diagram

GitHub URL

Workflow Repository