Principle:Apache Druid Datasource Introspection
| Knowledge Sources | |
|---|---|
| Domains | Visual_Exploration, Schema_Introspection |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
A schema introspection principle that discovers available columns and auto-detects suitable measures for a selected datasource in visual exploration.
Description
Datasource Introspection is the entry point for the visual data exploration workflow. When a user selects a datasource (or subquery), the system runs a LIMIT 0 introspection query to discover:
- Available columns: Names, SQL types, and native types for all columns in the datasource
- Auto-detected measures: Based on column types, the system proposes suitable aggregation functions: COUNT as a general-purpose row count, SUM for numeric columns, APPROX_COUNT_DISTINCT for high-cardinality string columns, and APPROX_QUANTILE_DS for complex sketch types
- Base columns: Original columns before any GROUP BY transformations
The introspection result is used to build a QuerySource object that drives all subsequent exploration: which columns can serve as dimensions, which as measures, and which filter options are available.
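The type-to-aggregation mapping described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: `ColumnInfo`, `MeasureSuggestion`, `autoDetectMeasures`, and the numeric type list are hypothetical names and values, and the 0.5 quantile is an arbitrary choice. Because a LIMIT 0 query carries no cardinality information, the sketch proposes APPROX_COUNT_DISTINCT for every string column rather than only high-cardinality ones.

```typescript
// Hypothetical column descriptor; the field names are illustrative only.
interface ColumnInfo {
  name: string;
  sqlType: string; // e.g. "BIGINT", "VARCHAR", "DOUBLE"
  nativeType: string; // e.g. "LONG", "STRING", "COMPLEX<quantilesDoublesSketch>"
}

// Hypothetical shape for a proposed measure.
interface MeasureSuggestion {
  name: string;
  expression: string;
}

// Assumed list of SQL types treated as numeric for SUM.
const NUMERIC_SQL_TYPES = new Set(['BIGINT', 'INTEGER', 'FLOAT', 'DOUBLE', 'DECIMAL']);

// Quote a column name as a SQL identifier, escaping embedded double quotes.
function quoteIdentifier(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

function autoDetectMeasures(columns: ColumnInfo[]): MeasureSuggestion[] {
  // COUNT(*) applies to any source, so it is always proposed first.
  const measures: MeasureSuggestion[] = [{ name: 'Count', expression: 'COUNT(*)' }];

  for (const column of columns) {
    const ref = quoteIdentifier(column.name);
    const sqlType = column.sqlType.toUpperCase();

    if (/^COMPLEX<quantilesDoublesSketch>$/i.test(column.nativeType)) {
      // Quantiles sketch column: propose an approximate median (0.5 is arbitrary).
      measures.push({
        name: `Median ${column.name}`,
        expression: `APPROX_QUANTILE_DS(${ref}, 0.5)`,
      });
    } else if (NUMERIC_SQL_TYPES.has(sqlType)) {
      measures.push({ name: `Sum ${column.name}`, expression: `SUM(${ref})` });
    } else if (sqlType === 'VARCHAR') {
      measures.push({
        name: `Unique ${column.name}`,
        expression: `APPROX_COUNT_DISTINCT(${ref})`,
      });
    }
    // Other types (for example TIMESTAMP) get no proposed measure in this sketch.
  }

  return measures;
}
```

Checking the native type before the SQL type lets sketch columns be recognized even when their SQL type is reported generically.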
Usage
Use this principle at the start of any visual exploration session. It runs automatically when the user navigates to the Explore view or selects a different datasource.
Theoretical Basis
Datasource introspection follows a low-cost metadata discovery pattern: the query is planned and returns a column signature, but no rows are requested.
Introspection query:
SELECT * FROM (user_source) LIMIT 0
→ Returns column metadata without scanning data
QuerySource construction:
columns = result.columns.filter(validTypes)
measures = Measure.extractQueryMeasures(query)
|| autoDetect(columns) // SUM, COUNT, APPROX_COUNT_DISTINCT based on type
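The pseudocode above could be fleshed out roughly as follows. This is a sketch under assumptions rather than the actual code: `IntrospectedColumn`, `SourceMeasure`, `QuerySourceSketch`, `buildIntrospectionQuery`, `buildQuerySource`, and the `VALID_SQL_TYPES` set are hypothetical, and the source is assumed to be a complete SELECT statement. The fallback is written as an explicit length check because an empty array is truthy in TypeScript, so a literal `||` would never reach the auto-detection branch.

```typescript
// Hypothetical column metadata as returned by the introspection query.
interface IntrospectedColumn {
  name: string;
  sqlType: string;
}

// Hypothetical measure shape.
interface SourceMeasure {
  name: string;
  expression: string;
}

// Hypothetical stand-in for the QuerySource object described in the text.
interface QuerySourceSketch {
  introspectionQuery: string;
  columns: IntrospectedColumn[];
  measures: SourceMeasure[];
}

// Assumed set of column types usable in exploration; the real set is not given in the source.
const VALID_SQL_TYPES = new Set(['TIMESTAMP', 'VARCHAR', 'BIGINT', 'FLOAT', 'DOUBLE', 'OTHER']);

// Wrap the user's source query so that only column metadata is requested.
function buildIntrospectionQuery(sourceQuery: string): string {
  // Strip trailing semicolons and whitespace so the query can be nested.
  const inner = sourceQuery.replace(/[;\s]+$/, '').trim();
  return `SELECT * FROM (${inner}) LIMIT 0`;
}

function buildQuerySource(
  sourceQuery: string,
  resultColumns: IntrospectedColumn[],
  declaredMeasures: SourceMeasure[],
  autoDetect: (columns: IntrospectedColumn[]) => SourceMeasure[],
): QuerySourceSketch {
  // Keep only columns whose type the exploration view can work with.
  const columns = resultColumns.filter(c => VALID_SQL_TYPES.has(c.sqlType.toUpperCase()));

  // Prefer measures declared in the query; otherwise fall back to auto-detection.
  const measures = declaredMeasures.length > 0 ? declaredMeasures : autoDetect(columns);

  return {
    introspectionQuery: buildIntrospectionQuery(sourceQuery),
    columns,
    measures,
  };
}
```

Passing `autoDetect` in as a function keeps the type-based heuristics separate from the construction step, so either can change without touching the other.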