Implementation:Apache Druid SchemaStep

Knowledge Sources	Apache Druid, Druid SQL Ingestion
Domains	Data_Ingestion, SQL_Ingestion, Schema_Design
Last Updated	2026-02-10 00:00 GMT

Overview

Concrete React component for interactive SQL query building with column editing, partitioning, and clustering configuration.

Description

The SchemaStep component (defined in the 1083-line file schema-step.tsx, where it spans lines 265 to 1083) is the main interactive step of the SQL data loader. It parses the current SQL query string, displays a column editor grid, and provides controls for PARTITIONED BY, CLUSTERED BY, rollup, and query preview. Column edits (add, remove, rename, cast, apply expression) are written back into the SQL query string via ingestQueryPatternToQuery().

The component uses postToSampler() for lightweight previews and submitTaskQuery() for previews that execute the full query against larger datasets.
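
The write-back of column edits into the query string can be pictured with a small, framework-free sketch. Everything here is a hypothetical simplification: IngestPatternSketch, patternToQuerySketch(), and castColumnSketch() are illustrative names and do not reproduce the web console's actual pattern type or its ingestQueryPatternToQuery() function. The sketch only shows the general idea of treating each edit as a transformation of a parsed pattern and then regenerating the SQL text.

```typescript
// Hypothetical, simplified stand-ins; NOT the web console's real types.
interface ColumnSketch {
  expression: string; // SQL expression, e.g. '"user_id"'
  outputName: string; // column name in the destination table
}

interface IngestPatternSketch {
  destinationTable: string;
  sourceSql: string; // FROM clause body, kept opaque in this sketch
  columns: ColumnSketch[];
  partitionedBy: string; // e.g. 'DAY', 'HOUR', 'ALL'
  clusteredBy: string[]; // output column names
}

// Wrap an identifier in double quotes, doubling any embedded quote.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Regenerate the SQL text from the pattern (assumed REPLACE ... OVERWRITE ALL shape).
function patternToQuerySketch(p: IngestPatternSketch): string {
  const selectLines = p.columns.map(c => {
    const quoted = quoteIdent(c.outputName);
    // Omit the alias when the expression is already the bare column.
    return c.expression === quoted ? `  ${quoted}` : `  ${c.expression} AS ${quoted}`;
  });
  const lines = [
    `REPLACE INTO ${quoteIdent(p.destinationTable)} OVERWRITE ALL`,
    'SELECT',
    selectLines.join(',\n'),
    `FROM ${p.sourceSql}`,
    `PARTITIONED BY ${p.partitionedBy}`,
  ];
  if (p.clusteredBy.length > 0) {
    lines.push(`CLUSTERED BY ${p.clusteredBy.map(quoteIdent).join(', ')}`);
  }
  return lines.join('\n');
}

// A "cast" edit returns a new pattern; the input pattern is left untouched.
function castColumnSketch(
  p: IngestPatternSketch,
  outputName: string,
  sqlType: string,
): IngestPatternSketch {
  return {
    ...p,
    columns: p.columns.map(c =>
      c.outputName === outputName
        ? { ...c, expression: `CAST(${c.expression} AS ${sqlType})` }
        : c,
    ),
  };
}
```

In this model, a cast on the "value" column followed by patternToQuerySketch() yields a SELECT line of the form CAST("value" AS DOUBLE) AS "value", which is the shape shown in the generated SQL example on this page.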

Usage

Render this component as the main step in the SQL data loader wizard. It receives the SQL query string and updates it through the onQueryStringChange callback.

Code Reference

Source Location

Repository: Apache Druid
File: web-console/src/views/sql-data-loader-view/schema-step/schema-step.tsx
Lines: L265-L1083

Signature

interface SchemaStepProps {
  queryString: string;
  onQueryStringChange(queryString: string): void;
  enableAnalyze: boolean;
  onDone(): void;
}

export const SchemaStep = React.memo(function SchemaStep(
  props: SchemaStepProps,
): JSX.Element {
  // Component body (L265-L1083 of schema-step.tsx): column editor, preview, and SQL generation
});

Import

import { SchemaStep } from './schema-step/schema-step';

I/O Contract

Inputs

Name	Type	Required	Description
queryString	string	Yes	Current INSERT/REPLACE SQL query string
onQueryStringChange	callback	Yes	Called when the SQL query is modified by user interactions
enableAnalyze	boolean	Yes	Whether the rollup analysis feature is available
onDone	callback	Yes	Called when the user finishes schema configuration

Outputs

Name	Type	Description
queryString	string	Modified SQL query string with updated columns, types, partitioning, and clustering, delivered through onQueryStringChange
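
The contract above amounts to a controlled component: the parent owns the query string, passes it down, and receives every modification through onQueryStringChange. The following is a minimal, framework-free sketch of that wiring. SchemaStepProps is copied from the signature on this page; LoaderStateSketch is a hypothetical stand-in for the parent's state (in the real wizard this would presumably be React state), and the enableAnalyze value is an arbitrary choice for illustration.

```typescript
// Props as listed in the signature on this page.
interface SchemaStepProps {
  queryString: string;
  onQueryStringChange(queryString: string): void;
  enableAnalyze: boolean;
  onDone(): void;
}

// Hypothetical parent-side state holder; not part of the Druid web console.
class LoaderStateSketch {
  queryString: string;
  done = false;

  constructor(initialQuery: string) {
    this.queryString = initialQuery;
  }

  // Build the props the parent would hand to SchemaStep on each render.
  propsForSchemaStep(): SchemaStepProps {
    return {
      queryString: this.queryString,
      // Every edit inside the step comes back here as a full query string.
      onQueryStringChange: q => {
        this.queryString = q;
      },
      enableAnalyze: false, // assumption: rollup analysis disabled in this sketch
      onDone: () => {
        this.done = true;
      },
    };
  }
}
```

The step never mutates the string it was given; it reports a new string, and the parent decides what to store. The next set of props then carries the updated query.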

Usage Examples

Generated SQL Output

-- Example SQL generated by SchemaStep:
REPLACE INTO "my_events" OVERWRITE ALL
SELECT
  TIME_PARSE("timestamp") AS "__time",
  "user_id",
  "event_type",
  CAST("value" AS DOUBLE) AS "value"
FROM TABLE(
  EXTERN(
    '{"type":"s3","uris":["s3://bucket/events.json"]}',
    '{"type":"json"}',
    '[{"name":"timestamp","type":"VARCHAR"},{"name":"user_id","type":"VARCHAR"},{"name":"event_type","type":"VARCHAR"},{"name":"value","type":"VARCHAR"}]'
  )
)
PARTITIONED BY DAY
CLUSTERED BY "user_id"
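
The three EXTERN arguments in the query above are JSON documents embedded as SQL string literals: the input source, the input format, and the column signature. A minimal sketch of assembling that fragment follows. buildExternSketch() and sqlStringLiteral() are hypothetical helpers written for this page and are not taken from the web console; the layout simply mirrors the example above.

```typescript
// One entry of the EXTERN signature, as in the example above.
interface SignatureColumn {
  name: string;
  type: string;
}

// Wrap text in single quotes, doubling embedded single quotes (standard SQL escaping).
function sqlStringLiteral(text: string): string {
  return "'" + text.replace(/'/g, "''") + "'";
}

// Hypothetical helper: render TABLE(EXTERN(...)) from plain objects.
function buildExternSketch(
  inputSource: object,
  inputFormat: object,
  signature: SignatureColumn[],
): string {
  const args = [inputSource, inputFormat, signature].map(
    value => '    ' + sqlStringLiteral(JSON.stringify(value)),
  );
  return ['TABLE(', '  EXTERN(', args.join(',\n'), '  )', ')'].join('\n');
}
```

Called with { type: 's3', uris: ['s3://bucket/events.json'] }, { type: 'json' }, and the four VARCHAR columns from the example, this produces the same TABLE(EXTERN(...)) text that follows FROM in the query above.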

Related Pages

Implements Principle

Principle:Apache_Druid_SQL_Schema_Configuration

Requires Environment
