Principle:Apache Druid Timestamp Configuration
| Knowledge Sources | |
|---|---|
| Domains | Data_Ingestion, Time_Series |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
A time-extraction principle that designates the primary timestamp column (__time) of each ingested row, which Druid uses for time-based partitioning and querying.
Description
Timestamp Configuration is a critical step in Druid data ingestion because Druid is fundamentally a time-series database. Every row stored in Druid must have a __time column that determines how data is partitioned into time-based segments and how time-based queries are optimized.
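To make the partitioning role of __time concrete, here is a minimal sketch that groups rows into day-sized time chunks by flooring each row's __time. It assumes a DAY segment granularity and __time values in epoch milliseconds. The names bucket_by_day and chunk_label and the sample rows are hypothetical, and Druid's actual segment allocation is considerably more involved than this.

```python
from collections import defaultdict
from datetime import datetime, timezone

MILLIS_PER_DAY = 24 * 60 * 60 * 1000


def bucket_by_day(rows):
    """Group rows by the start of the UTC day that contains their __time."""
    buckets = defaultdict(list)
    for row in rows:
        # Floor the epoch-millis timestamp to the start of its day.
        chunk_start = row["__time"] - row["__time"] % MILLIS_PER_DAY
        buckets[chunk_start].append(row)
    return dict(buckets)


def chunk_label(chunk_start_millis):
    """Render a chunk start (epoch millis) as a UTC date string."""
    start = datetime.fromtimestamp(chunk_start_millis / 1000, tz=timezone.utc)
    return start.strftime("%Y-%m-%d")


# Hypothetical sample rows, already carrying a parsed __time.
rows = [
    {"__time": 1704067200000, "page": "a"},  # 2024-01-01T00:00:00Z
    {"__time": 1704070800000, "page": "b"},  # 2024-01-01T01:00:00Z
    {"__time": 1704153600000, "page": "c"},  # 2024-01-02T00:00:00Z
]
chunks = bucket_by_day(rows)
for start in sorted(chunks):
    print(chunk_label(start), len(chunks[start]))
```

A query restricted to a single day then only needs to read the chunk for that day, which is the intuition behind time-based query optimization.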
The timestamp configuration supports three modes:
- Column mode: Parse a timestamp from an existing data column using a format pattern (e.g., ISO 8601, epoch millis, custom format)
- Expression mode: Derive the timestamp from a Druid expression applied to one or more columns
- None mode: Use a constant timestamp (all rows assigned the same time value)
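The three modes can be sketched as ingestion-spec fragments, built here as Python dictionaries and printed as JSON. This is an illustration under assumptions rather than the exact output of any tool: the column names (ts, date, time, no_such_column), the expression, and the constant value are hypothetical, and the expression and constant cases assume that a transform named __time supplies the value while timestampSpec acts as a placeholder.

```python
import json

# Column mode: parse __time from an existing column using a named format.
column_mode = {
    "timestampSpec": {"column": "ts", "format": "iso"},
}

# Expression mode (assumed shape): a transform named __time computes the
# timestamp, and timestampSpec points at a placeholder column with a fallback.
expression_mode = {
    "timestampSpec": {
        "column": "no_such_column",  # hypothetical placeholder column
        "missingValue": "2010-01-01T00:00:00Z",
    },
    "transformSpec": {
        "transforms": [
            {
                "type": "expression",
                "name": "__time",
                # Hypothetical expression joining separate date and time columns.
                "expression": "timestamp_parse(concat(\"date\", 'T', \"time\"))",
            }
        ]
    },
}

# None mode: no usable column, so every row receives the same constant value.
none_mode = {
    "timestampSpec": {
        "column": "no_such_column",  # hypothetical placeholder column
        "missingValue": "2010-01-01T00:00:00Z",
    },
}

for name, fragment in [
    ("column", column_mode),
    ("expression", expression_mode),
    ("none", none_mode),
]:
    print(name, json.dumps(fragment, indent=2))
```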
Usage
Use this principle after data parsing, once structured columns are available. Timestamp configuration is mandatory for all Druid ingestion: without a valid __time column, rows cannot be stored in Druid segments.
Theoretical Basis
Timestamp extraction follows a column-to-time mapping pattern:
TimestampSpec = { column: string, format: string, missingValue?: string }
Supported formats:
'iso' → ISO 8601 (2024-01-01T00:00:00Z)
'millis' → Unix epoch milliseconds
'posix' → Unix epoch seconds
'micro' → Unix epoch microseconds
'nano' → Unix epoch nanoseconds
'auto' → Auto-detect (ISO 8601 or epoch milliseconds)
custom pattern → Joda-Time DateTimeFormat pattern (e.g., yyyy-MM-dd HH:mm:ss)
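The meaning of the named formats can be illustrated with a small Python sketch that converts a raw value to a UTC datetime. This mirrors the semantics listed above and is not Druid's implementation. The function name parse_timestamp is hypothetical, 'auto' is approximated as "all digits means millis, otherwise ISO", nanoseconds are truncated to microseconds because Python datetimes carry no finer precision, and custom patterns are deliberately not handled.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Microseconds per unit for the integer epoch formats (nano handled separately).
MICROS_PER_UNIT = {"posix": 1_000_000, "millis": 1_000, "micro": 1}


def parse_timestamp(value, fmt="auto"):
    """Convert a raw value to a UTC datetime according to a named format."""
    text = str(value).strip()
    if fmt == "auto":
        # Simplified detection: digits are treated as epoch millis, else ISO.
        fmt = "millis" if text.lstrip("-").isdigit() else "iso"
    if fmt == "iso":
        # Older Python versions reject a trailing 'Z', so normalize it first.
        parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    if fmt == "nano":
        micros = int(text) // 1000  # truncate to microsecond precision
    elif fmt in MICROS_PER_UNIT:
        micros = int(text) * MICROS_PER_UNIT[fmt]
    else:
        raise ValueError(f"format not handled by this sketch: {fmt!r}")
    return EPOCH + timedelta(microseconds=micros)


# The same instant expressed in each named format.
samples = {
    "iso": "2024-01-01T00:00:00Z",
    "posix": 1704067200,
    "millis": 1704067200000,
    "micro": 1704067200000000,
    "nano": 1704067200000000000,
}
for fmt, raw in samples.items():
    print(fmt, parse_timestamp(raw, fmt).isoformat())
```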
The sampler validates the timestamp extraction by parsing the data columns and the __time value in parallel, then merging the two results so the user can verify them side by side.
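The parallel parse-and-merge step might look roughly like the sketch below. The functions sample_columns and sample_time are hypothetical stand-ins for the two sampler requests (they work on in-memory rows and only understand ISO 8601 strings), so this shows the shape of the flow and not the sampler's real API.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime


def sample_columns(raw_rows):
    """Hypothetical stand-in for the request that parses the data columns."""
    return [dict(row) for row in raw_rows]


def sample_time(raw_rows, column):
    """Hypothetical stand-in for the request that extracts __time (ISO only)."""
    times = []
    for row in raw_rows:
        try:
            parsed = datetime.fromisoformat(row[column].replace("Z", "+00:00"))
            times.append(int(parsed.timestamp() * 1000))
        except (KeyError, ValueError):
            times.append(None)  # surface unparseable rows to the user
    return times


def sample_with_time(raw_rows, column):
    """Run both parses concurrently, then merge them row by row."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        columns_future = pool.submit(sample_columns, raw_rows)
        time_future = pool.submit(sample_time, raw_rows, column)
        parsed_rows = columns_future.result()
        times = time_future.result()
    return [{"__time": t, **row} for t, row in zip(times, parsed_rows)]


raw = [
    {"ts": "2024-01-01T00:00:00Z", "page": "a"},
    {"ts": "not a time", "page": "b"},
]
preview = sample_with_time(raw, "ts")
for row in preview:
    print(row)
```

Keeping the original columns next to the derived __time lets a reviewer spot rows whose timestamp failed to parse, which appear here with a __time of None.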