Principle:Apache Druid Supervisor Health Monitoring
| Knowledge Sources | |
|---|---|
| Domains | Streaming_Ingestion, Monitoring |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
An observability principle for streaming ingestion: the health, lag, and throughput of each supervisor are monitored through its status and statistics endpoints.
Description
Supervisor Health Monitoring provides real-time visibility into streaming supervisor performance:
- Status endpoint: Reports supervisor state (RUNNING, SUSPENDED, PENDING), healthy flag, consumer lag per partition, and latest offsets
- Statistics endpoint: Reports per-task throughput metrics (rows processed, bytes processed, processing rate) with 1.5-second refresh
- Spec history: Shows version history of supervisor spec changes
The Supervisors view displays all supervisors in a table with color-coded health indicators and provides drill-down dialogs for detailed metrics.
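As a rough illustration of how a status response could be reduced to one table row with a health color, consider the sketch below. The field names `state`, `healthy`, and `minimumLag` come from the status endpoint described here; the function name `summarize_status`, the color scheme, the optional `payload` wrapper, and the assumption that `minimumLag` maps partition to lag are illustrative choices, not the console's actual logic.

```python
from typing import Any, Dict


def summarize_status(status: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a supervisor status response to the fields a table row needs."""
    # Assumption: the fields may sit at the top level or under a "payload" key.
    body = status.get("payload", status)
    state = body.get("state", "UNKNOWN")
    healthy = bool(body.get("healthy", False))

    # Assumption: minimumLag is a mapping of partition -> lag (a number).
    lag_by_partition = body.get("minimumLag") or {}
    total_lag = sum(
        v for v in lag_by_partition.values() if isinstance(v, (int, float))
    )

    # Hypothetical color scheme for the health indicator.
    if state == "SUSPENDED":
        color = "grey"
    elif not healthy:
        color = "red"
    elif state == "RUNNING":
        color = "green"
    else:
        color = "yellow"

    return {
        "state": state,
        "healthy": healthy,
        "total_lag": total_lag,
        "color": color,
    }
```

Checking `SUSPENDED` before the healthy flag is a design choice in this sketch: a deliberately paused supervisor is shown as inactive rather than as failing.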
Usage
Use this principle for ongoing monitoring of streaming ingestion. The Supervisors view refreshes automatically to show current status and lag.
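Outside the Supervisors view, the same automatic refresh can be approximated with a small polling helper. This is a minimal sketch, not the view's implementation: `poll`, its parameters, and the injectable `sleep` are hypothetical names, and the interval is whatever the caller chooses.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def poll(
    fetch: Callable[[], T],
    interval_s: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield the result of fetch() up to max_polls times, pausing between calls."""
    for i in range(max_polls):
        yield fetch()
        # Do not sleep after the final poll.
        if i < max_polls - 1:
            sleep(interval_s)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in normal use the default `time.sleep` applies.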
Theoretical Basis
Supervisor monitoring follows a multi-endpoint aggregation pattern:
Supervisor list:
    SQL:  SELECT * FROM sys.supervisors
    REST: GET /druid/indexer/v1/supervisor?full
Per-supervisor status:
    GET /druid/indexer/v1/supervisor/{id}/status
    → { state, detailedState, healthy, latestOffsets, minimumLag }
Per-supervisor stats:
    GET /druid/indexer/v1/supervisor/{id}/stats
    → { taskId → { rows/s, bytes/s, totalRows, lag } }
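The multi-endpoint aggregation pattern above can be sketched as one list call followed by a status and a stats call per supervisor. The endpoint paths are the ones listed here; the helper names (`supervisor_urls`, `collect_supervisors`, `http_fetch_json`), the base URL, and the assumption that each entry in the list response carries an `id` field are illustrative. The fetch function is injected so the aggregation can be exercised without a running cluster.

```python
import json
from typing import Any, Callable, Dict, List
from urllib.parse import quote
from urllib.request import urlopen

SUPERVISOR_PATH = "/druid/indexer/v1/supervisor"


def supervisor_urls(base_url: str, supervisor_id: str) -> Dict[str, str]:
    """Build the status and stats URLs for one supervisor."""
    root = base_url.rstrip("/") + SUPERVISOR_PATH
    sid = quote(supervisor_id, safe="")  # escape ids containing '/' or spaces
    return {"status": f"{root}/{sid}/status", "stats": f"{root}/{sid}/stats"}


def collect_supervisors(
    base_url: str, fetch_json: Callable[[str], Any]
) -> List[Dict[str, Any]]:
    """Fetch the supervisor list, then the status and stats of each entry."""
    listing = fetch_json(base_url.rstrip("/") + SUPERVISOR_PATH + "?full")
    rows = []
    for entry in listing:
        sid = entry["id"]  # assumption: each list entry exposes an "id"
        urls = supervisor_urls(base_url, sid)
        rows.append(
            {
                "id": sid,
                "status": fetch_json(urls["status"]),
                "stats": fetch_json(urls["stats"]),
            }
        )
    return rows


def http_fetch_json(url: str) -> Any:
    """Plain HTTP GET returning parsed JSON (no auth or retries in this sketch)."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

Against a live cluster this might be called as `collect_supervisors("http://localhost:8888", http_fetch_json)`, where the host and port are placeholders for the deployment's own address.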