Principle:Datahub project Datahub Service Health Monitoring

Field	Value
Page Type	Principle
Workflow	Docker_Quickstart_Deployment
Principle Name	Service_Health_Monitoring
Repository	Datahub_project_Datahub
Implemented By	Implementation:Datahub_project_Datahub_Docker_Health_Check_Pattern
Last Updated	2026-02-09 17:00 GMT

Overview

Description

Service_Health_Monitoring is the principle of monitoring containerized service readiness through health check mechanisms. In a multi-service deployment like DataHub, each service must not only be running but must also be ready to accept requests before dependent services attempt to connect. This principle distinguishes between a container being alive (the process is running) and being ready (the service can handle traffic), and mandates that orchestration logic respect this distinction.

Usage

This principle applies throughout the lifecycle of a DataHub quickstart deployment:

During startup -- Docker Compose uses health checks to determine when infrastructure services (MySQL, Elasticsearch, Kafka) are ready before starting application services (GMS, Frontend).
During steady-state operation -- Health checks detect service degradation or failure and mark the affected container as unhealthy, giving operators and external tooling a signal to act on.
During troubleshooting -- Operators inspect health status via docker compose ps to identify which service in the dependency chain has failed.

Health checks are configured per-service in the Docker Compose specification and are evaluated at regular intervals by the Docker daemon.
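The troubleshooting step above can be sketched as a small helper that reads the JSON output of docker compose ps and lists the services that need attention. This is a minimal sketch, not part of DataHub: the function name unhealthy_services is hypothetical, and the field names Service, State, and Health are an assumption about the JSON shape, which may differ between Compose versions (some emit a JSON array, others one object per line).

```python
import json


def unhealthy_services(ps_json_text):
    """Return (service, state, health) for every service that needs attention.

    Assumes each row carries "Service", "State" and "Health" keys; this is an
    assumption about the `docker compose ps --format json` output shape.
    """
    text = ps_json_text.strip()
    if not text:
        return []
    if text.startswith("["):
        # Some Compose versions print a single JSON array.
        rows = json.loads(text)
    else:
        # Others print one JSON object per line.
        rows = [json.loads(line) for line in text.splitlines() if line.strip()]

    report = []
    for row in rows:
        # An empty Health value means the service defines no health check.
        health = row.get("Health") or "none"
        state = row.get("State")
        if state != "running" or health not in ("healthy", "none"):
            report.append((row.get("Service"), state, health))
    return report
```

To use it against a live deployment, the text would come from something like subprocess.run(["docker", "compose", "ps", "--all", "--format", "json"], capture_output=True, text=True).stdout. Walking the result in dependency order points at the first failed service in the chain.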

Theoretical Basis

Health Check Pattern in Container Orchestration

The health check pattern is a fundamental construct in container orchestration frameworks. It rests on two observations about the limitations of simple process monitoring:

Process liveness is insufficient -- A process can be running (PID exists, memory allocated) yet completely unresponsive. For example, a Java application may be running but stuck in a garbage collection pause, or a database may be running but not yet accepting connections because it is replaying its write-ahead log.
Application-level probes are required -- A health check must exercise the application's ability to do useful work. For a database, this means executing a simple query. For an HTTP service, this means receiving a 200 response from a health endpoint.
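An application-level probe for an HTTP service might look like the following. This is a minimal sketch using only the Python standard library; the names http_ready and wait_until_ready are hypothetical, and any URL passed in (for example a GMS health endpoint such as http://localhost:8080/health) is an assumption about the deployment rather than something this page specifies.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_ready(url, timeout=2.0):
    """Return True only if the service answers the probe URL with HTTP 200.

    A running process that refuses connections, times out, or answers with an
    error status is reported as not ready.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_until_ready(probe, retries=5, interval=1.0):
    """Call probe() up to `retries` times, sleeping `interval` seconds between tries."""
    for attempt in range(retries):
        if probe():
            return True
        if attempt < retries - 1:
            time.sleep(interval)
    return False
```

The probe exercises the request path rather than checking that a PID exists, which is the distinction drawn above. A database probe would follow the same shape but run a trivial query instead of an HTTP request.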

Readiness vs Liveness Probes

Container orchestration distinguishes between two types of probes:

Probe Type	Purpose	Failure Action	DataHub Example
Liveness	Detect whether the process is alive and not deadlocked	Restart the container	GMS process is running but unresponsive
Readiness	Detect whether the service is ready to handle traffic	Remove from load balancer / delay dependents	GMS is running but has not finished connecting to MySQL

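In Docker Compose (as opposed to Kubernetes), a single healthcheck directive combined with depends_on conditions covers both roles: it gates dependent service startup (readiness) and reports ongoing health (liveness). Unlike a Kubernetes liveness probe, a failing health check in standalone Docker Compose only marks the container as unhealthy; restart policies react when a container exits, so automatic recovery requires the process to exit on failure or an external supervisor to act on the unhealthy status.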
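How a sequence of probe results turns into a reported health status can be illustrated with a small simulation. This is a simplified sketch of the behaviour Docker documents for the retries and start_period health check options, not Docker's actual implementation: the function name health_status_timeline is hypothetical, and the start period is modelled as a number of probes rather than a duration.

```python
def health_status_timeline(probe_results, retries=3, start_period_probes=0):
    """Map ordered probe results (True = passed) to the status after each probe.

    Simplifying assumption: the start period is expressed as a probe count.
    """
    status = "starting"
    consecutive_failures = 0
    started = False  # becomes True on the first passing probe
    timeline = []
    for index, passed in enumerate(probe_results):
        if passed:
            consecutive_failures = 0
            started = True
            status = "healthy"
        else:
            # Failures inside the start period are ignored until a probe has passed.
            in_grace = index < start_period_probes and not started
            if not in_grace:
                consecutive_failures += 1
                if consecutive_failures >= retries:
                    status = "unhealthy"
        timeline.append(status)
    return timeline
```

For example, with retries=3 and a two-probe start period, the results [False, False, True, False, False, False] stay in starting for the first two probes, turn healthy on the third, and only become unhealthy after the third consecutive failure. Dependents gated on service_healthy wait through the starting phase.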

Dependency Ordering Through Health Gates

DataHub's service dependency graph requires strict ordering:

MySQL ──────────┐
Elasticsearch ──┼──> GMS ──> Frontend
Kafka ──────────┘

Each arrow represents a dependency that must be health-gated, not merely start-order-gated. The depends_on directive with condition: service_healthy ensures that GMS does not attempt to start until MySQL, Elasticsearch, and Kafka have all passed their respective health checks.

Without health gates, GMS might start before MySQL is ready, fail to establish a database connection, and crash. Restart policies could eventually resolve this, but the result is a noisy, slow, and unpredictable startup experience.
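The effect of health gating on startup order can be sketched with a topological sort over the dependency graph shown above. This is an illustrative model, not how Docker Compose is implemented: the lowercase service names, the startup_waves function, and the is_healthy callback are hypothetical, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Dependency graph from the diagram: service -> services it must wait for.
DEPENDS_ON = {
    "mysql": [],
    "elasticsearch": [],
    "kafka": [],
    "gms": ["mysql", "elasticsearch", "kafka"],
    "frontend": ["gms"],
}


def startup_waves(depends_on, is_healthy=lambda name: True):
    """Return the groups of services started together, in order.

    A service is released to its dependents only once is_healthy(name) is True,
    mirroring depends_on with condition: service_healthy.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        if not ready:
            # Everything left is blocked behind a dependency that never became healthy.
            break
        waves.append(ready)
        for name in ready:
            if is_healthy(name):
                sorter.done(name)
    return waves
```

With every service healthy, the waves are the three infrastructure services, then gms, then frontend. If mysql never reports healthy, only the first wave starts and gms is never launched, which is the behaviour the health gate is meant to guarantee.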

Related Pages

Implementation:Datahub_project_Datahub_Docker_Health_Check_Pattern -- The concrete health check definitions in the Docker Compose specification.
Principle:Datahub_project_Datahub_Docker_Prerequisites -- Pre-deployment validation that complements post-deployment health monitoring.
Principle:Datahub_project_Datahub_Quickstart_Launch -- The deployment process that relies on health checks for correct startup ordering.
