
Implementation:Datahub project Datahub Docker Health Check Pattern

From Leeroopedia


Field | Value
Page Type | Implementation (Pattern Doc)
Workflow | Docker_Quickstart_Deployment
Implementation Name | Docker_Health_Check_Pattern
Repository | Datahub_project_Datahub
Implements | Principle:Datahub_project_Datahub_Service_Health_Monitoring
Last Updated | 2026-02-09 17:00 GMT

Overview

Description

Docker_Health_Check_Pattern describes the health check configuration embedded in DataHub's Docker Compose quickstart files, combined with the environment variable setup performed by the CLI. This pattern ensures that each service in the DataHub stack reports its readiness accurately, and that dependent services only start after their dependencies are confirmed healthy.

The pattern has two components:

  1. Compose-level health checks -- Defined in the YAML compose specification, these instruct Docker to periodically probe each service's readiness.
  2. Environment variable configuration -- The _set_environment_variables function in the CLI sets variables that control service behavior, port mappings, and version selection.
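The dependency gating described above can be pictured as a topological ordering: a service starts only once everything it depends on is healthy. The sketch below is an illustration, not DataHub or Docker code. The `DEPENDS_ON` map is a hypothetical stand-in that mirrors the GMS dependencies listed in the compose excerpt on this page, and `startup_waves` is a name invented for this example. It relies on `graphlib`, which is in the standard library from Python 3.9.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each service maps to the set of services
# that must report healthy before it is started.
DEPENDS_ON = {
    "datahub-gms": {"mysql", "elasticsearch", "kafka"},
    "mysql": set(),
    "elasticsearch": set(),
    "kafka": set(),
}


def startup_waves(depends_on):
    """Group services into waves; a wave starts once the previous wave is healthy."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Services whose dependencies have all been marked done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        # Marking a wave done stands in for "all of these became healthy".
        sorter.done(*ready)
    return waves


print(startup_waves(DEPENDS_ON))
```

With this map, the infrastructure services form the first wave and GMS the second.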

Usage

Health checks are evaluated automatically by the Docker daemon during datahub docker quickstart. Operators can inspect current health status using:

# View health status of all quickstart containers
docker compose -f ~/.datahub/quickstart/docker-compose.yml ps

# View health check logs for a specific service
docker inspect --format='{{json .State.Health}}' datahub-gms

Code Reference

Source Location

File | Lines | Description
docker/quickstart/docker-compose.quickstart-profile.yml | L110-165 | Health check definitions for GMS, Frontend, and infrastructure services
metadata-ingestion/src/datahub/cli/docker_cli.py | L176-207 | _set_environment_variables function setting runtime configuration

Signature

from typing import Optional


def _set_environment_variables(
    version: Optional[str] = None,
    mysql_port: Optional[int] = None,
    kafka_broker_port: Optional[int] = None,
    elastic_port: Optional[int] = None,
) -> None:
    """Set environment variables consumed by the Docker Compose specification."""
    ...
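The function body is elided above. As a rough illustration of what a function with this signature might do, the sketch below copies each supplied argument into an environment mapping and leaves unset arguments alone. This is an assumption about the behavior, not the actual DataHub implementation. The function name `sketch_set_environment_variables`, the extra `env` parameter, and the three port variable names are hypothetical; only `DATAHUB_VERSION` appears in this page.

```python
import os
from typing import MutableMapping, Optional


def sketch_set_environment_variables(
    version: Optional[str] = None,
    mysql_port: Optional[int] = None,
    kafka_broker_port: Optional[int] = None,
    elastic_port: Optional[int] = None,
    env: Optional[MutableMapping[str, str]] = None,
) -> None:
    """Write every non-None argument into `env` (defaults to os.environ)."""
    target = os.environ if env is None else env
    candidates = {
        "DATAHUB_VERSION": version,
        # The three names below are assumed for illustration only.
        "DATAHUB_MAPPED_MYSQL_PORT": mysql_port,
        "DATAHUB_MAPPED_KAFKA_BROKER_PORT": kafka_broker_port,
        "DATAHUB_MAPPED_ELASTIC_PORT": elastic_port,
    }
    for name, value in candidates.items():
        if value is not None:
            # Environment values are always strings.
            target[name] = str(value)


# Usage: collect the variables in a plain dict instead of the real environment.
collected = {}
sketch_set_environment_variables(version="v1.0.0", mysql_port=3307, env=collected)
print(collected)
```

Passing a plain dict keeps the example side-effect free; a Compose file would then read such variables through `${NAME}` substitution.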

Import

This is an internal function; it is not intended for direct import. It is called automatically by the quickstart command.

I/O Contract

Key Environment Variables

Variable | Default Value | Description
DATAHUB_VERSION | Auto-detected from CLI package | Docker image tag for all DataHub services
DATAHUB_MAPPED_GMS_PORT | 8080 | Host port mapped to the GMS container's HTTP port
DATAHUB_MAPPED_FRONTEND_PORT | 9002 | Host port mapped to the Frontend container's HTTP port
METADATA_SERVICE_AUTH_ENABLED | false | Whether GMS requires authentication tokens for API calls

Service Port Mappings

Service | Container Port | Default Host Port | Health Check Method
GMS | 8080 | 8080 | HTTP GET to /health endpoint
Frontend | 9002 | 9002 | HTTP GET to / endpoint
MySQL | 3306 | Not exposed (internal) | mysqladmin ping command
Elasticsearch | 9200 | Not exposed (internal) | HTTP GET to /_cluster/health
Kafka | 9092 | Not exposed (internal) | Broker metadata request via kafka-broker-api-versions

Health Check Configuration Pattern

Each service in the compose file uses a health check block following this structure:

services:
  datahub-gms:
    # ...
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    depends_on:
      mysql:
        condition: service_healthy
      elasticsearch:
        condition: service_healthy
      kafka:
        condition: service_healthy
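To make the interplay of retries and start_period concrete, here is a simplified model of how Docker derives a container's health status from a series of probe results. It is a sketch of the documented semantics, not Docker's code: failures inside the start period are ignored until the first success, and `retries` consecutive counted failures mark the container unhealthy. The names `HealthCheck` and `final_status` are invented for this example, and the probe timings are supplied by the caller rather than scheduled from `interval` and `timeout`.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class HealthCheck:
    # Defaults mirror the GMS block above. interval and timeout are kept for
    # reference only; this model does not schedule probes itself.
    interval: float = 10.0
    timeout: float = 5.0
    retries: int = 5
    start_period: float = 30.0


def final_status(check: HealthCheck, probes: Iterable[Tuple[float, bool]]) -> str:
    """Return the status after replaying (seconds_since_start, succeeded) probes."""
    status = "starting"
    failing_streak = 0
    for elapsed, succeeded in probes:
        if succeeded:
            status = "healthy"
            failing_streak = 0
            continue
        # Failures during the start period are ignored, but only while the
        # container has never passed a probe.
        if status == "starting" and elapsed < check.start_period:
            continue
        failing_streak += 1
        if failing_streak >= check.retries:
            status = "unhealthy"
    return status


check = HealthCheck()
slow_boot = [(10, False), (20, False), (30, True)]
never_up = [(t, False) for t in range(10, 80, 10)]
print(final_status(check, slow_boot), final_status(check, never_up))
```

Under these settings a service that never responds is reported unhealthy only after five failures counted from the end of the 30-second start period, which is why a slow first boot does not immediately block dependents.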

Usage Examples

Inspecting Service Health

# Check which services are healthy after quickstart
docker compose -f ~/.datahub/quickstart/docker-compose.yml ps

# Example output:
# NAME                    STATUS                PORTS
# datahub-gms             running (healthy)     0.0.0.0:8080->8080/tcp
# datahub-frontend-react  running (healthy)     0.0.0.0:9002->9002/tcp
# mysql                   running (healthy)     3306/tcp
# elasticsearch           running (healthy)     9200/tcp
# kafka                   running (healthy)     9092/tcp

Debugging an Unhealthy Service

# View detailed health check history for GMS
docker inspect --format='{{json .State.Health}}' datahub-gms | python3 -m json.tool

# View container logs for a failing service
docker compose -f ~/.datahub/quickstart/docker-compose.yml logs datahub-gms --tail=50
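The JSON printed by the docker inspect command above can also be condensed programmatically. The sketch below assumes the usual shape of Docker's `.State.Health` object (`Status`, `FailingStreak`, and a `Log` list whose entries carry `ExitCode` and `Output`). The helper name `summarize_health` and the sample payload are hypothetical.

```python
import json
from typing import Any, Dict


def summarize_health(raw: str) -> Dict[str, Any]:
    """Condense the JSON emitted for a container's .State.Health object."""
    health = json.loads(raw)
    if not health:
        # Docker prints `null` when the container defines no health check.
        return {"status": "none", "failing_streak": 0,
                "last_exit_code": None, "last_output": ""}
    log = health.get("Log") or []
    last = log[-1] if log else {}
    return {
        "status": health.get("Status", "unknown"),
        "failing_streak": health.get("FailingStreak", 0),
        "last_exit_code": last.get("ExitCode"),
        "last_output": (last.get("Output") or "").strip(),
    }


# Hypothetical payload, shaped like real docker inspect output.
SAMPLE = json.dumps({
    "Status": "unhealthy",
    "FailingStreak": 3,
    "Log": [{"ExitCode": 1, "Output": "connection refused\n"}],
})
print(summarize_health(SAMPLE))
```

The last log entry is usually the most useful field when debugging, since it holds the output of the most recent probe command.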

Custom Environment Variable Override

# Override the GMS port mapping before launching quickstart
export DATAHUB_MAPPED_GMS_PORT=8081
datahub docker quickstart
