Principle:Datahub project Datahub Deployment Verification
| Field | Value |
|---|---|
| Principle Name | Deployment Verification |
| Namespace | Datahub_project_Datahub |
| Workflow | Docker_Quickstart_Deployment |
| Type | Principle |
| Last Updated | 2026-02-10 |
| Source Repository | datahub-project/datahub |
| Domains | Deployment, Docker, Metadata_Management |
Overview
Deployment verification is the process of confirming that all DataHub containers are healthy and the web UI is accessible after deployment. It inspects Docker container status using compose project labels, categorizes each container's health, and reports overall stack health.
Description
Deployment verification is critical because container orchestration involves asynchronous startup of interdependent services. A container being "running" does not mean the application inside it is ready to serve requests. DataHub's verification system therefore looks at each container's run state, exit code, and Docker health check result, and also detects containers that are missing altogether.
Container Discovery
Containers are discovered using Docker labels set by Docker Compose. The filter com.docker.compose.project=datahub (where "datahub" is the compose project name, configurable via DATAHUB_COMPOSE_PROJECT_NAME) identifies all containers belonging to the DataHub stack.
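The discovery step can be sketched as follows. This is a minimal illustration, not the code in docker_check.py: the helper names compose_project_filter and discover_containers are hypothetical, and the client argument is assumed to behave like the DockerClient returned by docker.from_env() in the Docker SDK for Python.

```python
import os

# Default compose project name, as stated in the text above.
DEFAULT_PROJECT_NAME = "datahub"


def compose_project_filter(env=None):
    """Build the Docker label filter that selects the DataHub stack.

    The project name comes from DATAHUB_COMPOSE_PROJECT_NAME when set,
    otherwise it falls back to "datahub".
    """
    env = os.environ if env is None else env
    project = env.get("DATAHUB_COMPOSE_PROJECT_NAME", DEFAULT_PROJECT_NAME)
    return {"label": f"com.docker.compose.project={project}"}


def discover_containers(client, env=None):
    """List every container in the compose project, running or not.

    Assumption: `client` exposes containers.list(all=..., filters=...),
    as docker.DockerClient does. all=True is needed so that exited setup
    jobs and dead services are still visible.
    """
    return client.containers.list(all=True, filters=compose_project_filter(env))
```

With the Docker SDK installed, discover_containers(docker.from_env()) would return the stack's containers. Passing the client in rather than creating it inside the function keeps the filter logic testable without a Docker daemon.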
Health Status Model
Each container is classified into one of seven states:
| Status | Meaning | Applies To |
|---|---|---|
| OK | Container is healthy and operational | All containers |
| STILL_RUNNING | Setup job has not yet completed | Containers with datahub_setup_job label |
| EXITED_WITH_FAILURE | Setup job exited with non-zero code | Containers with datahub_setup_job label |
| DIED | Long-running container is no longer running | Service containers |
| MISSING | Expected container does not exist | Any expected container |
| STARTING | Container is running but health check reports "starting" | Containers with health checks |
| UNHEALTHY | Container is running but health check reports unhealthy | Containers with health checks |
Two-Tier Container Classification
The system distinguishes between:
- Setup jobs -- Containers labeled with datahub_setup_job that are expected to run once and exit successfully (exit code 0). A setup job that is still in the "running" state is considered in progress.
- Service containers -- Long-running containers (GMS, Frontend, MySQL, etc.) that should remain in "running" state with "healthy" health check status.
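The two tiers and the status table can be combined into a single classification function. The sketch below is an illustration under stated assumptions, not the project's implementation: the function name classify is hypothetical, state is assumed to mirror the "State" object reported by docker inspect (keys Status, ExitCode, Health), and a service container without a health check is assumed to count as OK. MISSING is not produced here because it applies only when no container exists at all.

```python
from enum import Enum


class ContainerStatus(Enum):
    OK = "OK"
    STILL_RUNNING = "STILL_RUNNING"
    EXITED_WITH_FAILURE = "EXITED_WITH_FAILURE"
    DIED = "DIED"
    MISSING = "MISSING"
    STARTING = "STARTING"
    UNHEALTHY = "UNHEALTHY"


SETUP_JOB_LABEL = "datahub_setup_job"


def classify(state, labels):
    """Map one container's Docker state and labels to a ContainerStatus."""
    running = state.get("Status") == "running"

    # Tier 1: setup jobs should run once and exit with code 0.
    if SETUP_JOB_LABEL in labels:
        if running:
            return ContainerStatus.STILL_RUNNING
        if state.get("ExitCode", 0) != 0:
            return ContainerStatus.EXITED_WITH_FAILURE
        return ContainerStatus.OK

    # Tier 2: service containers must stay running and healthy.
    if not running:
        return ContainerStatus.DIED
    # "Health" is absent when the image defines no health check.
    health = (state.get("Health") or {}).get("Status")
    if health == "starting":
        return ContainerStatus.STARTING
    if health == "unhealthy":
        return ContainerStatus.UNHEALTHY
    return ContainerStatus.OK  # assumption: no health check counts as OK
```

Checking the setup job label first matters: an exited container is a success for a setup job but a failure for a service.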
Missing Container Detection
The expected set of containers is loaded from the Docker Compose configuration files (read from the com.docker.compose.project.config_files label on existing containers). Any service defined in the compose file but not present as a container is classified as MISSING.
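The reconciliation reduces to a set difference between expected and present services. The following sketch makes several assumptions that are not confirmed by the text: the config_files label is treated as a comma-separated list of paths, containers are matched to services through the com.docker.compose.service label, and the expected service names are passed in already parsed (reading the compose YAML is left out). Both function names are hypothetical.

```python
CONFIG_FILES_LABEL = "com.docker.compose.project.config_files"
SERVICE_LABEL = "com.docker.compose.service"  # assumption: used to match services


def compose_files_from_labels(labels):
    """Return the compose file paths recorded on a container.

    Assumption: multiple files are joined with commas in the label value.
    """
    raw = labels.get(CONFIG_FILES_LABEL, "")
    return [path for path in raw.split(",") if path]


def missing_services(expected_services, container_labels):
    """Return expected services that have no container, sorted by name.

    expected_services: service names defined in the compose file(s).
    container_labels: one label dict per discovered container.
    """
    present = {
        labels[SERVICE_LABEL]
        for labels in container_labels
        if SERVICE_LABEL in labels
    }
    return sorted(set(expected_services) - present)
```

For example, if the compose file defines gms, frontend, and mysql but only gms and mysql have containers, missing_services returns ["frontend"], and that service would be reported as MISSING.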
Aggregate Health
The QuickstartStatus dataclass aggregates individual container statuses and provides:
- is_ok() -- Returns True only when all containers are in OK state
- errors() -- Returns list of human-readable error messages
- needs_up() -- Returns True if any container needs restart (EXITED_WITH_FAILURE, DIED, or MISSING)
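A rough sketch of such an aggregate is shown below. It is not the actual QuickstartStatus class: the name QuickstartStatusSketch, the containers field, and the message wording in the enum values are all illustrative assumptions. Only the three method names and their described behavior come from the text above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ContainerStatus(Enum):
    # Values are illustrative message fragments, not DataHub's wording.
    OK = "is ok"
    STILL_RUNNING = "is still running"
    EXITED_WITH_FAILURE = "exited with failure"
    DIED = "is no longer running"
    MISSING = "is not present"
    STARTING = "is still starting"
    UNHEALTHY = "is running but not healthy"


# States that only a fresh "up" of the stack can fix.
NEEDS_UP = {
    ContainerStatus.EXITED_WITH_FAILURE,
    ContainerStatus.DIED,
    ContainerStatus.MISSING,
}


@dataclass
class QuickstartStatusSketch:
    # Maps container name to its classified status (hypothetical field).
    containers: Dict[str, ContainerStatus] = field(default_factory=dict)

    def is_ok(self) -> bool:
        # Note: an empty mapping is vacuously OK in this sketch; a real
        # check would likely treat "no containers found" as an error.
        return all(s is ContainerStatus.OK for s in self.containers.values())

    def errors(self) -> List[str]:
        return [
            f"{name} {status.value}"
            for name, status in self.containers.items()
            if status is not ContainerStatus.OK
        ]

    def needs_up(self) -> bool:
        return any(s in NEEDS_UP for s in self.containers.values())
```

Separating needs_up() from is_ok() lets a caller distinguish "keep waiting" (STARTING, STILL_RUNNING, UNHEALTHY) from "waiting will not help" (EXITED_WITH_FAILURE, DIED, MISSING).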
Usage
Run deployment verification after launching the DataHub Docker stack to confirm that the deployment succeeded. The verification is used in two contexts:
- Automatic -- During the datahub docker quickstart health polling loop (every 2 seconds for up to 10 minutes)
- Manual -- Via the datahub docker check command for on-demand health assessment
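The automatic polling loop can be sketched with the interval and timeout quoted above (2 seconds, 10 minutes). The function name wait_until_healthy and its parameters are hypothetical, and the check callable stands in for whatever produces the aggregate health result; the injectable sleep and clock arguments exist only to make the loop testable without real waiting.

```python
import time


def wait_until_healthy(check, interval=2.0, timeout=600.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or the timeout elapses.

    Returns True if the stack became healthy, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A monotonic clock is used for the deadline so that system clock adjustments during startup cannot shorten or extend the wait.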
The frontend is accessible at http://localhost:9002 with default credentials (datahub/datahub) once all containers report OK.
Theoretical Basis
This principle follows the health check pattern -- poll container status until all services report healthy or a timeout is reached. This goes beyond simple process liveness by incorporating Docker health check results, which test actual application readiness (e.g., HTTP endpoint responding, database accepting connections).
The pattern also incorporates expected state reconciliation -- comparing the actual set of running containers against the expected set from the compose file to detect containers that failed to start entirely.
Knowledge Sources
- DataHub GitHub Repository
- DataHub Official Documentation
- Source file: metadata-ingestion/src/datahub/cli/docker_check.py
Related Pages
- Implemented by: Datahub_project_Datahub_Check_Docker_Quickstart