Implementation:Datahub project Datahub Check Docker Quickstart
| Field | Value |
|---|---|
| Implementation Name | Check Docker Quickstart |
| Namespace | Datahub_project_Datahub |
| Workflow | Docker_Quickstart_Deployment |
| Type | API Doc |
| Language | Python |
| Last Updated | 2026-02-10 |
| Source Repository | datahub-project/datahub |
| Source File | metadata-ingestion/src/datahub/cli/docker_check.py, lines 239-308 |
| Domains | Deployment, Docker, Metadata_Management |
Overview
The check_docker_quickstart() function inspects all DataHub Docker containers, evaluates their health status, and returns a structured QuickstartStatus report. It also detects legacy (pre-profile) quickstart installations that require migration.
Function Signature
def check_docker_quickstart() -> QuickstartStatus:
Import
from datahub.cli.docker_check import check_docker_quickstart
Parameters
This function takes no parameters. It uses module-level constants for Docker label filters.
Return Value
Returns a QuickstartStatus dataclass:
@dataclass
class QuickstartStatus:
    containers: List[DockerContainerStatus]
    volumes: Set[str]
    running_unsupported_version: bool
| Field | Type | Description |
|---|---|---|
| containers | List[DockerContainerStatus] | Status of each container (name + status enum) |
| volumes | Set[str] | Set of volume names from compose configuration |
| running_unsupported_version | bool | True if legacy (pre-profile) quickstart detected |
ContainerStatus Enum
class ContainerStatus(enum.Enum):
    OK = "is ok"
    STILL_RUNNING = "is still running"
    EXITED_WITH_FAILURE = "exited with an error"
    DIED = "is not running"
    MISSING = "is not present"
    STARTING = "is still starting"
    UNHEALTHY = "is running by not yet healthy"
QuickstartStatus Methods
| Method | Return Type | Description |
|---|---|---|
| is_ok() | bool | True if no errors (all containers OK) |
| errors() | List[str] | Human-readable error messages for non-OK containers |
| needs_up() | bool | True if any container is EXITED_WITH_FAILURE, DIED, or MISSING |
| to_exception(header, footer) | QuickstartError | Creates an exception with structured container status info |
| get_containers() | Set[str] | Set of container names |
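The behavior described in the table can be approximated with a small stand-in. This is a minimal sketch, not the DataHub source: SketchContainer and SketchStatus are hypothetical names, the enum is trimmed, the container names are illustrative, and the "<name> <status text>" message format is an assumption.

```python
import enum
from dataclasses import dataclass
from typing import List


class ContainerStatus(enum.Enum):
    # Trimmed copy of the enum documented above.
    OK = "is ok"
    EXITED_WITH_FAILURE = "exited with an error"
    DIED = "is not running"
    MISSING = "is not present"
    STARTING = "is still starting"


@dataclass
class SketchContainer:  # hypothetical stand-in for DockerContainerStatus
    name: str
    status: ContainerStatus


@dataclass
class SketchStatus:  # hypothetical stand-in for QuickstartStatus
    containers: List[SketchContainer]

    def errors(self) -> List[str]:
        # Assumed message format: "<container name> <status text>".
        return [
            f"{c.name} {c.status.value}"
            for c in self.containers
            if c.status != ContainerStatus.OK
        ]

    def is_ok(self) -> bool:
        # Healthy means there is nothing to report.
        return not self.errors()

    def needs_up(self) -> bool:
        # Only these three states call for bringing containers up again.
        restartable = {
            ContainerStatus.EXITED_WITH_FAILURE,
            ContainerStatus.DIED,
            ContainerStatus.MISSING,
        }
        return any(c.status in restartable for c in self.containers)


report = SketchStatus(
    [
        SketchContainer("datahub-gms", ContainerStatus.STARTING),
        SketchContainer("mysql", ContainerStatus.OK),
    ]
)
print(report.is_ok(), report.needs_up(), report.errors())
```

A container that is still starting makes the report not OK without triggering needs_up(), which is why the two methods are kept separate. The sketch treats an empty container list as healthy; the real class may handle that case differently.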
Implementation Details
Container Discovery (lines 241-248)
with get_docker_client() as client:
    containers = client.containers.list(
        all=True,
        filters=DATAHUB_COMPOSE_PROJECT_FILTER,
        ignore_removed=True,
    )
Uses the Docker label filter com.docker.compose.project=datahub (from DATAHUB_COMPOSE_PROJECT_FILTER). The ignore_removed=True flag handles race conditions during container recreation.
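The discovery call can be reproduced outside DataHub with any object that behaves like a Docker SDK client. The sketch below is an illustration under assumptions: COMPOSE_PROJECT_FILTER mirrors what the module-level constant presumably contains, and list_project_containers is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Assumed shape of the filter constant, based on the label named above.
COMPOSE_PROJECT_FILTER: Dict[str, str] = {
    "label": "com.docker.compose.project=datahub"
}


def list_project_containers(client: Any) -> List[Any]:
    """List running and stopped containers of the compose project.

    `client` is expected to behave like docker.DockerClient.
    """
    return client.containers.list(
        all=True,  # include exited containers such as setup jobs
        filters=COMPOSE_PROJECT_FILTER,
        ignore_removed=True,  # skip containers removed while listing
    )
```

With the docker Python package installed and a daemon running, list_project_containers(docker.from_env()) would return the live container objects.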
Profile Detection (lines 253-261)
If the compose config file path contains /profiles/, the function delegates to check_docker_quickstart_profiles() for the newer profile-based compose format.
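The path test can be sketched in a few lines. The helper names and example paths are hypothetical, and the sketch assumes the config_files label holds a comma-separated list of paths.

```python
from typing import List


def split_config_files(label_value: str) -> List[str]:
    """Split the compose config_files label into individual paths."""
    return [p.strip() for p in label_value.split(",") if p.strip()]


def uses_profiles(config_files: List[str]) -> bool:
    """Return True if any compose file lives under a profiles directory."""
    return any("/profiles/" in path for path in config_files)


# Hypothetical label values, for illustration only.
legacy = split_config_files("/tmp/quickstart/docker-compose.yml")
profiled = split_config_files(
    "/tmp/quickstart/profiles/docker-compose.yml, /tmp/quickstart/extra.yml"
)
print(uses_profiles(legacy), uses_profiles(profiled))
```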
Expected Container Resolution (lines 263-269)
The expected set of containers and volumes is loaded from compose config files referenced in the com.docker.compose.project.config_files label on existing containers.
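Once the compose files are parsed, collecting the expected names is a matter of merging their top-level keys. The sketch below works on already-parsed documents (plain dictionaries, for example the result of a YAML loader); expected_names is a hypothetical helper and the service and volume names are illustrative.

```python
from typing import Any, Dict, Iterable, Set, Tuple


def expected_names(
    compose_docs: Iterable[Dict[str, Any]],
) -> Tuple[Set[str], Set[str]]:
    """Merge service and volume names across parsed compose documents."""
    services: Set[str] = set()
    volumes: Set[str] = set()
    for doc in compose_docs:
        # A missing or empty section contributes nothing.
        services.update((doc.get("services") or {}).keys())
        volumes.update((doc.get("volumes") or {}).keys())
    return services, volumes


docs = [
    {"services": {"datahub-gms": {}, "mysql": {}}, "volumes": {"mysqldata": None}},
    {"services": {"datahub-frontend-react": {}}},
]
services, volumes = expected_names(docs)
print(sorted(services), sorted(volumes))
```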
Health Evaluation (lines 274-295)
For each container, the logic evaluates:
if container.labels.get("datahub_setup_job", False):
    # Setup jobs: check for exit code
    if container.status != "exited":
        status = ContainerStatus.STILL_RUNNING
    elif container.attrs["State"]["ExitCode"] != 0:
        status = ContainerStatus.EXITED_WITH_FAILURE
elif container.status != "running":
    status = ContainerStatus.DIED
elif "Health" in container.attrs["State"]:
    if container.attrs["State"]["Health"]["Status"] == "starting":
        status = ContainerStatus.STARTING
    elif container.attrs["State"]["Health"]["Status"] != "healthy":
        status = ContainerStatus.UNHEALTHY
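The same decision tree can be exercised without a Docker daemon by passing in plain values. This is a sketch under assumptions: classify is a hypothetical function, the enum uses auto() values instead of the real message strings, and OK is assumed to be the default when no branch matches.

```python
import enum
from typing import Any, Dict


class ContainerStatus(enum.Enum):
    OK = enum.auto()
    STILL_RUNNING = enum.auto()
    EXITED_WITH_FAILURE = enum.auto()
    DIED = enum.auto()
    STARTING = enum.auto()
    UNHEALTHY = enum.auto()


def classify(status: str, state: Dict[str, Any], is_setup_job: bool) -> ContainerStatus:
    """Map a container's status string and State dict to a ContainerStatus."""
    if is_setup_job:
        # Setup jobs are expected to finish with exit code 0.
        if status != "exited":
            return ContainerStatus.STILL_RUNNING
        if state["ExitCode"] != 0:
            return ContainerStatus.EXITED_WITH_FAILURE
        return ContainerStatus.OK
    # Long-running services are expected to stay up.
    if status != "running":
        return ContainerStatus.DIED
    if "Health" in state:
        health = state["Health"]["Status"]
        if health == "starting":
            return ContainerStatus.STARTING
        if health != "healthy":
            return ContainerStatus.UNHEALTHY
    # Running, and either healthy or without a health check.
    return ContainerStatus.OK


print(classify("running", {"Health": {"Status": "starting"}}, is_setup_job=False))
```

Note that a service without a health check counts as OK as soon as it is running, so the result is only as strong as the health checks defined in the compose file.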
Missing Container Detection (lines 298-302)
missing_containers = set(all_containers) - existing_containers
for missing in missing_containers:
    container_statuses.append(
        DockerContainerStatus(missing, ContainerStatus.MISSING)
    )
Legacy Detection (line 303)
running_unsupported_version = detect_legacy_quickstart_compose(all_containers)
Detects legacy quickstart by checking if "zookeeper" is in the container set.
Frontend Access
When all containers report OK, the DataHub frontend is accessible at:
- URL: http://localhost:9002
- Username: datahub
- Password: datahub
CLI Command
The datahub docker check command wraps this function (lines 150-162):
@docker.command()
def check() -> None:
    """Check that the Docker containers are healthy"""
    status = check_docker_quickstart()
    if status.is_ok():
        click.secho("No issues detected", fg="green")
    else:
        raise status.to_exception("The following issues were detected:")
Usage Examples
from datahub.cli.docker_check import check_docker_quickstart

status = check_docker_quickstart()
if status.is_ok():
    print("DataHub is healthy!")
else:
    for error in status.errors():
        print(f"Issue: {error}")
    if status.needs_up():
        print("Some containers need to be restarted")
# CLI usage
datahub docker check
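Because containers can report STARTING for a while after launch, a caller may want to poll rather than check once. The sketch below is one possible approach, not part of DataHub: wait_until_ok, its parameters, and the default timings are hypothetical, and the check callable can be any function returning an object with is_ok().

```python
import time
from typing import Callable, Protocol


class HasIsOk(Protocol):
    def is_ok(self) -> bool: ...


def wait_until_ok(
    check: Callable[[], HasIsOk],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` until it reports OK or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check().is_ok():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)  # back off before the next probe
```

Passing check_docker_quickstart as the check argument would wait for the quickstart stack to settle. The sleep parameter is injectable so the loop can be tested without real delays.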