Implementation:Datahub project Datahub Run Quickstart Preflight Checks
| Field | Value |
|---|---|
| Implementation Name | Run Quickstart Preflight Checks |
| Namespace | Datahub_project_Datahub |
| Workflow | Docker_Quickstart_Deployment |
| Type | API Doc |
| Language | Python |
| Last Updated | 2026-02-10 |
| Source Repository | datahub-project/datahub |
| Source File | metadata-ingestion/src/datahub/cli/docker_check.py, lines 102-128 |
| Domains | Deployment, Docker, Metadata_Management |
Overview
The run_quickstart_preflight_checks() function validates that the host Docker environment meets minimum resource requirements before starting the DataHub quickstart stack. It checks total configured memory (>= 4.3 GB) and available disk space (>= 13 GB).
Function Signature
def run_quickstart_preflight_checks(client: docker.DockerClient) -> None:
Import
from datahub.cli.docker_check import run_quickstart_preflight_checks
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| client | docker.DockerClient | Yes | An active Docker client instance obtained via the get_docker_client() context manager. |
Return Value
Returns None. The function returns silently when both checks pass; if either check fails, it raises one of the exceptions listed below.
Exceptions
| Exception | Condition | Message Pattern |
|---|---|---|
| DockerLowMemoryError | Total Docker memory < 4.3 GB | "Total Docker memory configured {configured}GB is below the minimum threshold {MIN_MEMORY_NEEDED}GB." |
| DockerLowDiskSpaceError | Available Docker disk space < 13 GB | "Total Docker disk space available {available}GB is below the minimum threshold {MIN_DISK_SPACE_NEEDED}GB." |
Constants
| Constant | Value | Description |
|---|---|---|
| MIN_MEMORY_NEEDED | 4.3 (GB) | Minimum Docker memory. Includes buffer for Docker under-reporting. |
| MIN_DISK_SPACE_NEEDED | 13 (GB) | Minimum available Docker disk space. |
Implementation Details
The function performs two checks in order:
Memory Check
total_mem_configured = int(client.info()["MemTotal"])
if memory_in_gb(total_mem_configured) < MIN_MEMORY_NEEDED:
    raise DockerLowMemoryError(...)
Memory is read from the Docker daemon info endpoint via client.info()["MemTotal"], which reports a byte count, and converted to GB using the helper memory_in_gb(), which divides by 1024 * 1024 * 1000 (1,048,576,000 bytes per reported GB).
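The conversion and threshold comparison described above can be sketched as follows. This is a minimal illustration, not the library's code: the divisor and the 4.3 GB threshold come from this page, while has_enough_memory() is a hypothetical helper introduced here, and any rounding the real memory_in_gb() may apply is left out.

```python
MIN_MEMORY_NEEDED = 4.3  # GB, value taken from the Constants table


def memory_in_gb(mem_bytes: int) -> float:
    # Divisor as described on this page; this sketch does not round the result.
    return mem_bytes / (1024 * 1024 * 1000)


def has_enough_memory(mem_total_bytes: int) -> bool:
    # Hypothetical helper: mirrors the comparison made by the memory check.
    return memory_in_gb(mem_total_bytes) >= MIN_MEMORY_NEEDED


four_gib = 4 * 1024**3   # a daemon configured with 4 GiB
eight_gib = 8 * 1024**3  # a daemon configured with 8 GiB

# With this divisor, 4 GiB comes out at roughly 4.1 GB, which is below 4.3.
print(memory_in_gb(four_gib), has_enough_memory(four_gib))
print(memory_in_gb(eight_gib), has_enough_memory(eight_gib))
```

Under these assumptions, a Docker daemon configured with exactly 4 GiB would fail the memory check, while one configured with 8 GiB would pass.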
Disk Space Check
result = client.containers.run(
    "alpine:latest",
    "sh -c \"df -B1 -P / | awk 'NR==2{print $2, $4}'\"",
    remove=True,
    stdout=True,
    stderr=True,
)
A disposable Alpine container is run to execute df within the Docker runtime filesystem. The output provides total and available bytes, which are converted to GB. The container is automatically removed after execution.
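How the container output might be turned into a pass/fail decision can be sketched as below. This is an assumption-laden illustration rather than the actual implementation: parse_df_output(), bytes_to_gb(), and has_enough_disk() are hypothetical names, the sample byte counts are made up, and the sketch assumes the same divisor as the memory check is used for the GB conversion.

```python
from typing import Tuple

MIN_DISK_SPACE_NEEDED = 13  # GB, value taken from the Constants table


def parse_df_output(raw: bytes) -> Tuple[int, int]:
    """Parse the "<total> <available>" byte counts printed by the awk command."""
    total_str, available_str = raw.decode("utf-8").split()
    return int(total_str), int(available_str)


def bytes_to_gb(num_bytes: int) -> float:
    # Assumption: same divisor as the memory check; the real code may differ.
    return num_bytes / (1024 * 1024 * 1000)


def has_enough_disk(raw: bytes) -> bool:
    # Only the available figure is compared against the threshold.
    _total, available = parse_df_output(raw)
    return bytes_to_gb(available) >= MIN_DISK_SPACE_NEEDED


# Made-up sample output: total bytes, then available bytes.
roomy = b"62725623808 20000000000\n"
cramped = b"62725623808 5000000000\n"

print(has_enough_disk(roomy))
print(has_enough_disk(cramped))
```

Parsing is kept separate from the threshold comparison so that the df handling can be exercised without a running Docker daemon.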
Supporting Functions
get_docker_client
@contextmanager
def get_docker_client() -> Iterator[docker.DockerClient]:
Context manager that returns a Docker client. Attempts docker.from_env() first, then falls back to ~/.docker/run/docker.sock for Docker Desktop 4.13.0+ compatibility. Pings the daemon to verify connectivity. Raises DockerNotRunningError on failure.
_docker_compose_v2
def _docker_compose_v2() -> List[str]:
Detects Docker Compose v2 installation. Returns ["docker", "compose"] for the plugin form or ["docker-compose"] for standalone. Raises DockerComposeVersionError if only v1 is found or no Compose is installed.
Usage Example
from datahub.cli.docker_check import get_docker_client, run_quickstart_preflight_checks

with get_docker_client() as client:
    run_quickstart_preflight_checks(client)
    # If we reach here, all checks passed
    print("Environment is ready for DataHub quickstart")
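Callers that want to respond differently to each failure can catch the two exceptions separately. The sketch below is kept self-contained so it can run without DataHub or Docker installed: the two exception classes are local stand-ins for the ones listed in the Exceptions table (in real code they would presumably be imported from datahub.cli.docker_check), and describe_preflight_failure() is a hypothetical helper that is not part of DataHub.

```python
from typing import Callable


class DockerLowMemoryError(Exception):
    """Local stand-in for the exception of the same name."""


class DockerLowDiskSpaceError(Exception):
    """Local stand-in for the exception of the same name."""


def describe_preflight_failure(check: Callable[[], None]) -> str:
    """Run `check` and map each preflight failure to a short hint."""
    try:
        check()
    except DockerLowMemoryError as exc:
        return f"Increase the memory allocated to Docker: {exc}"
    except DockerLowDiskSpaceError as exc:
        return f"Free up Docker disk space: {exc}"
    return "Environment is ready for DataHub quickstart"


def failing_memory_check() -> None:
    # Simulates a preflight run on a host with too little Docker memory.
    raise DockerLowMemoryError("memory below threshold")


def passing_check() -> None:
    # Simulates a preflight run where both checks pass.
    return None


print(describe_preflight_failure(failing_memory_check))
print(describe_preflight_failure(passing_check))
```

In real usage, the callable passed in would wrap the get_docker_client() and run_quickstart_preflight_checks(client) calls shown in the usage example.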
Call Context
The function is called within the quickstart() command in docker_cli.py at line 629:
# Run pre-flight checks.
with get_docker_client() as client:
    run_quickstart_preflight_checks(client)