Implementation:Datahub project Datahub Run Quickstart Preflight Checks

From Leeroopedia


Field | Value
--- | ---
Implementation Name | Run Quickstart Preflight Checks
Namespace | Datahub_project_Datahub
Workflow | Docker_Quickstart_Deployment
Type | API Doc
Language | Python
Last Updated | 2026-02-10
Source Repository | datahub-project/datahub
Source File | metadata-ingestion/src/datahub/cli/docker_check.py, lines 102-128
Domains | Deployment, Docker, Metadata_Management

Overview

The run_quickstart_preflight_checks() function validates that the host Docker environment meets minimum resource requirements before the DataHub quickstart stack is started. It requires at least 4.3 GB of total memory configured for Docker and at least 13 GB of available Docker disk space, and raises an exception if either requirement is not met.

Function Signature

def run_quickstart_preflight_checks(client: docker.DockerClient) -> None:

Import

from datahub.cli.docker_check import run_quickstart_preflight_checks

Parameters

Parameter | Type | Required | Description
--- | --- | --- | ---
client | docker.DockerClient | Yes | An active Docker client instance obtained via the get_docker_client() context manager.

Return Value

Returns None. If both checks pass, the function returns silently. If a check fails, it raises one of the exceptions listed under Exceptions.

Exceptions

Exception | Condition | Message Pattern
--- | --- | ---
DockerLowMemoryError | Total Docker memory < 4.3 GB | "Total Docker memory configured {configured}GB is below the minimum threshold {MIN_MEMORY_NEEDED}GB."
DockerLowDiskSpaceError | Available Docker disk space < 13 GB | "Total Docker disk space available {available}GB is below the minimum threshold {MIN_DISK_SPACE_NEEDED}GB."
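
To make the conditions and message patterns above concrete, here is a minimal, self-contained sketch. It does not import DataHub: the two exception classes are local stand-ins named after the real ones, the thresholds are the values listed under Constants, and check_resources() and describe() are hypothetical helpers introduced only for illustration. The real function measures the values itself and may format the numbers differently.

```python
# Stand-in definitions for illustration only; the real classes and
# constants live in datahub.cli.docker_check.
MIN_MEMORY_NEEDED = 4.3  # GB, value listed under Constants
MIN_DISK_SPACE_NEEDED = 13  # GB, value listed under Constants


class DockerLowMemoryError(Exception):
    """Stand-in for the exception of the same name."""


class DockerLowDiskSpaceError(Exception):
    """Stand-in for the exception of the same name."""


def check_resources(configured_gb: float, available_gb: float) -> None:
    """Hypothetical helper: raise if either value is below its threshold."""
    # Memory is checked first, then disk space.
    if configured_gb < MIN_MEMORY_NEEDED:
        raise DockerLowMemoryError(
            f"Total Docker memory configured {configured_gb}GB is below "
            f"the minimum threshold {MIN_MEMORY_NEEDED}GB."
        )
    if available_gb < MIN_DISK_SPACE_NEEDED:
        raise DockerLowDiskSpaceError(
            f"Total Docker disk space available {available_gb}GB is below "
            f"the minimum threshold {MIN_DISK_SPACE_NEEDED}GB."
        )


def describe(configured_gb: float, available_gb: float) -> str:
    """Hypothetical helper: turn the check outcome into a one-line status."""
    try:
        check_resources(configured_gb, available_gb)
    except (DockerLowMemoryError, DockerLowDiskSpaceError) as exc:
        return f"preflight failed: {exc}"
    return "preflight passed"
```

A caller of the real function can use the same try/except shape around run_quickstart_preflight_checks(client) to report a failed check without a stack trace.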

Constants

Constant | Value | Description
--- | --- | ---
MIN_MEMORY_NEEDED | 4.3 GB | Minimum Docker memory. Includes a buffer for Docker under-reporting.
MIN_DISK_SPACE_NEEDED | 13 GB | Minimum available Docker disk space.

Implementation Details

The function performs two checks in order:

Memory Check

total_mem_configured = int(client.info()["MemTotal"])
if memory_in_gb(total_mem_configured) < MIN_MEMORY_NEEDED:
    raise DockerLowMemoryError(...)

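Memory is read from the Docker daemon info endpoint via client.info()["MemTotal"], which reports a byte count, and is converted to GB by the helper memory_in_gb(), which divides by 1024 * 1024 * 1000. Because that divisor mixes binary and decimal factors, the result is neither strictly GB (10^9 bytes) nor GiB (2^30 bytes); it falls between the two.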
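
The conversion can be worked through with a small sketch. The body of memory_in_gb() below is reconstructed from the description above rather than copied from the source file, and the sample byte counts are illustrative. It shows that a Docker VM configured with exactly 4 GiB falls below the 4.3 threshold under this convention.

```python
MIN_MEMORY_NEEDED = 4.3  # GB, value listed under Constants


def memory_in_gb(mem_bytes: int) -> float:
    # Divisor as described on this page: 1024 * 1024 * 1000, not 1024 ** 3.
    return mem_bytes / (1024 * 1024 * 1000)


four_gib = 4 * 1024**3   # 4 GiB expressed in bytes
eight_gib = 8 * 1024**3  # 8 GiB expressed in bytes

print(memory_in_gb(four_gib))   # 4.096
print(memory_in_gb(eight_gib))  # 8.192
print(memory_in_gb(four_gib) < MIN_MEMORY_NEEDED)  # True: 4 GiB fails the check
```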

Disk Space Check

result = client.containers.run(
    "alpine:latest",
    "sh -c \"df -B1 -P / | awk 'NR==2{print $2, $4}'\"",
    remove=True,
    stdout=True,
    stderr=True,
)

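A disposable Alpine container is run to execute df against the Docker runtime filesystem. The -B1 flag reports sizes in bytes and -P selects the POSIX output format; awk then prints columns 2 and 4 of the second output line, which are the total and available bytes. These values are converted to GB. Because remove=True is set, the container is removed automatically after execution.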
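
How the container output becomes a GB figure can be sketched as follows. When not detached, docker-py's containers.run() returns the container output as bytes, so the sketch starts from a bytes value. parse_df_output() and bytes_to_gb() are hypothetical helpers, the sample output is made up, and the divisor is an assumption borrowed from the memory_in_gb() description; the actual parsing in docker_check.py may differ.

```python
from typing import Tuple

MIN_DISK_SPACE_NEEDED = 13  # GB, value listed under Constants


def parse_df_output(raw: bytes) -> Tuple[int, int]:
    """Hypothetical helper: split b'TOTAL AVAILABLE\\n' into two byte counts."""
    total_str, available_str = raw.decode("utf-8").split()
    return int(total_str), int(available_str)


def bytes_to_gb(num_bytes: int) -> float:
    # Assumption: same divisor this page describes for memory_in_gb().
    return num_bytes / (1024 * 1024 * 1000)


# Made-up sample of what the awk command would print inside the container.
sample = b"62725623808 20971520000\n"
total_bytes, available_bytes = parse_df_output(sample)

available_gb = bytes_to_gb(available_bytes)
enough_disk = available_gb >= MIN_DISK_SPACE_NEEDED
```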

Supporting Functions

get_docker_client

@contextmanager
def get_docker_client() -> Iterator[docker.DockerClient]:

Context manager that returns a Docker client. Attempts docker.from_env() first, then falls back to ~/.docker/run/docker.sock for Docker Desktop 4.13.0+ compatibility. Pings the daemon to verify connectivity. Raises DockerNotRunningError on failure.
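
The try-then-fall-back pattern described above can be illustrated without a Docker daemon. Everything in this sketch is hypothetical: FakeClient stands in for docker.DockerClient (which does expose ping() and close()), DockerNotRunningError is a local stand-in for the real exception, and first_reachable_client() is an invented name. It is a generic sketch of the pattern, not the DataHub implementation.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, Optional, Sequence


class DockerNotRunningError(Exception):
    """Stand-in for the exception of the same name."""


class FakeClient:
    """Hypothetical stand-in for docker.DockerClient, for illustration only."""

    def __init__(self, reachable: bool) -> None:
        self.reachable = reachable
        self.closed = False

    def ping(self) -> bool:
        if not self.reachable:
            raise ConnectionError("daemon unreachable")
        return True

    def close(self) -> None:
        self.closed = True


@contextmanager
def first_reachable_client(
    factories: Sequence[Callable[[], FakeClient]],
) -> Iterator[FakeClient]:
    """Try each factory in order and yield the first client that answers ping()."""
    client: Optional[FakeClient] = None
    for factory in factories:
        candidate = factory()
        try:
            candidate.ping()
        except ConnectionError:
            # This endpoint did not answer; release it and try the next one.
            candidate.close()
            continue
        client = candidate
        break
    if client is None:
        raise DockerNotRunningError("No Docker daemon answered a ping.")
    try:
        yield client
    finally:
        # Always release the connection when the with-block exits.
        client.close()
```

In the real function, the first attempt would correspond to docker.from_env() and the second to a client pointed at the ~/.docker/run/docker.sock socket.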

_docker_compose_v2

def _docker_compose_v2() -> List[str]:

Detects Docker Compose v2 installation. Returns ["docker", "compose"] for the plugin form or ["docker-compose"] for standalone. Raises DockerComposeVersionError if only v1 is found or no Compose is installed.

Usage Example

from datahub.cli.docker_check import get_docker_client, run_quickstart_preflight_checks

with get_docker_client() as client:
    run_quickstart_preflight_checks(client)
    # If we reach here, all checks passed
    print("Environment is ready for DataHub quickstart")

Call Context

The function is called within the quickstart() command in docker_cli.py at line 629:

# Run pre-flight checks.
with get_docker_client() as client:
    run_quickstart_preflight_checks(client)
