Principle:Datahub project Datahub Docker Prerequisites Validation

From Leeroopedia


Principle Name: Docker Prerequisites Validation
Namespace: Datahub_project_Datahub
Workflow: Docker_Quickstart_Deployment
Type: Principle
Last Updated: 2026-02-10
Source Repository: datahub-project/datahub
Domains: Deployment, Docker, Metadata_Management

Overview

Docker Prerequisites Validation verifies that a host environment meets the minimum requirements for running the DataHub Docker stack. It checks Docker daemon availability, Docker Compose v2 installation, minimum memory (4.3 GB), and minimum disk space (13 GB), so that a deployment does not fail partway through because of insufficient resources.

Description

Docker Prerequisites Validation is a preflight check that runs before any container orchestration begins. The DataHub quickstart stack requires multiple services (GMS, Frontend, Kafka, Elasticsearch, MySQL, Schema Registry, Zookeeper) running simultaneously, which imposes non-trivial resource requirements on the host machine.

The validation performs four sequential checks:

  1. Docker Daemon Availability -- Verifies that the Docker daemon is running and reachable. The implementation attempts to connect to the Docker socket, falling back to ~/.docker/run/docker.sock for Docker Desktop 4.13.0+ compatibility. A ping is sent to confirm communication.
  2. Docker Compose v2 Installation -- Verifies that Docker Compose v2 (or later) is installed, either as the docker compose plugin or as the standalone docker-compose binary. Docker Compose v1 is explicitly rejected with an error message.
  3. Memory Check -- Queries the Docker daemon for total configured memory and verifies it meets the minimum threshold of 4.3 GB. The threshold includes a buffer because Docker tends to under-report allocated memory.
  4. Disk Space Check -- Runs a lightweight Alpine container to measure available disk space within the Docker runtime environment and verifies it meets the minimum threshold of 13 GB.
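The socket fallback in step 1 can be sketched as follows. This is a minimal illustration, not DataHub's actual code: `socket_candidates`, `first_reachable`, and the injected `ping` callable are hypothetical names, and treating `/var/run/docker.sock` as the primary socket is an assumption.

```python
from typing import Callable, List, Optional


def socket_candidates(home: str) -> List[str]:
    """Return socket URLs to try, in order (hypothetical helper)."""
    return [
        # Assumption: the conventional system-wide Docker socket is tried first.
        "unix:///var/run/docker.sock",
        # Fallback named in the text, used by Docker Desktop 4.13.0+.
        f"unix://{home}/.docker/run/docker.sock",
    ]


def first_reachable(
    candidates: List[str], ping: Callable[[str], bool]
) -> Optional[str]:
    """Return the first URL whose daemon answers a ping, else None."""
    for url in candidates:
        try:
            if ping(url):
                return url
        except OSError:
            # Socket missing or connection refused: try the next candidate.
            continue
    return None
```

In a real check, `ping` would wrap a Docker client call. It is injected here so that the fallback order can be exercised without a running daemon.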

If any check fails, a descriptive exception is raised that tells the user exactly what to fix (e.g., increase Docker memory allocation, install Docker Compose v2).
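A rough sketch of how the Compose version, memory, and disk checks might raise descriptive errors is shown below. The thresholds come from the text above; everything else is an assumption made for illustration: the `PreflightError` class, the function names, the use of decimal gigabytes, and the idea of parsing the major version out of the version command's output.

```python
import re

# Thresholds as stated in the text above.
MIN_MEMORY_GB = 4.3
MIN_DISK_GB = 13.0

# Assumption: decimal gigabytes; the real implementation may convert differently.
BYTES_PER_GB = 1000 ** 3


class PreflightError(Exception):
    """Hypothetical exception type for a failed prerequisite."""


def check_compose_version(version_output: str) -> None:
    """Reject Docker Compose v1 based on the reported version string."""
    match = re.search(r"v?(\d+)\.\d+", version_output)
    if match is None:
        raise PreflightError("Could not determine the Docker Compose version.")
    if int(match.group(1)) < 2:
        raise PreflightError(
            "Docker Compose v1 is not supported. Install Docker Compose v2."
        )


def check_memory(total_bytes: int) -> None:
    """Fail if the memory configured for Docker is below the threshold."""
    total_gb = total_bytes / BYTES_PER_GB
    if total_gb < MIN_MEMORY_GB:
        raise PreflightError(
            f"Docker has {total_gb:.1f} GB of memory but {MIN_MEMORY_GB} GB "
            "is required. Increase the Docker memory allocation."
        )


def check_disk(available_bytes: int) -> None:
    """Fail if free disk space in the Docker runtime is below the threshold."""
    available_gb = available_bytes / BYTES_PER_GB
    if available_gb < MIN_DISK_GB:
        raise PreflightError(
            f"Docker has {available_gb:.1f} GB of free disk but {MIN_DISK_GB} GB "
            "is required. Free up space or enlarge the Docker disk image."
        )
```

Each function takes an already-measured value, so the threshold logic stays separate from how the value is obtained (for example, from the Docker daemon or from a container run).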

Usage

Prerequisites validation is used before launching the DataHub quickstart stack to ensure the environment can support all required containers. It is invoked automatically as part of the datahub docker quickstart command and does not require explicit user action.

Typical scenarios:

  • First-time setup -- When a developer or evaluator runs DataHub locally for the first time, the preflight check catches misconfigured Docker Desktop settings before a lengthy download-and-start process.
  • CI/CD pipelines -- Automated environments benefit from early failure with actionable messages rather than cryptic container crashes mid-startup.
  • Resource-constrained environments -- Laptops and small VMs often run Docker with default memory allocations that are too low for the stack; the memory check flags this before any container starts.

Theoretical Basis

This principle follows the preflight check pattern -- validate preconditions before expensive operations to fail fast with actionable error messages. Rather than allowing the Docker Compose orchestration to proceed and fail partway through (potentially leaving orphaned containers or partial state), the system validates all prerequisites upfront.
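The fail-fast ordering described above can be illustrated with a small generic runner. This is a sketch of the pattern only; `run_with_preflight` is a hypothetical name and does not correspond to a function in DataHub.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def run_with_preflight(
    checks: Iterable[Callable[[], None]], operation: Callable[[], T]
) -> T:
    """Run every check first; start `operation` only if none raised."""
    for check in checks:
        # An exception here aborts before any expensive work begins,
        # so no partial state is left behind by `operation`.
        check()
    return operation()
```

Because the checks run to completion before `operation` is called, a failing check surfaces its own error message and the expensive step is never started.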

The pattern is analogous to aircraft preflight checklists: verify critical conditions before committing to an irreversible (or expensive-to-reverse) process. This reduces debugging time and improves the developer experience by providing clear guidance on resolution steps.

Related Pages

Implementation:Datahub_project_Datahub_Run_Quickstart_Preflight_Checks
