Environment:Datahub project Datahub Docker Quickstart Environment
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Docker, Deployment |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
A Docker Desktop or Docker Engine environment with Docker Compose v2 and at least 8 GB RAM, 2 CPUs, and 13 GB of free disk space, used to run the full DataHub stack locally.
Description
This environment defines the hardware and software prerequisites for deploying DataHub using the Docker Quickstart method. The stack includes GMS (backend), frontend, Elasticsearch, MySQL (or PostgreSQL), Kafka (with Zookeeper and Schema Registry), and optionally Neo4j. Docker Compose v2 is required; if only v1 is found, the CLI rejects it with an explicit error message. Before launch, the CLI runs automated preflight checks for memory (a 4.3 GB minimum as reported by Docker, a threshold chosen to allow for Docker under-reporting allocated memory) and disk space (a 13 GB minimum).
Usage
Use this environment for Docker Quickstart Deployment workflows, including `datahub docker quickstart`, `datahub docker nuke`, and local development stacks. This is the standard way to run DataHub locally for development, testing, and evaluation.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, Windows (Docker Desktop) | Any OS supporting Docker Desktop or Docker Engine |
| Docker | Docker Compose v2 or later | v1 explicitly rejected; detected via `docker compose version` |
| RAM | 8 GB minimum (Docker Desktop setting) | Preflight check requires at least 4.3 GB as reported by Docker, which tends to under-report allocated memory |
| CPU | 2 CPUs minimum | Allocated to Docker Desktop |
| Swap | 2 GB minimum | Docker Desktop swap allocation |
| Disk | 13 GB available | Checked before quickstart launch |
Dependencies
System Packages
- Docker Desktop (macOS/Windows) or Docker Engine + Docker Compose v2 (Linux)
- `python3` >= 3.10 (for `datahub` CLI that orchestrates Docker)
- `pip` (to install `acryl-datahub[docker]`)
Container Images
Default container image versions used by the quickstart:
| Service | Image | Default Version |
|---|---|---|
| MySQL | mysql | 8.2 |
| Elasticsearch | elasticsearch | 7.16.1 |
| Neo4j | neo4j | 4.4.9-community |
| Kafka Broker | confluentinc/cp-kafka | 7.9.2 |
| Schema Registry | confluentinc/cp-schema-registry | 7.9.2 |
| Zookeeper | confluentinc/cp-zookeeper | 7.9.2 |
Credentials
The following environment variables can be set for customization:
- `DATAHUB_VERSION`: DataHub image version tag to deploy
- `DATAHUB_COMPOSE_PROJECT_NAME`: Docker Compose project name (default: `datahub`)
- `METADATA_SERVICE_AUTH_ENABLED`: Enable GMS authentication (default: `false`)
- `DATAHUB_MAPPED_MYSQL_PORT`: Override MySQL port mapping
- `DATAHUB_MAPPED_KAFKA_BROKER_PORT`: Override Kafka broker port mapping
- `DATAHUB_MAPPED_ELASTIC_PORT`: Override Elasticsearch port mapping
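As a rough illustration of how these variables might be supplied when the CLI is launched from a script, the sketch below builds an environment mapping with overrides applied. The variable names come from the list above; the helper `build_quickstart_env` and the `KNOWN_OVERRIDES` tuple are hypothetical and not part of the DataHub CLI.

```python
import os

# Names copied from the list above. The helper below is a hypothetical
# convenience for scripts, not something the DataHub CLI provides.
KNOWN_OVERRIDES = (
    "DATAHUB_VERSION",
    "DATAHUB_COMPOSE_PROJECT_NAME",
    "METADATA_SERVICE_AUTH_ENABLED",
    "DATAHUB_MAPPED_MYSQL_PORT",
    "DATAHUB_MAPPED_KAFKA_BROKER_PORT",
    "DATAHUB_MAPPED_ELASTIC_PORT",
)


def build_quickstart_env(overrides, base_env=None):
    """Return a copy of base_env (default: os.environ) with overrides applied.

    Values are converted to strings because environment variables are text.
    Names outside KNOWN_OVERRIDES are rejected to catch typos early.
    """
    unknown = sorted(set(overrides) - set(KNOWN_OVERRIDES))
    if unknown:
        raise ValueError(f"Unrecognized override(s): {', '.join(unknown)}")
    env = dict(os.environ if base_env is None else base_env)
    env.update({name: str(value) for name, value in overrides.items()})
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run(["datahub", "docker", "quickstart"], env=...)`, for example to move MySQL to a different host port when the usual one is already taken.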
Quick Install
# Install DataHub CLI with Docker support
pip install 'acryl-datahub[docker]'
# Launch the full stack
datahub docker quickstart
# Check stack health
datahub docker check
# Tear down and remove all data
datahub docker nuke
Code Evidence
Memory and disk space constants from `docker_check.py:16-18`:
# Docker seems to under-report memory allocated, so we also need a bit of buffer to account for it.
MIN_MEMORY_NEEDED = 4.3 # GB
MIN_DISK_SPACE_NEEDED = 13 # GB
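To make the thresholds concrete, here is a minimal sketch of a preflight comparison built around the two constants quoted above. It is not the code in `docker_check.py`: the function names, the use of 1024³ bytes per GB, and reading free space from the host filesystem with `shutil.disk_usage` are all assumptions made for illustration.

```python
import shutil

MIN_MEMORY_NEEDED = 4.3  # GB, value quoted above
MIN_DISK_SPACE_NEEDED = 13  # GB, value quoted above
BYTES_PER_GB = 1024 ** 3  # assumption: binary gigabytes


def preflight_problems(memory_bytes, free_disk_bytes):
    """Return human-readable problems; an empty list means both checks pass."""
    problems = []
    memory_gb = memory_bytes / BYTES_PER_GB
    disk_gb = free_disk_bytes / BYTES_PER_GB
    if memory_gb < MIN_MEMORY_NEEDED:
        problems.append(
            f"Only {memory_gb:.1f} GB memory reported; need {MIN_MEMORY_NEEDED} GB."
        )
    if disk_gb < MIN_DISK_SPACE_NEEDED:
        problems.append(
            f"Only {disk_gb:.1f} GB disk free; need {MIN_DISK_SPACE_NEEDED} GB."
        )
    return problems


def host_free_disk_bytes(path="."):
    """Free bytes on the filesystem holding `path` (host side, an assumption)."""
    return shutil.disk_usage(path).free
```

The memory figure would typically come from what Docker itself reports, which is why the threshold is compared against the reported value and not against the 8 GB configured in Docker Desktop.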
Docker Compose v2 requirement check from `docker_cli.py:220-252`:
# Attempts docker compose version --short (v2)
# Falls back to docker-compose version --short (v1)
# Raises DockerComposeVersionError if only v1 found:
# "You have docker-compose v1 ({compose_version}) installed,
# but we require Docker Compose v2 or later."
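The detection order described in these comments could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `docker_cli.py`: `ComposeV1OnlyError` is a hypothetical stand-in for `DockerComposeVersionError`, the helper names are invented, and parsing the major version out of the `--short` output is an assumption added here so that a standalone v2 `docker-compose` binary is not mistaken for v1.

```python
import subprocess


class ComposeV1OnlyError(RuntimeError):
    """Hypothetical stand-in for the DockerComposeVersionError named above."""


def _run_short_version(command):
    """Return trimmed output of `<command> version --short`, or None on failure."""
    try:
        result = subprocess.run(
            [*command, "version", "--short"],
            capture_output=True, text=True, check=True, timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() or None


def _major(version):
    """Extract the leading major number from strings like '2.24.5' or 'v2.24.5'."""
    head = version.lstrip("v").split(".")[0]
    return int(head) if head.isdigit() else None


def detect_compose_v2(run=_run_short_version):
    """Try the v2 plugin first, then the legacy binary; raise if only v1 exists."""
    for command in (["docker", "compose"], ["docker-compose"]):
        version = run(command)
        major = _major(version) if version is not None else None
        if major is None:
            continue  # command missing or output not parseable
        if major >= 2:
            return version
        raise ComposeV1OnlyError(
            f"You have docker-compose v1 ({version}) installed, "
            "but we require Docker Compose v2 or later."
        )
    raise RuntimeError("Docker Compose was not found.")
```

The `run` parameter exists only so the logic can be exercised without Docker installed; by default the function shells out to the real commands.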
Quickstart timeouts from `docker_cli.py:51-53`:
# Max wait time: 10 minutes
# Docker up timeout: 100 seconds per attempt
# Status check interval: 2 seconds
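A polling loop using the overall wait time and check interval listed above might look like the sketch below. The constant and function names are hypothetical, the per-attempt 100-second `docker compose up` timeout is not modeled, and the injectable `clock` and `sleep` parameters are there only so the loop can be exercised without actually waiting.

```python
import time

MAX_WAIT_SECONDS = 10 * 60  # "Max wait time: 10 minutes"
STATUS_CHECK_INTERVAL_SECONDS = 2  # "Status check interval: 2 seconds"


def wait_until_healthy(
    is_healthy,
    max_wait=MAX_WAIT_SECONDS,
    interval=STATUS_CHECK_INTERVAL_SECONDS,
    clock=time.monotonic,
    sleep=time.sleep,
):
    """Poll is_healthy() every `interval` seconds until it returns True.

    Returns True on success and False once `max_wait` seconds have elapsed.
    """
    deadline = clock() + max_wait
    while True:
        if is_healthy():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Using a monotonic clock keeps the deadline stable even if the system wall clock is adjusted while the stack is starting.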
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `You have docker-compose v1 installed, but we require Docker Compose v2 or later` | Docker Compose v1 detected | Upgrade to Docker Compose v2 (included in Docker Desktop 4.x+) |
| `DockerLowMemoryError` | Less than 4.3GB memory available to Docker | Increase Docker Desktop memory allocation to 8GB+ |
| `DockerLowDiskSpaceError` | Less than 13GB free disk space | Free up disk space or increase Docker disk allocation |
| `DockerNotRunningError` | Docker daemon not running | Start Docker Desktop or `systemctl start docker` |
| GMS health check timeout | GMS takes >90s to start | Ensure sufficient RAM; check for port conflicts on 8080 |
Compatibility Notes
- Default ports: Frontend=9002, GMS=8080, Elasticsearch=9200, Neo4j=7474/7687, Schema Registry=8081, Kafka=9092, Zookeeper=2181.
- Database backends: MySQL (default), PostgreSQL, MariaDB, and Cassandra are all supported via compose overrides.
- ARM64/M1 Macs: Use the `.m1.yml` override files for ARM-compatible images.
- Elasticsearch memory: Limited to 1GB within the container; Java heap configured separately.
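Because the default ports listed above are a common source of startup failures, a quick pre-launch scan for listeners on them can help. The sketch below is a hypothetical helper, not part of the DataHub CLI; it only detects something already listening on `127.0.0.1`, which is an assumption about where conflicts would show up.

```python
import socket

# Default ports as listed in the compatibility notes above.
DEFAULT_PORTS = {
    "Frontend": [9002],
    "GMS": [8080],
    "Elasticsearch": [9200],
    "Neo4j": [7474, 7687],
    "Schema Registry": [8081],
    "Kafka": [9092],
    "Zookeeper": [2181],
}


def port_in_use(port, host="127.0.0.1"):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def find_conflicts(ports=DEFAULT_PORTS, probe=port_in_use):
    """Map each service to the subset of its ports that already have a listener."""
    conflicts = {}
    for service, service_ports in ports.items():
        busy = [port for port in service_ports if probe(port)]
        if busy:
            conflicts[service] = busy
    return conflicts
```

Any service reported by `find_conflicts()` is a candidate for one of the `DATAHUB_MAPPED_*` port overrides, or for stopping the process that currently holds the port.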
Related Pages
- Implementation:Datahub_project_Datahub_Run_Quickstart_Preflight_Checks
- Implementation:Datahub_project_Datahub_Pip_Install_Datahub_Docker
- Implementation:Datahub_project_Datahub_Docker_CLI_Quickstart
- Implementation:Datahub_project_Datahub_Check_Docker_Quickstart
- Implementation:Datahub_project_Datahub_Docker_CLI_Ingest_Sample_Data
- Implementation:Datahub_project_Datahub_Docker_CLI_Lifecycle