Principle:Datahub project Datahub CLI Installation For Docker

From Leeroopedia


Field | Value
Principle Name | CLI Installation For Docker
Namespace | Datahub_project_Datahub
Workflow | Docker_Quickstart_Deployment
Type | Principle
Last Updated | 2026-02-10
Source Repository | datahub-project/datahub
Domains | Deployment, Docker, Metadata_Management

Overview

This principle covers installing the DataHub CLI for Docker-based deployment management. The package is the same one used for general CLI installation, but the focus here is the docker subcommand group (quickstart, nuke, ingest-sample-data, check). No connector extras are needed for Docker commands.

Description

The DataHub CLI is distributed as the acryl-datahub Python package on PyPI. For Docker-based deployment management, the base installation without extras is sufficient: the Docker commands rely only on framework dependencies (Click, Docker SDK, PyYAML, etc.) that are included in the default installation.

The CLI follows a single entry point pattern where one executable (datahub) provides multiple subcommand groups for different operational contexts:

  • datahub docker quickstart -- Launch the DataHub stack
  • datahub docker nuke -- Destroy the DataHub stack
  • datahub docker ingest-sample-data -- Load demonstration data
  • datahub docker check -- Verify container health

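The docker subcommand group is registered in the entrypoints.py module, which imports the docker Click group from datahub.cli.docker_cli and attaches it with datahub.add_command(docker). The registration sits at line 362 as of this page's last update; the exact line number may shift between releases.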
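The single entry point pattern described above can be sketched with nested subcommands. This is a minimal illustration, not DataHub's actual code: argparse stands in for Click so that the sketch needs only the standard library, and the helper names build_parser and resolve are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Root command, analogous to the top-level `datahub` executable.
    root = argparse.ArgumentParser(prog="datahub")
    groups = root.add_subparsers(dest="group", required=True)

    # One subcommand group, analogous to `datahub docker`.
    docker = groups.add_parser("docker", help="Docker deployment commands")
    commands = docker.add_subparsers(dest="command", required=True)
    for name in ("quickstart", "nuke", "ingest-sample-data", "check"):
        commands.add_parser(name)
    return root


def resolve(argv: list[str]) -> tuple[str, str]:
    """Return the (group, command) pair selected by argv."""
    args = build_parser().parse_args(argv)
    return args.group, args.command
```

Calling resolve(["docker", "quickstart"]) selects the docker group and its quickstart command. An unknown command makes argparse exit with a usage error. In the real CLI, Click's add_command plays the role that add_parser plays here.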

The package requires Python >= 3.10 and installs the following key framework dependencies relevant to Docker operations:

  • click -- CLI framework
  • docker -- Docker SDK for Python (container management)
  • PyYAML -- Compose file parsing
  • requests / requests_file -- Downloading compose files
  • expandvars -- Environment variable expansion in compose files
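A quick way to confirm that an environment satisfies the requirements listed above is to check the interpreter version and whether each dependency can be found. The sketch below assumes the usual import names for these packages (PyYAML is imported as yaml); the helper names are hypothetical and are not part of the DataHub CLI.

```python
import importlib.util
import sys

# Import names for the dependencies listed above (PyYAML imports as `yaml`).
DOCKER_CLI_MODULES = [
    "click", "docker", "yaml", "requests", "requests_file", "expandvars",
]


def python_is_supported(version=None, minimum=(3, 10)) -> bool:
    """True if the given (or running) interpreter meets the minimum version."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

After a successful pip install acryl-datahub, missing_modules(DOCKER_CLI_MODULES) would be expected to return an empty list. A non-empty result points at a broken or partial installation.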

Usage

Apply this principle when setting up a local DataHub development or evaluation environment using Docker. The installation is a single pip command with no extras required for Docker-only usage.

# Install the base package (sufficient for docker commands)
pip install acryl-datahub

# Verify installation
datahub version

# Now docker commands are available
datahub docker quickstart

For ingestion from external data sources, additional extras would be needed (e.g., pip install 'acryl-datahub[mysql,snowflake]'), but these are not required for Docker deployment management.
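The difference between the base install and an install with extras comes down to the requirement string passed to pip. The following sketch builds that string and the matching pip argument list; the function names are hypothetical, and only the package name and the mysql and snowflake extras come from the text above.

```python
import sys


def requirement(package: str = "acryl-datahub", extras: tuple[str, ...] = ()) -> str:
    """Build a pip requirement string, adding [extras] only when given."""
    if not extras:
        return package
    return f"{package}[{','.join(extras)}]"


def pip_command(extras: tuple[str, ...] = ()) -> list[str]:
    """Argument list for installing the package with the current interpreter."""
    return [sys.executable, "-m", "pip", "install", requirement(extras=extras)]
```

Passing the result of pip_command() to subprocess.run as a list avoids shell quoting altogether. That quoting is the reason the shell form wraps 'acryl-datahub[mysql,snowflake]' in quotes, since some shells treat square brackets as glob characters.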

Theoretical Basis

This principle follows the single entry point pattern -- one CLI tool provides multiple subcommand groups for different operational contexts. This reduces cognitive overhead for users by consolidating all DataHub operations under a single command namespace rather than requiring separate tools for deployment, ingestion, and administration.

The pattern also enables shared infrastructure (configuration, telemetry, logging) across all subcommands while keeping individual command groups self-contained in their dependencies.
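One way to picture shared infrastructure alongside self-contained dependencies is a dispatcher that configures logging once and imports a group's dependency only when that group is invoked. This is an illustrative sketch, not DataHub's implementation: the registry, the run function, and the standard-library modules used as stand-in dependencies are all assumptions.

```python
import importlib
import logging

log = logging.getLogger("cli_sketch")

# Hypothetical registry: group name -> module that group depends on.
# Standard-library modules stand in for real third-party dependencies.
GROUP_DEPENDENCIES = {
    "docker": "json",
    "ingest": "csv",
}


def run(group: str, debug: bool = False) -> str:
    """Dispatch to a command group and return the name of the module it loaded."""
    # Shared infrastructure: every group gets the same logging setup.
    log.setLevel(logging.DEBUG if debug else logging.INFO)
    if group not in GROUP_DEPENDENCIES:
        raise ValueError(f"unknown command group: {group}")
    # Self-contained dependencies: import only what the selected group needs.
    module = importlib.import_module(GROUP_DEPENDENCIES[group])
    log.debug("loaded %s for group %s", module.__name__, group)
    return module.__name__
```

With this layout, a user who only runs the docker group never triggers the import of another group's dependency. The same reasoning explains why connector extras can stay optional for Docker-only usage.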

Related Pages

Implementation:Datahub_project_Datahub_Pip_Install_Datahub_Docker
