Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Implementation:DataExpert io Data engineer handbook Docker Compose PostgreSQL Stack

From Leeroopedia


Overview

This page documents the docker-compose.yml configuration used to provision a PostgreSQL database and PGAdmin administration interface for the Dimensional Data Modeling workflow. The stack provides a fully containerized environment for running SQL exercises against the actor_films dataset.

Type

External Tool Doc

Source

Services

The Docker Compose stack defines two services:

Service: postgres

Property Value Description
Image postgres:14 Official PostgreSQL 14 Docker image.
Container Name ${DOCKER_CONTAINER} Configurable via environment variable for easy identification.
Restart Policy always Ensures the container restarts automatically on failure or host reboot.
Ports ${HOST_PORT}:5432 Maps the configurable host port to PostgreSQL's default internal port.
Schema ${POSTGRES_SCHEMA} Optional schema configuration for the database.

Environment Variables:

environment:
  POSTGRES_USER: ${POSTGRES_USER}
  POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
  POSTGRES_DB: ${POSTGRES_DB}

Volume Mounts:

volumes:
  - postgres-data:/var/lib/postgresql/data        # Named volume for data persistence
  - ./scripts/init-db.sh:/docker-entrypoint-initdb.d/init-db.sh  # Initialization script
  - ./data/data.dump:/docker-entrypoint-initdb.d/data.dump        # Base dataset dump
  - ./homework:/docker-entrypoint-initdb.d/homework               # Homework SQL files

The volume configuration achieves three goals:

  • Data persistence -- The named volume postgres-data ensures database contents survive container restarts.
  • Automatic initialization -- Files in /docker-entrypoint-initdb.d/ are automatically executed when the database is first created.
  • Homework injection -- The homework directory is mounted so that exercise SQL files are available to the init script.

Service: pgadmin

Property Value Description
Image dpage/pgadmin4 Official PGAdmin 4 Docker image providing a web-based database administration interface.
Ports ${PGADMIN_PORT}:80 Maps the configurable host port to PGAdmin's internal HTTP port.
Depends On postgres Ensures the PostgreSQL service starts before PGAdmin.

Environment Variables:

environment:
  PGADMIN_DEFAULT_EMAIL: ${PGADMIN_EMAIL}
  PGADMIN_DEFAULT_PASSWORD: ${PGADMIN_PASSWORD}

Configuration Parameters

All configurable parameters are injected via a .env file placed alongside the docker-compose.yml:

Parameter Description Example Value
DOCKER_CONTAINER Name assigned to the PostgreSQL container. dim-model-postgres
POSTGRES_SCHEMA Optional default schema for the database. public
POSTGRES_USER PostgreSQL superuser account name. postgres
POSTGRES_PASSWORD Password for the PostgreSQL superuser. postgres
POSTGRES_DB Name of the default database created on initialization. postgres
HOST_PORT Host machine port mapped to PostgreSQL's port 5432. 5433
PGADMIN_EMAIL Email address used as the PGAdmin login username. admin@admin.com
PGADMIN_PASSWORD Password for the PGAdmin web interface. admin
PGADMIN_PORT Host machine port mapped to PGAdmin's port 80. 5050

Outputs

The stack exposes two network endpoints on the host machine:

Service Host Endpoint Description
PostgreSQL localhost:${HOST_PORT} (default: localhost:5433) PostgreSQL database accessible via psql, DBeaver, or any PostgreSQL client.
PGAdmin http://localhost:${PGADMIN_PORT} (default: http://localhost:5050) Web-based administration interface for browsing tables, running queries, and inspecting data.

Usage

Starting the Stack

# Navigate to the dimensional modeling directory
cd intermediate-bootcamp/materials/1-dimensional-data-modeling

# Start all services in detached mode
docker compose up -d

Connecting to PostgreSQL

# Connect via psql from the host
psql -h localhost -p ${HOST_PORT} -U ${POSTGRES_USER} -d ${POSTGRES_DB}

# Or execute a query directly
psql -h localhost -p 5433 -U postgres -d postgres -c "SELECT COUNT(*) FROM actor_films;"

Stopping the Stack

# Stop services (preserves data)
docker compose down

# Stop services and remove data volumes (full reset)
docker compose down -v

Initialization Flow

The following sequence occurs when the stack starts for the first time:

1. Docker Compose creates the 'postgres-data' named volume
2. The postgres:14 container starts and initializes an empty database
3. PostgreSQL detects files in /docker-entrypoint-initdb.d/ and executes them:
   a. init-db.sh runs pg_restore to load data.dump
   b. init-db.sh iterates over homework/*.sql and executes each file via psql
4. PostgreSQL completes initialization and begins accepting connections
5. The pgadmin container starts and connects to the shared Docker network
6. PGAdmin becomes available on the configured host port

On subsequent startups (when the postgres-data volume already contains data), the initialization scripts are not re-executed. To force re-initialization, remove the volume with docker compose down -v.

Network Architecture

Host Machine
  |
  |-- localhost:${HOST_PORT} --> [postgres container:5432]
  |                                  |
  |-- localhost:${PGADMIN_PORT} --> [pgadmin container:80]
  |                                  |
  +-- Docker Network (internal) -----+
       (pgadmin connects to postgres by service name)

Within the Docker network, PGAdmin can reach PostgreSQL using the service name postgres as the hostname, avoiding the need to know the container's IP address.

Related Pages

Metadata

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment