Implementation:DataTalksClub Data engineering zoomcamp Docker Compose PostgreSQL Setup
| Metadata | |
|---|---|
| Knowledge Sources | DataTalksClub/data-engineering-zoomcamp |
| Domains | Docker, PostgreSQL, pgAdmin, Infrastructure |
| Last Updated | 2026-02-09 14:00 GMT |
Overview
Concrete tool for provisioning a PostgreSQL database and pgAdmin web interface as co-located Docker containers using a single docker-compose.yaml file.
Description
This implementation defines a two-service Docker Compose stack for the NYC taxi data ingestion workflow. The first service, pgdatabase, runs PostgreSQL 18 and is configured through environment variables with the user root, the password root, and a default database named ny_taxi. The second service, pgadmin, runs the pgAdmin 4 web interface for database administration. Both services use named volumes for persistent storage and publish ports to the host machine.
The Compose file follows the modern Compose Specification format (no version key required). Docker Compose automatically creates a default bridge network that allows the two services to communicate using their service names as DNS hostnames.
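As a rough illustration of the service-name DNS behaviour described above, the sketch below builds a PostgreSQL connection URL for two vantage points: the host machine, which uses the published port on localhost, and another container on the Compose network, which uses the service name pgdatabase. The helper name `pg_url` is hypothetical and not part of the repository; the credentials, port, and database name are taken from the Compose file.

```shell
# Hypothetical helper (not part of the repository): builds a connection URL
# from the values defined in docker-compose.yaml.
pg_url() {
  host="$1"            # "localhost" from the host, "pgdatabase" inside the Compose network
  port="${2:-5432}"    # container port; also the published host port in this file
  printf 'postgresql://root:root@%s:%s/ny_taxi\n' "$host" "$port"
}

# From the host machine, connect through the published port.
pg_url localhost
# From another container on the Compose network (for example pgAdmin),
# connect through the service name instead.
pg_url pgdatabase
```

The same distinction applies when registering the server in pgAdmin: because pgAdmin runs inside the Compose network, the host name to enter there is pgdatabase rather than localhost.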
Usage
Use this implementation to stand up the complete database infrastructure for the data engineering zoomcamp pipeline. Start the Compose stack before running the data ingestion script, since the script needs a reachable PostgreSQL server. Once the stack is up, pgAdmin is accessible at http://localhost:8085 for visual query execution and database inspection.
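A container that is reported as running may still need a few seconds before PostgreSQL accepts connections, so it can help to wait for readiness before starting ingestion. The following is a minimal sketch of one way to do that, not something the repository provides: `wait_for` is a hypothetical retry helper, and the example invocation in the comment assumes the `pg_isready` utility that ships in the official postgres image.

```shell
# Hypothetical helper: run a command repeatedly until it succeeds
# or the given number of attempts is exhausted.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Success: stop retrying and report readiness to the caller.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # All attempts failed.
  return 1
}

# Example invocation (requires the stack to be started first):
#   wait_for 30 1 docker-compose exec -T pgdatabase pg_isready -U root -d ny_taxi
```

If the helper returns a non-zero status, the database did not become ready within the allotted attempts and the ingestion script should not be started.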
Code Reference
Source Location: 01-docker-terraform/docker-sql/pipeline/docker-compose.yaml:L1-27
Signature:
services:
  pgdatabase:
    image: postgres:18
    environment:
      POSTGRES_USER: "root"
      POSTGRES_PASSWORD: "root"
      POSTGRES_DB: "ny_taxi"
    volumes:
      - ny_taxi_postgres_data:/var/lib/postgresql
    ports:
      - "5432:5432"
  pgadmin:
    image: dpage/pgadmin4
    environment:
      PGADMIN_DEFAULT_EMAIL: "admin@admin.com"
      PGADMIN_DEFAULT_PASSWORD: "root"
    volumes:
      - pgadmin_data:/var/lib/pgadmin
    ports:
      - "8085:80"
volumes:
  ny_taxi_postgres_data:
  pgadmin_data:
Import: N/A (external tool, requires Docker Engine and Docker Compose installed on the host)
I/O Contract
Inputs
| Name | Type | Description |
|---|---|---|
| docker-compose.yaml | File | The Compose definition file located in the pipeline directory |
| Docker Engine | Runtime | A running Docker daemon on the host machine |
Outputs
| Name | Type | Description |
|---|---|---|
| PostgreSQL service | Container | PostgreSQL 18 server listening on host port 5432, database ny_taxi, user root |
| pgAdmin service | Container | pgAdmin 4 web UI listening on host port 8085, login admin@admin.com / root |
| ny_taxi_postgres_data | Named Volume | Persistent storage for PostgreSQL data files |
| pgadmin_data | Named Volume | Persistent storage for pgAdmin configuration and session data |
Usage Examples
Starting the environment:
# Navigate to the pipeline directory
cd 01-docker-terraform/docker-sql/pipeline
# Start all services in detached mode
docker-compose up -d
# Verify both containers are running
docker-compose ps
Connecting to PostgreSQL from the host:
# Using psql CLI
psql -h localhost -p 5432 -U root -d ny_taxi
# Using pgcli (included in dev dependencies)
pgcli -h localhost -p 5432 -U root -d ny_taxi
Stopping the environment:
# Stop and remove containers (data persists in named volumes)
docker-compose down
# Stop and remove containers AND volumes (destroys all data)
docker-compose down -v
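To confirm that data survives a plain `down`, it helps to know the full name of the volume on the host. Compose prefixes named volumes with the project name, which defaults to the name of the directory holding the Compose file unless COMPOSE_PROJECT_NAME is set. The sketch below derives that name under those assumptions; `compose_volume_name` is a hypothetical helper, and it does not reproduce the normalization Compose applies to unusual directory names.

```shell
# Hypothetical helper: derive the full volume name Compose is expected to create.
# Assumes the project name is COMPOSE_PROJECT_NAME if set, otherwise the
# current directory name (no normalization of special characters is applied).
compose_volume_name() {
  project="${COMPOSE_PROJECT_NAME:-$(basename "$PWD")}"
  printf '%s_%s\n' "$project" "$1"
}

# Example (run from the pipeline directory after `docker-compose down`):
#   docker volume inspect "$(compose_volume_name ny_taxi_postgres_data)"
```

When run from the pipeline directory with no override, the helper yields pipeline_ny_taxi_postgres_data. After `docker-compose down -v` the inspect command is expected to fail, because the volume has been removed.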
Related Pages
- Principle:DataTalksClub_Data_engineering_zoomcamp_Environment_Setup
- Implementation:DataTalksClub_Data_engineering_zoomcamp_Docker_Build_Run
- Implementation:DataTalksClub_Data_engineering_zoomcamp_SQLAlchemy_Create_Engine
- Environment:DataTalksClub_Data_engineering_zoomcamp_Docker_PostgreSQL_Python_Environment