Environment:DataTalksClub Data engineering zoomcamp Docker PostgreSQL Python Environment
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Data_Ingestion |
| Last Updated | 2026-02-09 07:00 GMT |
`01-docker-terraform/docker-sql/pipeline/pyproject.toml`, `01-docker-terraform/docker-sql/pipeline/docker-compose.yaml`
Overview
Python 3.13+ environment with PostgreSQL 18, pgAdmin 4, and data ingestion libraries (pandas, SQLAlchemy, pyarrow) for the Docker-based taxi data pipeline.
Description
This environment provides the runtime context for the Module 1 Docker and PostgreSQL data ingestion pipeline. It consists of a Python 3.13 application container that downloads NYC taxi CSV data and loads it into a PostgreSQL 18 database using pandas chunked reading and SQLAlchemy. A pgAdmin 4 web UI is included for database inspection. The Python project uses the uv package manager and is containerized via a slim Python base image.
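The chunked load described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual script: the function name `load_csv_in_chunks`, the table name `trips`, and the sample columns are hypothetical, and an in-memory SQLite connection stands in for the PostgreSQL engine so the snippet runs without the containers.

```python
import io
import sqlite3

import pandas as pd


def load_csv_in_chunks(source, con, table, chunksize=100_000):
    """Stream a CSV into a SQL table chunk by chunk; return rows written."""
    total = 0
    # With chunksize set, read_csv yields DataFrames one at a time
    # instead of loading the whole file into memory.
    for i, chunk in enumerate(pd.read_csv(source, chunksize=chunksize)):
        # Replace the table on the first chunk, append on later ones.
        mode = "replace" if i == 0 else "append"
        chunk.to_sql(name=table, con=con, if_exists=mode, index=False)
        total += len(chunk)
    return total


# Hypothetical sample rows standing in for a taxi CSV file.
sample_csv = io.StringIO(
    "trip_id,passenger_count,fare_amount\n"
    "1,1,7.5\n"
    "2,2,12.0\n"
    "3,1,5.25\n"
    "4,3,30.0\n"
    "5,1,9.0\n"
)

# SQLite stand-in (assumption); against the compose stack, `con` would be
# a SQLAlchemy engine pointing at the PostgreSQL service.
con = sqlite3.connect(":memory:")
rows_written = load_csv_in_chunks(sample_csv, con, "trips", chunksize=2)
row_count = con.execute("SELECT COUNT(*) FROM trips").fetchone()[0]
```

Replacing the table on the first chunk keeps reruns idempotent, at the cost of dropping whatever the table held before.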
Usage
Use this environment for any data ingestion workflow that loads CSV data into PostgreSQL. It is a prerequisite for running the Docker_Compose_PostgreSQL_Setup, Pandas_Dtype_Configuration, SQLAlchemy_Create_Engine, Pandas_Chunked_CSV_Loading, and Docker_Build_Run implementations.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, or Windows with Docker | Docker Desktop or Docker Engine required |
| Software | Docker Engine + Docker Compose | Compose V2 recommended |
| Disk | ~2 GB free | For Docker images and taxi data |
| Network | Internet access | Downloads CSV data from GitHub releases |
Dependencies
Container Images
- `python:3.13.11-slim` (application base)
- `postgres:18` (database)
- `dpage/pgadmin4` (database UI)
Python Packages
From `pyproject.toml` (requires Python >= 3.13):
- `click` >= 8.3.1
- `pandas` >= 2.3.3
- `psycopg2-binary` >= 2.9.11
- `pyarrow` >= 22.0.0
- `sqlalchemy` >= 2.0.44
- `tqdm` >= 4.67.1
Dev Dependencies
- `jupyter` >= 1.1.1
- `pgcli` >= 4.3.0
Credentials
The following credentials are used in the development Docker Compose setup:
- `POSTGRES_USER`: PostgreSQL username (default: `root`)
- `POSTGRES_PASSWORD`: PostgreSQL password (default: `root`)
- `POSTGRES_DB`: Database name (default: `ny_taxi`)
- `PGADMIN_DEFAULT_EMAIL`: pgAdmin login email (default: `admin@admin.com`)
- `PGADMIN_DEFAULT_PASSWORD`: pgAdmin password (default: `root`)
Warning: These are development-only defaults. Never use these credentials in production.
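One way to avoid hard-coding these values is to read them from the environment and fall back to the development defaults. The sketch below is an assumption about how a script might do this; `build_postgres_url` is a hypothetical helper, not part of the course code, and the `postgresql+psycopg2` scheme assumes the psycopg2 driver listed in the dependencies.

```python
import os
from urllib.parse import quote


def build_postgres_url(env=None, host="localhost", port=5432):
    """Build a SQLAlchemy-style PostgreSQL URL from environment variables."""
    env = os.environ if env is None else env
    # Fall back to the development defaults from the compose file.
    user = env.get("POSTGRES_USER", "root")
    password = env.get("POSTGRES_PASSWORD", "root")
    database = env.get("POSTGRES_DB", "ny_taxi")
    # Percent-encode user and password so characters such as "@" or "/"
    # cannot be mistaken for URL delimiters.
    return (
        f"postgresql+psycopg2://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )
```

The returned string can be passed to `sqlalchemy.create_engine`. From another container on the same Compose network, `host` would be the service name (`pgdatabase` in the compose file) rather than `localhost`.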
Quick Install
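# Start PostgreSQL and pgAdmin via Docker Compose
docker compose up -d
# Install Python dependencies with uv (reads pyproject.toml in the current directory)
uv sync
# Alternative with pip; quote each specifier so the shell does not treat ">" as output redirection
pip install "click>=8.3.1" "pandas>=2.3.3" "psycopg2-binary>=2.9.11" "pyarrow>=22.0.0" "sqlalchemy>=2.0.44" "tqdm>=4.67.1"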
Code Evidence
Python version requirement from `pyproject.toml:6`:
requires-python = ">=3.13"
Dependency declarations from `pyproject.toml:7-14`:
dependencies = [
"click>=8.3.1",
"pandas>=2.3.3",
"psycopg2-binary>=2.9.11",
"pyarrow>=22.0.0",
"sqlalchemy>=2.0.44",
"tqdm>=4.67.1",
]
Docker Compose service definitions from `docker-compose.yaml:1-27`:
services:
  pgdatabase:
    image: postgres:18
    environment:
      POSTGRES_USER: "root"
      POSTGRES_PASSWORD: "root"
      POSTGRES_DB: "ny_taxi"
    ports:
      - "5432:5432"
  pgadmin:
    image: dpage/pgadmin4
    ports:
      - "8085:80"
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `psycopg2.OperationalError: could not connect to server` | PostgreSQL container not running, or still starting up | Run `docker compose up -d` and wait until the database accepts connections |
| `ModuleNotFoundError: No module named 'psycopg2'` | Missing psycopg2-binary | `pip install "psycopg2-binary>=2.9.11"` (quoted so the shell does not treat `>` as a redirect) |
| Port 5432 already in use | Local PostgreSQL instance running | Stop local PostgreSQL or change port mapping in docker-compose.yaml |
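Connection errors right after `docker compose up -d` often mean the database is still starting. A rough way to wait for it, sketched here with only the standard library, is to poll the published port. `wait_for_port` is a hypothetical helper, and an open TCP port is only an approximation: it does not prove PostgreSQL is ready to serve queries.

```python
import socket
import time


def wait_for_port(host="localhost", port=5432, timeout=30.0, interval=0.5):
    """Return True once a TCP connection succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling the same function before starting the containers gives a hint about the third row of the table: a `True` result at that point suggests another process already holds port 5432.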
Compatibility Notes
- Python version: Requires Python 3.13+. The `pyproject.toml` specifies `requires-python = ">=3.13"`, so installers that honor this field (such as uv and pip) refuse to install the project on earlier Python versions.
- Docker Compose: Uses Compose V2 syntax (no `version` key). Docker Compose V1 (`docker-compose` command) may also work but V2 (`docker compose`) is recommended.
- ARM64 (Apple Silicon): All specified images have ARM64 variants available.
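A script run outside the container could guard against an unsupported interpreter before importing anything else. The following is a small sketch under that assumption; `python_is_supported` and `MINIMUM_PYTHON` are hypothetical names that mirror the `requires-python` setting.

```python
import sys

MINIMUM_PYTHON = (3, 13)  # mirrors requires-python = ">=3.13"


def python_is_supported(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True when the interpreter version meets the minimum."""
    version = sys.version_info if version_info is None else version_info
    # Compare only (major, minor) so patch releases do not matter.
    return tuple(version[:2]) >= minimum
```

A caller might exit early with a clear message when `python_is_supported()` returns `False`, rather than failing later on an import or syntax error.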
Related Pages
- Implementation:DataTalksClub_Data_engineering_zoomcamp_Docker_Compose_PostgreSQL_Setup
- Implementation:DataTalksClub_Data_engineering_zoomcamp_Pandas_Dtype_Configuration
- Implementation:DataTalksClub_Data_engineering_zoomcamp_SQLAlchemy_Create_Engine
- Implementation:DataTalksClub_Data_engineering_zoomcamp_Pandas_Chunked_CSV_Loading
- Implementation:DataTalksClub_Data_engineering_zoomcamp_Docker_Build_Run