Environment:DataTalksClub Data engineering zoomcamp Docker PostgreSQL Python Environment

From Leeroopedia


Knowledge Sources: 01-docker-terraform/docker-sql/pipeline/pyproject.toml, 01-docker-terraform/docker-sql/pipeline/docker-compose.yaml
Domains: Infrastructure, Data_Ingestion
Last Updated: 2026-02-09 07:00 GMT

Overview

Python 3.13+ environment with PostgreSQL 18, pgAdmin 4, and data ingestion libraries (pandas, SQLAlchemy, pyarrow) for the Docker-based taxi data pipeline.

Description

This environment provides the runtime context for the Module 1 Docker and PostgreSQL data ingestion pipeline. It consists of a Python 3.13 application container that downloads NYC taxi CSV data and loads it into a PostgreSQL 18 database using pandas chunked reading and SQLAlchemy. A pgAdmin 4 web UI is included for database inspection. The Python project uses the uv package manager and is containerized via a slim Python base image.
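The chunked loading pattern described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual script: the function name `load_csv_in_chunks`, the table name `demo_trips`, the sample columns, and the in-memory SQLite engine are all assumptions chosen so the example runs without a database container.

```python
import io

import pandas as pd
from sqlalchemy import create_engine, text


def load_csv_in_chunks(source, engine, table_name, chunksize=100_000):
    """Stream a CSV into a SQL table chunk by chunk; return the rows written."""
    total_rows = 0
    # With chunksize set, read_csv yields DataFrames instead of one large frame.
    for index, chunk in enumerate(pd.read_csv(source, chunksize=chunksize)):
        # Replace the table on the first chunk, then append the remaining chunks.
        mode = "replace" if index == 0 else "append"
        chunk.to_sql(table_name, con=engine, if_exists=mode, index=False)
        total_rows += len(chunk)
    return total_rows


# Hypothetical stand-in data; the real pipeline reads NYC taxi CSV files.
sample_csv = io.StringIO(
    "trip_id,passenger_count,fare\n"
    "1,1,7.5\n"
    "2,2,12.0\n"
    "3,1,5.25\n"
    "4,3,20.0\n"
    "5,1,9.0\n"
)

# In-memory SQLite stands in for PostgreSQL in this sketch.
demo_engine = create_engine("sqlite://")
rows_written = load_csv_in_chunks(sample_csv, demo_engine, "demo_trips", chunksize=2)

with demo_engine.connect() as conn:
    row_count = conn.execute(text("SELECT COUNT(*) FROM demo_trips")).scalar_one()
```

To target the PostgreSQL container instead, the same function should accept an engine created from a `postgresql+psycopg2://` URL. Reading in chunks keeps memory use bounded by the chunk size rather than the file size.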

Usage

Use this environment for any data ingestion workflow that loads CSV data into PostgreSQL. It is the mandatory prerequisite for running the Docker_Compose_PostgreSQL_Setup, Pandas_Dtype_Configuration, SQLAlchemy_Create_Engine, Pandas_Chunked_CSV_Loading, and Docker_Build_Run implementations.

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, or Windows with Docker | Docker Desktop or Docker Engine required |
| Software | Docker Engine + Docker Compose | Compose V2 recommended |
| Disk | ~2GB free | For Docker images and taxi data |
| Network | Internet access | Downloads CSV data from GitHub releases |

Dependencies

Container Images

  • `python:3.13.11-slim` (application base)
  • `postgres:18` (database)
  • `dpage/pgadmin4` (database UI)

Python Packages

From `pyproject.toml` (requires Python >= 3.13):

  • `click` >= 8.3.1
  • `pandas` >= 2.3.3
  • `psycopg2-binary` >= 2.9.11
  • `pyarrow` >= 22.0.0
  • `sqlalchemy` >= 2.0.44
  • `tqdm` >= 4.67.1

Dev Dependencies

  • `jupyter` >= 1.1.1
  • `pgcli` >= 4.3.0

Credentials

The following credentials are used in the development Docker Compose setup:

  • `POSTGRES_USER`: PostgreSQL username (default: `root`)
  • `POSTGRES_PASSWORD`: PostgreSQL password (default: `root`)
  • `POSTGRES_DB`: Database name (default: `ny_taxi`)
  • `PGADMIN_DEFAULT_EMAIL`: pgAdmin login email (default: `admin@admin.com`)
  • `PGADMIN_DEFAULT_PASSWORD`: pgAdmin password (default: `root`)

Warning: These are development-only defaults. Never use these credentials in production.
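One way to turn these variables into a SQLAlchemy connection URL is sketched below. The helper name `build_postgres_url` is hypothetical, and the `localhost` host and port `5432` are assumptions based on the development port mapping; the fallback values mirror the development defaults listed above.

```python
import os
from urllib.parse import quote


def build_postgres_url(env=None, host="localhost", port=5432):
    """Build a SQLAlchemy URL from POSTGRES_* variables, using dev defaults."""
    env = os.environ if env is None else env
    # Percent-encode credentials so characters such as "@" or "/" stay valid.
    user = quote(env.get("POSTGRES_USER", "root"), safe="")
    password = quote(env.get("POSTGRES_PASSWORD", "root"), safe="")
    database = env.get("POSTGRES_DB", "ny_taxi")
    return f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}"


# With no overrides, the URL reflects the development defaults.
default_url = build_postgres_url({})
```

Reading the values from the environment rather than hard-coding them makes it easier to swap in real credentials outside development.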

Quick Install

# Start PostgreSQL and pgAdmin via Docker Compose
docker compose up -d

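# Install Python dependencies with pip (with uv, run `uv sync` in the project directory instead)
# Quote each specifier so the shell does not treat ">" as an output redirect
pip install "click>=8.3.1" "pandas>=2.3.3" "psycopg2-binary>=2.9.11" "pyarrow>=22.0.0" "sqlalchemy>=2.0.44" "tqdm>=4.67.1"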

Code Evidence

Python version requirement from `pyproject.toml:6`:

requires-python = ">=3.13"

Dependency declarations from `pyproject.toml:7-14`:

dependencies = [
    "click>=8.3.1",
    "pandas>=2.3.3",
    "psycopg2-binary>=2.9.11",
    "pyarrow>=22.0.0",
    "sqlalchemy>=2.0.44",
    "tqdm>=4.67.1",
]

Docker Compose service definitions from `docker-compose.yaml:1-27`:

services:
  pgdatabase:
    image: postgres:18
    environment:
      POSTGRES_USER: "root"
      POSTGRES_PASSWORD: "root"
      POSTGRES_DB: "ny_taxi"
    ports:
      - "5432:5432"
  pgadmin:
    image: dpage/pgadmin4
    ports:
      - "8085:80"

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `psycopg2.OperationalError: could not connect to server` | PostgreSQL container not running or not yet ready | Run `docker compose up -d` and wait until the database accepts connections |
| `ModuleNotFoundError: No module named 'psycopg2'` | Missing psycopg2-binary | `pip install "psycopg2-binary>=2.9.11"` |
| Port 5432 already in use | Local PostgreSQL instance running | Stop local PostgreSQL or change the port mapping in docker-compose.yaml |
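The connection error above often appears when a script starts before the database container is listening. A minimal sketch of a readiness wait using only the standard library follows; the helper name `wait_for_port` and its timeout values are assumptions, and an open TCP port only indicates that something is listening, not that PostgreSQL has finished initializing.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection succeeds, False after the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: stop at the deadline, otherwise retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A script might call `wait_for_port("localhost", 5432)` before creating its engine and exit with a clear message if the result is `False`.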

Compatibility Notes

  • Python version: Requires Python 3.13+. The `pyproject.toml` specifies `requires-python = ">=3.13"`, so installers that honor this field (such as pip and uv) refuse to install the project on earlier versions.
  • Docker Compose: Uses Compose V2 syntax (no `version` key). Docker Compose V1 (`docker-compose` command) may also work but V2 (`docker compose`) is recommended.
  • ARM64 (Apple Silicon): All specified images have ARM64 variants available.
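For scripts run outside the container, the Python version requirement can be checked up front. The sketch below is an illustration only; the names `MINIMUM_PYTHON` and `python_is_supported` are hypothetical and not part of the project.

```python
import sys

# Mirrors requires-python = ">=3.13" from pyproject.toml.
MINIMUM_PYTHON = (3, 13)


def python_is_supported(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only major and minor, ignoring the patch level.
    return tuple(version_info[:2]) >= tuple(minimum)
```

A script could call `python_is_supported()` at startup and exit with an explanatory message when it returns `False`, rather than failing later on an unsupported interpreter.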
