Environment:DataExpert io Data engineer handbook Python Development Environment

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Development
Last Updated: 2026-02-09 06:00 GMT

Overview

Python 3.11+ development environment with Docker, required across all bootcamp modules.

Description

This environment defines the base software prerequisites shared across all Data Engineer Handbook bootcamp modules. It requires Python 3.11 or higher for local development and Docker for running containerized infrastructure. The intermediate bootcamp additionally requires a SQL editor such as DataGrip. Individual modules layer their own specific dependencies (PySpark, Flink, Flask) on top of this base environment.

Usage

Use this environment as the base prerequisite for all bootcamp workflows. Every module assumes Docker and Python are installed locally. The PySpark Testing workflow specifically requires local Python with pytest and chispa for running tests outside Docker.
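
As a quick way to confirm that the local interpreter can see the two test packages named above, a minimal sketch follows. The helper name `missing_modules` is hypothetical and not part of the handbook; it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# pytest and chispa are the packages the PySpark Testing workflow expects locally.
missing = missing_modules(["pytest", "chispa"])
if missing:
    print("Missing test dependencies:", ", ".join(missing))
else:
    print("pytest and chispa are importable")
```

Run it with the same interpreter that will run the tests, since a package installed for one Python installation is not visible to another.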

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, or Windows | Docker Desktop required on macOS/Windows |
| Language | Python 3.11 or higher | As specified in bootcamp prerequisites |
| Software | Docker (Docker Desktop or Docker Engine) | Required for all infrastructure modules |
| Software | DataGrip or any SQL editor | Required for intermediate bootcamp SQL exercises |
| Software | Git | For cloning the repository |

Dependencies

System Packages

  • Python >= 3.11
  • Docker (Docker Desktop on macOS/Windows, Docker Engine on Linux)
  • pip (Python package manager; included with the python.org installers, but packaged separately on some Linux distributions)
  • Git (for repository access)
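
The list above can be checked from Python itself. The following is a sketch under stated assumptions: the function names `python_ok` and `missing_tools` are hypothetical, and the check only confirms that `docker` and `git` are on `PATH`, not that they work.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # base requirement stated in this page


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def missing_tools(tools=("docker", "git"), which=shutil.which):
    """Return the command names that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


print("Python version OK:", python_ok())
print("Tools missing from PATH:", missing_tools() or "none")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.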

SQL Tooling

  • DataGrip (recommended) or any SQL editor capable of connecting to PostgreSQL

Credentials

No credentials are required for the base environment. Individual modules define their own credential requirements.

Quick Install

# Verify Python version (must be 3.11+)
python3 --version

# Verify Docker is installed
docker --version
docker compose version

# Clone the repository
git clone https://github.com/DataExpert-io/data-engineer-handbook.git
cd data-engineer-handbook

Code Evidence

Beginner bootcamp prerequisites from `beginner-bootcamp/software.md:1-4`:

Make sure your computer can run:
- Docker (install guide here)
- Python 3.11 (or higher)

Intermediate bootcamp prerequisites from `intermediate-bootcamp/software.md:1-7`:

Make sure your computer can run:
- Docker (install guide here)
- Python 3.11 (or higher)
- DataGrip (install here) (or any other SQL editor)

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `python3: command not found` | Python not installed or not in PATH | Install Python 3.11+ from python.org or via package manager |
| `docker: command not found` | Docker not installed | Install Docker Desktop (macOS/Windows) or Docker Engine (Linux) |
| `docker compose` not recognized | Old Docker Compose v1 installed | Upgrade to Docker Compose v2 (bundled with Docker Desktop) |
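
The errors above can be told apart programmatically. This sketch assumes a hypothetical helper, `check_command`, that runs each verification command and reports whether the executable was missing or ran but failed. Treating a non-zero exit from `docker compose version` as a sign that Compose v2 is absent is an assumption, not a documented guarantee.

```python
import subprocess
import sys


def check_command(args):
    """Run a command and return 'ok', 'not found', or 'failed'."""
    try:
        result = subprocess.run(args, capture_output=True, text=True, timeout=30)
    except FileNotFoundError:
        return "not found"  # executable is not on PATH
    except (OSError, subprocess.TimeoutExpired):
        return "failed"     # could not be started, or did not finish in time
    return "ok" if result.returncode == 0 else "failed"


# The same commands used in Quick Install, plus the running interpreter.
checks = [
    [sys.executable, "--version"],
    ["docker", "--version"],
    ["docker", "compose", "version"],
]
for args in checks:
    print(" ".join(args), "->", check_command(args))
```

A `not found` result for `docker` corresponds to the second row of the table. A `failed` result for `docker compose version` while `docker --version` succeeds points toward the third row.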

Compatibility Notes

  • Python versions: The base requirement is Python 3.11+ on the host, while the Flink module uses Python 3.7.9 inside its Docker container. The two do not conflict because the Flink code runs on the container's interpreter, not the host's.
  • macOS: Docker Desktop for Mac is required. Allocate at least 4 GB of RAM in Docker settings for Spark workloads.
  • Windows: Docker Desktop needs the WSL 2 backend (Hyper-V is the alternative on Windows editions that support it). Native Windows Python works for local development.
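
To see which interpreter a given environment reports, whether host or container, one option is to parse the `Python X.Y.Z` line printed by `python3 --version`. The sketch below uses a hypothetical helper, `parse_python_version`. The host string is an illustrative value; the Flink string uses the version stated above.

```python
import re


def parse_python_version(output):
    """Extract (major, minor, patch) from text such as 'Python 3.11.4'."""
    match = re.search(r"Python (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"unrecognized version output: {output!r}")
    return tuple(int(part) for part in match.groups())


host = parse_python_version("Python 3.11.4")   # illustrative host output
flink = parse_python_version("Python 3.7.9")   # version stated for the Flink container

# Only the host interpreter has to satisfy the 3.11+ requirement.
print("host meets 3.11+:", host >= (3, 11))
print("container meets 3.11+:", flink >= (3, 11))
```

The container result is expected to be lower than the host requirement and needs no action, because that interpreter is used only inside Docker.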
