Environment:Apache Flink Python PyFlink Environment

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Python
Last Updated: 2026-02-09 13:00 GMT

Overview

Python 3.9+ environment with Py4J, Apache Beam, and Cython for building and running PyFlink, the Python API for Apache Flink.

Description

This environment provides the Python runtime and dependencies required to build, install, and run PyFlink. The `flink-python` module uses setuptools with Cython extensions for performance-critical paths. The minimum Python version is 3.9, with support up to Python 3.12. Key dependencies include Py4J for JVM interoperability, Apache Beam for portable runner support, and PyArrow for columnar data exchange.

The build system uses `pyproject.toml` with setuptools and requires Cython >= 0.29.24 for compiling native extensions. Wheel builds are configured for CPython 3.9 through 3.12.
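
The version window described above can be checked before attempting a build. The following is a minimal sketch, assuming the lower bound enforced by `setup.py` (3.9) and the upper bound implied by the wheel build matrix (3.12); the helper name `is_supported_python` is hypothetical and not part of PyFlink.

```python
import sys

# Bounds taken from this page: setup.py rejects anything below 3.9, and the
# wheel build matrix stops at CPython 3.12.
MIN_SUPPORTED = (3, 9)
MAX_SUPPORTED = (3, 12)


def is_supported_python(version_info=None):
    """Return True if (major, minor) lies within the documented range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    verdict = "supported" if is_supported_python() else "outside the documented range"
    print(f"Python {current}: {verdict}")
```

Only the major and minor components are compared, so every patch release of 3.12 counts as supported.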

Usage

Use this environment when developing, building, or running PyFlink applications. This includes writing Python user-defined functions (UDFs), using the Table API from Python, or running Flink jobs via the Python API. This environment is separate from the Java build environment but is typically used alongside it.

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS | Windows has limited support |
| Hardware | x86_64 or ARM64 CPU | No GPU required |
| RAM | 2 GB minimum | More needed for large PyArrow operations |
| Disk | 5 GB SSD | For Python packages and build artifacts |

Dependencies

System Packages

  • Python 3.9 through 3.12 (any patch release)
  • C compiler (gcc or clang) for Cython extensions
  • Java 11, 17, or 21 (the JVM that Py4J bridges to)
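
Whether the system packages above are present can be probed from Python before building. This is a rough sketch: the executable names searched (`gcc`, `clang`, `cc`, `java`) are assumptions about a typical Linux or macOS setup, and the helper `find_build_tools` is hypothetical.

```python
import shutil

# Candidate executable names are assumptions; adjust them for your platform.
C_COMPILERS = ("gcc", "clang", "cc")


def find_build_tools(which=shutil.which):
    """Return the resolved paths of a C compiler and the java launcher.

    A value of None means the tool was not found on PATH.
    """
    compiler = next((path for path in map(which, C_COMPILERS) if path), None)
    return {"c_compiler": compiler, "java": which("java")}


if __name__ == "__main__":
    for tool, path in find_build_tools().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```

The check only confirms that the executables are on `PATH`. It does not verify that the Java installation is one of the listed versions.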

Python Packages

  • `py4j` == 0.10.9.7
  • `python-dateutil` >= 2.8.0, < 3
  • `apache-beam` >= 2.54.0, <= 2.61.0
  • `cloudpickle` >= 2.2.0
  • `avro` >= 1.12.0
  • `pytz` >= 2018.3
  • `fastavro` >= 1.1.0, != 1.8.0
  • `requests` >= 2.26.0
  • `protobuf` >= 3.19.0
  • `numpy` >= 1.22.4
  • `pandas` >= 1.3.0, < 2.3
  • `pyarrow` >= 5.0.0, < 21.0.0
  • `pemja` >= 0.5.6, < 0.5.7 (Linux/macOS only)
  • `httplib2` >= 0.19.0
  • `ruamel.yaml` >= 0.18.4
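
The constraints above can be compared against what is installed in the current environment. The sketch below is deliberately simplified: it handles only plain dotted numeric versions and the operators that appear in this list, and the names `satisfies` and `check_installed` are hypothetical. For real tooling, `packaging.specifiers.SpecifierSet` is the more robust option.

```python
from importlib import metadata

# A subset of the constraints listed above, copied from this page.
REQUIREMENTS = {
    "py4j": "==0.10.9.7",
    "apache-beam": ">=2.54.0,<=2.61.0",
    "fastavro": ">=1.1.0,!=1.8.0",
    "pandas": ">=1.3.0,<2.3",
    "pyarrow": ">=5.0.0,<21.0.0",
}

# Two-character operators come first so ">=" is matched before ">".
OPERATORS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
}


def parse_version(text):
    """Convert '2.61.0' to (2, 61, 0). Raises ValueError on suffixes like 'rc1'."""
    return tuple(int(part) for part in text.strip().split("."))


def _pad(a, b):
    """Right-pad the shorter tuple with zeros so that 2.3 compares equal to 2.3.0."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def satisfies(version, spec):
    """Return True if `version` meets every comma-separated clause in `spec`."""
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPERATORS.items():
            if clause.startswith(symbol):
                left, right = _pad(parse_version(version),
                                   parse_version(clause[len(symbol):]))
                if not compare(left, right):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


def check_installed(requirements=REQUIREMENTS):
    """Map each distribution name to a short status string."""
    report = {}
    for name, spec in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        try:
            ok = satisfies(installed, spec)
        except ValueError:
            report[name] = f"{installed} (not comparable by this sketch)"
            continue
        report[name] = f"{installed} {'ok' if ok else 'violates ' + spec}"
    return report


if __name__ == "__main__":
    for name, status in check_installed().items():
        print(f"{name}: {status}")
```

For instance, `satisfies("1.8.0", ">=1.1.0,!=1.8.0")` evaluates to `False`, which mirrors the explicit fastavro exclusion.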

Build Dependencies

  • `setuptools` >= 75.3
  • `wheel`
  • `cython` >= 0.29.24

Credentials

No credentials are required for building PyFlink from source. All dependencies are available from PyPI.

Quick Install

# Ensure Python 3.9+ is available
python3 --version

# Install build dependencies
# (quote each specifier; an unquoted ">" is treated by the shell as a redirect)
pip install "setuptools>=75.3" wheel "cython>=0.29.24"

# Install PyFlink from the repository
cd flink-python
python setup.py sdist
pip install dist/apache-flink-*.tar.gz

# Or install core runtime dependencies directly
# (see Python Packages above for the full list)
pip install "py4j==0.10.9.7" "apache-beam>=2.54.0,<=2.61.0" "cloudpickle>=2.2.0" \
    "python-dateutil>=2.8.0,<3" "avro>=1.12.0" "pyarrow>=5.0.0,<21.0.0" \
    "numpy>=1.22.4" "pandas>=1.3.0,<2.3" "protobuf>=3.19.0"
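
After either install path, one quick way to confirm the result is to ask the interpreter whether the package is importable. The sketch below assumes the distribution is named `apache-flink` (as the sdist filename above suggests) and that it provides the top-level `pyflink` module; the helper name `pyflink_install_status` is hypothetical.

```python
import importlib.util
from importlib import metadata


def pyflink_install_status(dist_name="apache-flink", module_name="pyflink"):
    """Return the installed version string, "unknown", or None.

    None      -> the module cannot be found by the import system.
    "unknown" -> the module is importable but no distribution metadata
                 was found under `dist_name`.
    """
    if importlib.util.find_spec(module_name) is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return "unknown"


if __name__ == "__main__":
    status = pyflink_install_status()
    if status is None:
        print("pyflink is not importable in this environment")
    else:
        print(f"pyflink is importable (distribution version: {status})")
```

This only locates the module and reads its metadata. It does not start a JVM, so a passing check does not prove that the Java side is configured correctly.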

Code Evidence

Python version check from `flink-python/setup.py:31-34`:

if sys.version_info < (3, 9):
    print("Python versions prior to 3.9 are not supported for PyFlink.",
          file=sys.stderr)
    sys.exit(-1)

Python version constraint from `flink-python/setup.py:346`:

python_requires='>=3.9'

Wheel build matrix from `flink-python/pyproject.toml:81`:

build = ["cp39-*", "cp310-*", "cp311-*", "cp312-*"]

Fastavro exclusion from `flink-python/setup.py:325`:

'fastavro>=1.1.0,!=1.8.0',

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `Python versions prior to 3.9 are not supported for PyFlink.` | Python < 3.9 detected | Upgrade to Python 3.9, 3.10, 3.11, or 3.12 |
| `ModuleNotFoundError: No module named 'pemja'` | pemja is not available on Windows | Use Linux or macOS; pemja is installed only on non-Windows platforms |
| `cython: command not found` | Cython build dependency missing | `pip install "cython>=0.29.24"` |
| Failures with `fastavro` 1.8.0 installed | Known incompatible fastavro version | Use any fastavro version >= 1.1.0 except 1.8.0 |

Compatibility Notes

  • Python 3.8 and below: Not supported. `setup.py` prints an error and exits.
  • Python 3.13+: Not yet in the supported wheel build matrix.
  • Windows: The `pemja` package (Java-Python bridge) is excluded on Windows.
  • fastavro 1.8.0: Explicitly excluded due to a known compatibility issue.
  • Apache Beam: Constrained to a narrow range (2.54.0 to 2.61.0) for API stability.
