Environment:Apache Paimon Python Core Runtime
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Data_Engineering |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Python 3.6+ runtime environment with PyArrow, Pandas, fastavro, and zstandard as core dependencies for the PyPaimon SDK.
Description
This environment defines the core Python runtime and mandatory package dependencies required to run any PyPaimon operation. The SDK supports Python 3.6 through 3.11, with different dependency version constraints per Python version. Key packages include PyArrow for columnar data handling, Pandas/Polars for DataFrame integration, fastavro for manifest file reading, and zstandard/cramjam for compression. Python 3.6 requires special compatibility patches (fastavro zstd block reader) applied at import time.
Usage
Use this environment for all PyPaimon workflows including table read/write, schema operations, catalog interactions, and data format handling. This is the mandatory base prerequisite for every Implementation in the Apache Paimon Python SDK.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, Windows (WSL) | Cross-platform Python package |
| Python | >= 3.6, <= 3.11 | Tested: 3.6, 3.7, 3.8, 3.9, 3.10, 3.11 |
| Disk | 500MB+ | For package installation and temporary files |
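The Python range in the table can be checked before importing the SDK. The following is a minimal sketch, assuming the tested range of 3.6 through 3.11 stated above; `is_supported_python` is a hypothetical helper and not part of PyPaimon.

```python
import sys

# Tested interpreter range from the requirements table (inclusive).
MIN_PYTHON = (3, 6)
MAX_PYTHON = (3, 11)


def is_supported_python(version=None):
    """Return True if (major, minor) lies in the tested range.

    `version` defaults to the running interpreter. This helper is
    illustrative only and is not part of the PyPaimon API.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


if __name__ == "__main__":
    print("supported:", is_supported_python())
```

Note that `setup.py` declares only a lower bound (`python_requires=">=3.6"`), so a newer interpreter may still install the package while falling outside the tested range.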
Dependencies
Python Packages (Python 3.6)
- `pyarrow` >= 6, < 7
- `pandas` >= 1.1, < 2
- `polars` >= 0.9, < 1
- `cachetools` >= 4.2, < 6
- `dataclasses` >= 0.8
- `fastavro` >= 1.4, < 2
- `fsspec` >= 2021.10, < 2026
- `ossfs` >= 2021.8
- `packaging` >= 21, < 26
- `pyroaring`
- `readerwriterlock` >= 1, < 2
- `zstandard` >= 0.19, < 1
Python Packages (Python 3.8+)
- `pyarrow` >= 16, < 20
- `pandas` >= 1.5, < 3 on Python 3.9+; >= 1.3, < 3 on Python 3.7-3.8
- `polars` >= 1, < 2
- `cachetools` >= 5, < 6
- `fastavro` >= 1.4, < 2
- `fsspec` >= 2023, < 2026
- `ossfs` >= 2023
- `packaging` >= 21, < 26
- `pylance` >= 0.20, < 1 on Python 3.9+; >= 0.10, < 1 on Python 3.8
- `pyroaring`
- `readerwriterlock` >= 1, < 2
- `zstandard` >= 0.19, < 1
- `cramjam` >= 1.3.0, < 3
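The per-version bounds in the two lists can be expressed as a small lookup. This is a sketch covering only `pyarrow`, `pandas`, and `polars`, with bounds copied from the lists above; `core_requirements` is a hypothetical name, and the function raises for versions (such as 3.7) whose full bounds the lists do not state.

```python
def core_requirements(version):
    """Return pip requirement strings for a (major, minor) version tuple.

    Bounds are copied from the dependency lists above. Only three
    packages are covered; this is an illustration, not PyPaimon code.
    """
    major_minor = tuple(version[:2])
    if major_minor == (3, 6):
        return ["pyarrow>=6,<7", "pandas>=1.1,<2", "polars>=0.9,<1"]
    if major_minor >= (3, 8):
        # The pandas lower bound rises from 1.3 to 1.5 on Python 3.9+.
        if major_minor >= (3, 9):
            pandas = "pandas>=1.5,<3"
        else:
            pandas = "pandas>=1.3,<3"
        return ["pyarrow>=16,<20", pandas, "polars>=1,<2"]
    raise ValueError("no constraint list given for Python %d.%d" % major_minor)
```

For example, `core_requirements((3, 8))` yields the PyArrow 16+ bound together with the lower pandas bound.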
Credentials
No credentials required for the core runtime. Storage-specific credentials are defined in the Environment:Apache_Paimon_Cloud_Storage_Credentials environment.
Quick Install
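# Install core package
pip install pypaimon
# Or install the core dependencies manually (lower bounds from the Python 3.8+ list).
# The quotes keep the shell from treating ">=" as an output redirect.
pip install "pyarrow>=16" "pandas>=1.5" "fastavro>=1.4" "zstandard>=0.19" pyroaring "packaging>=21" "fsspec>=2023" "polars>=1" "readerwriterlock>=1" "cachetools>=5" "cramjam>=1.3.0"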
Code Evidence
Python version check and compatibility patch from `pypaimon/__init__.py:19-23`:
if sys.version_info[:2] == (3, 6):
    try:
        from pypaimon.manifest import fastavro_py36_compat  # noqa: F401
    except ImportError:
        pass
Python version requirement from `setup.py:90`:
python_requires=">=3.6",
PyArrow version detection from `pypaimon/filesystem/pyarrow_file_io.py:45-46`:
self._pyarrow_gte_7 = parse(pyarrow.__version__) >= parse("7.0.0")
self._pyarrow_gte_8 = parse(pyarrow.__version__) >= parse("8.0.0")
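The two flags above gate the PyArrow features listed in the compatibility notes. The sketch below derives equivalent flags from a version string using only the standard library; it compares the major version only, which is a simplification of `packaging.version.parse` (pre-release ordering is ignored). The names `major_version` and `pyarrow_flags` are hypothetical.

```python
import re


def major_version(version_string):
    """Return the leading integer of a version string, e.g. '16.1.0' -> 16."""
    match = re.match(r"\d+", version_string)
    if match is None:
        raise ValueError("unrecognized version: %r" % version_string)
    return int(match.group(0))


def pyarrow_flags(version_string):
    """Map a PyArrow version to the features gated in the source.

    Simplified: thresholds are 7.0.0 and 8.0.0, so comparing the major
    version suffices for release versions.
    """
    major = major_version(version_string)
    return {
        # PyArrow >= 7: `force_virtual_addressing` parameter is available.
        "force_virtual_addressing": major >= 7,
        # PyArrow >= 8: `AwsStandardS3RetryStrategy` is available.
        "s3_retry_strategy": major >= 8,
    }
```

With PyArrow installed, `pyarrow_flags(pyarrow.__version__)` would give the flags for the running environment.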
Python 3.6 ORC write limitation from `pypaimon/filesystem/local_file_io.py:320`:
if sys.version_info[:2] == (3, 6):
    orc.write_table(data, f, **kwargs)
else:
    orc.write_table(data, f, compression=compression, **kwargs)
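The same branch can be written as a helper that builds the keyword arguments once. This is a sketch that mirrors the logic above rather than the SDK's actual structure; `orc_write_kwargs` is a hypothetical name.

```python
def orc_write_kwargs(python_version, compression, **kwargs):
    """Build keyword arguments for an ORC write call.

    On Python 3.6 the compression argument is dropped, mirroring the
    branch shown above; on other versions it is passed through.
    """
    options = dict(kwargs)
    if tuple(python_version[:2]) != (3, 6):
        options["compression"] = compression
    return options
```

A caller could then use `orc.write_table(data, f, **orc_write_kwargs(sys.version_info, compression))` in place of the two-way branch.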
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ImportError: No module named 'dataclasses'` | Python 3.6 missing backport | `pip install "dataclasses>=0.8"` |
| `ImportError: cannot import name 'fastavro_py36_compat'` | Python 3.6 zstd patch not loaded | Install `zstandard>=0.19`; the import failure itself is non-fatal because it is caught by the try/except in `pypaimon/__init__.py` |
| ORC write fails on Python 3.6 | Compression parameter not supported | Upgrade to Python 3.8+ for full ORC compression support |
| `ModuleNotFoundError: No module named 'cramjam'` | Missing compression library on Python 3.7+ | `pip install "cramjam>=1.3.0"` |
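Several of the errors above come down to a missing package. A quick way to spot them before running a workflow is to probe for each module without importing it. This is a sketch using the standard library; the module list is an assumption drawn from the dependency lists (it presumes each package's import name matches its distribution name), and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Import names assumed to match the distribution names listed above.
CORE_MODULES = ["pyarrow", "pandas", "fastavro", "zstandard", "pyroaring", "cramjam"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    absent = missing_modules(CORE_MODULES)
    print("missing:", ", ".join(absent) if absent else "none")
```

The check only confirms that a module can be found; it does not verify that the installed version satisfies the bounds listed under Dependencies.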
Compatibility Notes
- Python 3.6: Requires the fastavro zstd compatibility patch. ORC writes do not support the compression parameter. The `dataclasses` backport is required. PyArrow is limited to the 6.x series.
- Python 3.7: Transitional version. Most features work but some optional dependencies (Ray, pylance) may have limited support.
- Python 3.8+: Full feature support. PyArrow 16+ required for latest API features. `cramjam` required for additional compression codec support.
- PyArrow < 7.0: OSS endpoint handling differs (bucket prepended to endpoint). Missing `force_virtual_addressing` parameter.
- PyArrow < 8.0: S3 retry strategy (`AwsStandardS3RetryStrategy`) not available.
Related Pages
- Implementation:Apache_Paimon_CatalogFactory_Create
- Implementation:Apache_Paimon_Catalog_Create_Database_and_Table
- Implementation:Apache_Paimon_BatchTableWrite_Write_Arrow
- Implementation:Apache_Paimon_TableCommit_Commit
- Implementation:Apache_Paimon_ReadBuilder_Scan
- Implementation:Apache_Paimon_TableRead_To_Arrow
- Implementation:Apache_Paimon_Schema_From_Pyarrow
- Implementation:Apache_Paimon_PyarrowFieldParser_From_Paimon_Schema
- Implementation:Apache_Paimon_Schema_With_Lance_Format
- Implementation:Apache_Paimon_BatchTableWrite_Write_Pandas
- Implementation:Apache_Paimon_TableRead_Multi_Format
- Implementation:Apache_Paimon_Schema_With_Blob_Options
- Implementation:Apache_Paimon_BlobDescriptor_Create_and_Serialize
- Implementation:Apache_Paimon_BlobFormatWriter_Write
- Implementation:Apache_Paimon_BlobDescriptor_Deserialize
- Implementation:Apache_Paimon_Blob_From_Descriptor
- Implementation:Apache_Paimon_PredicateBuilder_Filtering
- Implementation:Apache_Paimon_ReadBuilder_With_Projection