Implementation:Apache Paimon Run Mixed Tests Script
| Knowledge Sources | |
|---|---|
| Domains | Testing, Cross-Language Interoperability |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
run_mixed_tests.sh is an end-to-end test orchestrator that validates Java-Python data interoperability by running bidirectional read/write tests across multiple file formats (Parquet, ORC, Avro, Lance) and advanced features (deletion vectors, B-tree indexes, FAISS vector search).
Description
This script runs a test sequence in seven stages:
1. Java writes data in Parquet/ORC/Avro and Lance formats via Maven.
2. Python reads that data via pytest to verify compatibility.
3. Python writes data using the Python SDK.
4. Java reads the Python-written data to verify reverse compatibility.
5. Primary key tables with deletion vectors are tested.
6. FAISS vector index reading is tested (skipped on Python 3.6 due to limited faiss-cpu support).
7. B-tree global index reading is tested.
Each stage runs independently with isolated error handling, so later stages still execute if earlier ones fail. The script uses colored terminal output (red/green/yellow) for immediate feedback and prints a summary at the end. A warehouse directory cleanup step runs after all tests complete. Because Paimon tables are shared between Java and Python runtimes in production data lake environments, the script validates that data written in one language can be read correctly in the other.
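The stage isolation and end-of-run summary described above can be illustrated with a minimal sketch. This is not the script's actual code: the `run_stage` helper, the stage names, and the `final_status` variable are hypothetical, and `true`/`false` stand in for the real Maven and pytest invocations.

```shell
#!/bin/bash
# Hypothetical sketch of stage isolation; run_stage and the stage names
# are illustrative and do not come from run_mixed_tests.sh.

# ANSI color codes for status lines.
RED=$(printf '\033[0;31m')
GREEN=$(printf '\033[0;32m')
YELLOW=$(printf '\033[1;33m')
NC=$(printf '\033[0m')

passed_count=0
failed_stages=""

# Run one stage and record the outcome without aborting the whole run.
run_stage() {
    stage_name="$1"
    shift
    printf '%s=== %s ===%s\n' "$YELLOW" "$stage_name" "$NC"
    if "$@"; then
        printf '%s✓ %s passed%s\n' "$GREEN" "$stage_name" "$NC"
        passed_count=$((passed_count + 1))
    else
        printf '%s✗ %s failed%s\n' "$RED" "$stage_name" "$NC"
        failed_stages="$failed_stages $stage_name"
    fi
    return 0
}

# Stand-in commands: `true` succeeds, `false` fails.
run_stage "stage-a" true
run_stage "stage-b" false
run_stage "stage-c" true

# Summary and final status: 0 only if no stage failed.
if [ -z "$failed_stages" ]; then
    final_status=0
    printf '%sAll stages passed%s\n' "$GREEN" "$NC"
else
    final_status=1
    printf '%sFailed stages:%s%s\n' "$RED" "$failed_stages" "$NC"
fi
# A real script would end with: exit "$final_status"
```

In this sketch the failing middle stage does not prevent the third stage from running, and the final status is 1 because at least one stage failed, which matches the exit code contract documented for the script.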
The script detects the Python version so that it can skip FAISS tests on older interpreters, and it captures Maven output to detect whether tests were skipped because native libraries were missing.
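A rough sketch of both checks follows. All function names here (`detect_python_version`, `faiss_decision`, `classify_maven_output`) are hypothetical, and the matching rule is an assumption: it looks for a non-zero `Skipped:` count in a Surefire-style summary line, which may differ from how the actual script inspects Maven output.

```shell
#!/bin/bash
# Hypothetical helpers; names and matching rules are assumptions,
# not taken from run_mixed_tests.sh.

# Print the interpreter's "major.minor" version (assumes python3 is on PATH).
detect_python_version() {
    python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])'
}

# Decide whether the FAISS stage should run for a given "major.minor" version.
faiss_decision() {
    case "$1" in
        3.6) echo "skip" ;;
        *)   echo "run" ;;
    esac
}

# Classify captured Maven output by looking for a non-zero "Skipped:" count
# in a Surefire-style summary line.
classify_maven_output() {
    if printf '%s\n' "$1" | grep -Eq 'Tests run: [0-9]+,.*Skipped: [1-9]'; then
        echo "skipped"
    else
        echo "ran"
    fi
}

# Demonstration with fixed inputs so the sketch runs without Maven.
sample_skipped='Tests run: 1, Failures: 0, Errors: 0, Skipped: 1'
sample_ran='Tests run: 3, Failures: 0, Errors: 0, Skipped: 0'
echo "Python 3.6  -> $(faiss_decision 3.6)"
echo "Python 3.10 -> $(faiss_decision 3.10)"
echo "Sample 1    -> $(classify_maven_output "$sample_skipped")"
echo "Sample 2    -> $(classify_maven_output "$sample_ran")"
```

In a real run, the version passed to `faiss_decision` would come from `detect_python_version`, and the text passed to `classify_maven_output` would be the captured output of the Maven command rather than a fixed sample string.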
Usage
This script is invoked by the lint-python.sh orchestrator and can also be run standalone for full interoperability validation.
Code Reference
Source Location
- Repository: Apache_Paimon
- File: paimon-python/dev/run_mixed_tests.sh
Signature
#!/bin/bash
# Key functions
function cleanup_warehouse() { ... }
function run_java_write_test() { ... }
function run_python_read_test() { ... }
function run_python_write_test() { ... }
function run_java_read_test() { ... }
function run_pk_dv_test() { ... }
function run_faiss_vector_test() { ... }
function run_btree_index_test() { ... }
function main() { ... }
Import
# Run from project root
cd /path/to/paimon
./paimon-python/dev/run_mixed_tests.sh
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| Python version | Environment | yes | Auto-detected to skip incompatible tests |
| Maven | Binary | yes | Used to run Java tests |
| Pytest | Binary | yes | Used to run Python tests |
Outputs
| Name | Type | Description |
|---|---|---|
| Exit code | Integer | 0 if all tests pass, 1 if any fail |
| Terminal output | stdout/stderr | Colored status messages and test results |
| Warehouse directory | Directory | Test data in `pypaimon/tests/e2e/warehouse` (cleaned after tests) |
Usage Examples
Run All Mixed Tests
# Run from paimon-python directory
cd paimon-python
./dev/run_mixed_tests.sh
# Expected output:
# === Mixed Java-Python Read Write Test Runner ===
# Project root: /path/to/paimon
# ...
# === Step 1: Running Java Write Tests (Parquet/Orc/Avro + Lance) ===
# ✓ Java write Parquet/Orc/Avro test completed successfully
# ✓ Java write lance test completed successfully
# ...
# 🎉 All tests passed! Java-Python interoperability verified.
Manual Cleanup
# Clean warehouse manually if tests are interrupted
rm -rf paimon-python/pypaimon/tests/e2e/warehouse
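Because `rm -rf` on a mistyped or empty path can be destructive, a guarded variant of the cleanup may be useful. The sketch below is an assumption about how such a guard could look, not the body of the script's own `cleanup_warehouse` function; `cleanup_warehouse_sketch` and the temporary demo directory are hypothetical.

```shell
#!/bin/bash
# Hypothetical guarded cleanup; not the actual cleanup_warehouse implementation.

cleanup_warehouse_sketch() {
    target_dir="$1"
    # Refuse obviously dangerous targets before calling rm -rf.
    case "$target_dir" in
        ""|"/")
            echo "Refusing to remove '$target_dir'" >&2
            return 1
            ;;
    esac
    if [ -d "$target_dir" ]; then
        rm -rf "$target_dir"
        echo "Removed $target_dir"
    else
        echo "Nothing to clean at $target_dir"
    fi
}

# Demonstration against a throwaway directory that mirrors the warehouse layout.
demo_root=$(mktemp -d)
demo_warehouse="$demo_root/pypaimon/tests/e2e/warehouse"
mkdir -p "$demo_warehouse"
touch "$demo_warehouse/sample.file"

# First call removes the directory; second call finds nothing left to clean.
cleanup_warehouse_sketch "$demo_warehouse"
second_call=$(cleanup_warehouse_sketch "$demo_warehouse")
echo "$second_call"
```

The function is safe to call repeatedly, so it could also be registered with `trap ... EXIT` in a wrapper to clean up after an interrupted run; whether the real script does this is not stated in the source.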
CI Integration
# Run via lint-python.sh (from the paimon-python directory)
./dev/lint-python.sh -i mixed
# Or run the script directly
./dev/run_mixed_tests.sh