

Implementation:Apache Paimon Run Mixed Tests Script

From Leeroopedia


Knowledge Sources
Domains: Testing, Cross-Language Interoperability
Last Updated: 2026-02-08 00:00 GMT

Overview

run_mixed_tests.sh is an end-to-end test orchestrator that validates Java-Python data interoperability by running bidirectional read/write tests across multiple file formats (Parquet, ORC, Avro, Lance) and advanced features (deletion vectors, B-tree indexes, FAISS vector search).

Description

This script runs a test sequence in seven stages:

1. Java writes data in Parquet, ORC, Avro, and Lance formats via Maven.
2. Python reads that data via pytest to verify compatibility.
3. Python writes data using the Python SDK.
4. Java reads the Python-written data to verify reverse compatibility.
5. Primary key tables with deletion vectors are tested.
6. FAISS vector index reading is tested (skipped on Python 3.6 due to limited faiss-cpu support).
7. B-tree global index reading is tested.

Each stage has isolated error handling, so later stages still run when an earlier stage fails. The script prints colored terminal output (red/green/yellow) for immediate feedback and a summary of all results at the end. A warehouse directory cleanup step runs after all tests complete. Together, the stages validate that data written in one language can be read correctly in the other, which matters because Paimon tables are shared between Java and Python runtimes in production data lake environments.
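The isolated error handling and end-of-run summary described above can be sketched as follows. This is a minimal illustration, not the script's actual code: `run_stage`, `print_summary`, the variable names, and the stage labels are hypothetical, and `true`/`false` stand in for the real Maven and pytest invocations.

```shell
#!/bin/bash
# Illustrative sketch only: run_stage, print_summary, and the stage names are
# hypothetical and are not taken from run_mixed_tests.sh.

RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'   # reset color

TOTAL_STAGES=0
FAILED_STAGES=""

# Run one stage and record a failure instead of aborting the whole run.
run_stage() {
    local name="$1"
    shift
    TOTAL_STAGES=$((TOTAL_STAGES + 1))
    printf '%b=== %s ===%b\n' "$YELLOW" "$name" "$NC"
    if "$@"; then
        printf '%b✓ %s completed successfully%b\n' "$GREEN" "$name" "$NC"
    else
        printf '%b✗ %s failed%b\n' "$RED" "$name" "$NC"
        FAILED_STAGES="${FAILED_STAGES}${name};"
    fi
    return 0   # always continue with the next stage
}

# Print the final summary; return 1 if any stage failed, 0 otherwise.
print_summary() {
    if [ -z "$FAILED_STAGES" ]; then
        printf '%bAll %d stages passed%b\n' "$GREEN" "$TOTAL_STAGES" "$NC"
        return 0
    fi
    printf '%bFailed stages: %s%b\n' "$RED" "$FAILED_STAGES" "$NC"
    return 1
}

# Stand-in commands (true/false) replace the real Maven and pytest calls.
run_stage "Java write" true
run_stage "Python read" false
run_stage "Python write" true
print_summary || echo "overall exit code would be 1"
```

Because `run_stage` always returns 0, a failing stage is only recorded and the remaining stages still execute; the overall result is decided once, in `print_summary`.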

The script detects the Python version so that it can skip the FAISS tests on Python 3.6, and it captures Maven output to detect whether Java tests were skipped because native libraries are missing.
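One way such checks could look is sketched below. The function names are hypothetical, and the skip detection assumes the captured text contains Maven Surefire's usual summary line (`Tests run: N, Failures: N, Errors: N, Skipped: N`); the actual script may detect skips differently.

```shell
#!/bin/bash
# Illustrative sketch; all function names here are hypothetical.

# Print the interpreter's major.minor version, for example "3.10".
# PYTHON is an assumed override variable; python3 is the fallback.
detect_python_version() {
    "${PYTHON:-python3}" -c 'import sys; print("%d.%d" % sys.version_info[:2])'
}

# Succeed (return 0) when FAISS tests should be skipped for the given version.
should_skip_faiss() {
    [ "$1" = "3.6" ]
}

# Succeed when captured Maven output reports at least one skipped test.
# The output would typically be captured with: output="$(mvn ... 2>&1)"
maven_reports_skipped() {
    printf '%s\n' "$1" | grep -Eq 'Tests run: [0-9]+,.*Skipped: [1-9]'
}

# Demonstration with fixed inputs instead of a live interpreter or Maven run.
sample_output='[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1'
if maven_reports_skipped "$sample_output"; then
    echo "Java tests were skipped; the stage did not verify anything"
fi
if should_skip_faiss "3.6"; then
    echo "Skipping FAISS tests on Python 3.6"
fi
```

Taking the version and the Maven output as arguments keeps both checks testable without a Python interpreter or a Maven build.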

Usage

This script is invoked by the lint-python.sh orchestrator and can be run standalone for comprehensive interoperability validation.

Code Reference

Source Location

File: paimon-python/dev/run_mixed_tests.sh

Signature

#!/bin/bash

# Key functions
function cleanup_warehouse() { ... }
function run_java_write_test() { ... }
function run_python_read_test() { ... }
function run_python_write_test() { ... }
function run_java_read_test() { ... }
function run_pk_dv_test() { ... }
function run_faiss_vector_test() { ... }
function run_btree_index_test() { ... }
function main() { ... }

Import

# Run from project root
cd /path/to/paimon
./paimon-python/dev/run_mixed_tests.sh

I/O Contract

Inputs

Name | Type | Required | Description
Python version | Environment | yes | Auto-detected to skip incompatible tests
Maven | Binary | yes | Used to run Java tests
Pytest | Binary | yes | Used to run Python tests
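
Since both binaries are required, a caller may want to confirm they are on `PATH` before starting a long run. The helper below is a hypothetical sketch (`require_tools` is not a function of the script, and the script itself may or may not perform such a check); it relies only on the shell built-in `command -v`.

```shell
#!/bin/bash
# Illustrative sketch; require_tools is a hypothetical helper.

# Return 0 if every named command is on PATH; report missing ones on stderr.
require_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing required tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check the two binaries listed in the Inputs table.
if require_tools mvn pytest; then
    echo "Maven and pytest found"
else
    echo "install the missing tools before running the mixed tests"
fi
```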

Outputs

Name | Type | Description
Exit code | Integer | 0 if all tests pass, 1 if any fail
Terminal output | stdout/stderr | Colored status messages and test results
Warehouse directory | Directory | Test data in `pypaimon/tests/e2e/warehouse` (cleaned after tests)

Usage Examples

Run All Mixed Tests

# Run from paimon-python directory
cd paimon-python
./dev/run_mixed_tests.sh

# Expected output:
# === Mixed Java-Python Read Write Test Runner ===
# Project root: /path/to/paimon
# ...
# === Step 1: Running Java Write Tests (Parquet/Orc/Avro + Lance) ===
# ✓ Java write Parquet/Orc/Avro test completed successfully
# ✓ Java write lance test completed successfully
# ...
# 🎉 All tests passed! Java-Python interoperability verified.

Manual Cleanup

# Clean warehouse manually if tests are interrupted
rm -rf paimon-python/pypaimon/tests/e2e/warehouse
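
Manual cleanup is needed only when the run is cut short before the cleanup step. As a sketch of how a wrapper could avoid that, the example below registers cleanup with `trap` so it also runs on interruption. `WAREHOUSE_DIR` and this `cleanup_warehouse` body are assumptions for illustration, and a temporary directory stands in for the real warehouse path.

```shell
#!/bin/bash
# Illustrative sketch; WAREHOUSE_DIR and this cleanup_warehouse body are
# assumptions, not the script's actual implementation.

# A throwaway directory stands in for pypaimon/tests/e2e/warehouse.
WAREHOUSE_DIR="$(mktemp -d)"

cleanup_warehouse() {
    # Guard against an empty variable before a recursive delete.
    if [ -n "$WAREHOUSE_DIR" ] && [ -d "$WAREHOUSE_DIR" ]; then
        rm -rf "$WAREHOUSE_DIR"
    fi
}

# Run the work in a subshell so the EXIT trap fires when the subshell ends,
# whether it finishes normally or is interrupted.
(
    trap cleanup_warehouse EXIT
    trap 'exit 130' INT TERM              # turn signals into a normal exit
    touch "$WAREHOUSE_DIR/data.parquet"   # stand-in for test output
    echo "stage finished"
)

[ -d "$WAREHOUSE_DIR" ] || echo "warehouse removed"
```

Converting `INT` and `TERM` into an ordinary `exit` means a single `EXIT` handler covers both normal completion and Ctrl-C.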

CI Integration

# Run via lint-python.sh (from the paimon-python directory)
./dev/lint-python.sh -i mixed

# Or directly
./dev/run_mixed_tests.sh
