Principle:Eventual Inc Daft Arrow FFI Compatibility
| Knowledge Sources | |
|---|---|
| Domains | Data_Interop, FFI |
| Last Updated | 2026-02-08 14:00 GMT |
Overview
A compatibility pattern that works around known PyArrow data structure issues before Arrow data is transferred through the C Data Interface (FFI) into a Rust engine.
Description
When Arrow data is passed from Python (PyArrow) to Rust (arrow2) via the C Data Interface, certain edge cases can cause data corruption or crashes. Arrow FFI Compatibility applies two targeted fixes:
- Empty struct arrays: arrow2's FFI cannot handle StructArrays with zero fields. The fix adds a placeholder null-typed field and reverses the transformation on the return path.
- Slice offset propagation: Some PyArrow versions may drop the slice offsets of struct and fixed-size list arrays during record batch conversion. The fix propagates these offsets to the child arrays by flattening, which is zero-copy when no validity bitmap is present.
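The empty-struct fix can be sketched as a pair of inverse type rewrites. The following is a minimal pure-Python model, not the PyArrow or arrow2 API: the `DType` and `Field` classes, the function names, and the placeholder field name `__placeholder__` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DType:
    kind: str              # e.g. "int64", "null", "struct"
    children: tuple = ()   # struct fields; empty for non-struct types


@dataclass(frozen=True)
class Field:
    name: str
    dtype: DType


# Hypothetical sentinel: a single null-typed field standing in for "no fields".
PLACEHOLDER = Field("__placeholder__", DType("null"))


def add_placeholder(dtype: DType) -> DType:
    """Rewrite every zero-field struct (at any depth) to hold the placeholder."""
    if dtype.kind != "struct":
        return dtype
    if not dtype.children:
        return DType("struct", (PLACEHOLDER,))
    fixed = tuple(Field(f.name, add_placeholder(f.dtype)) for f in dtype.children)
    return DType("struct", fixed)


def remove_placeholder(dtype: DType) -> DType:
    """Inverse rewrite, applied to data coming back across the boundary."""
    if dtype.kind != "struct":
        return dtype
    if dtype.children == (PLACEHOLDER,):
        return DType("struct", ())
    restored = tuple(Field(f.name, remove_placeholder(f.dtype)) for f in dtype.children)
    return DType("struct", restored)


# A struct with one ordinary field and one zero-field struct.
original = DType("struct", (Field("a", DType("int64")), Field("b", DType("struct"))))
outbound = add_placeholder(original)
restored = remove_placeholder(outbound)
```

In this model the round trip is lossless only if no real struct consists of exactly the placeholder field, so a real implementation would need a name that cannot collide with user data.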
Usage
Apply this principle at every Python-to-Rust data transfer boundary. The fix layer should be transparent to callers: it should be the last transformation applied before data crosses the FFI boundary and the first applied after data comes back.
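One way to enforce this ordering is to route every crossing through a single wrapper. The sketch below is illustrative only: `make_ffi_boundary`, `apply_fixes`, `undo_fixes`, and `ffi_call` are hypothetical names, and the stand-in steps merely record the order in which they run.

```python
from typing import Any, Callable


def make_ffi_boundary(
    apply_fixes: Callable[[Any], Any],
    ffi_call: Callable[[Any], Any],
    undo_fixes: Callable[[Any], Any],
) -> Callable[[Any], Any]:
    """Wrap an FFI call so the fixes always bracket it."""

    def boundary(data: Any) -> Any:
        fixed = apply_fixes(data)    # last transformation before FFI
        result = ffi_call(fixed)     # the only place data crosses the boundary
        return undo_fixes(result)    # first transformation after data returns

    return boundary


# Stand-in steps that pass data through unchanged and log their names.
trace = []


def _step(name: str) -> Callable[[Any], Any]:
    def run(data: Any) -> Any:
        trace.append(name)
        return data

    return run


boundary = make_ffi_boundary(
    apply_fixes=_step("apply_fixes"),
    ffi_call=_step("ffi_call"),
    undo_fixes=_step("undo_fixes"),
)
result = boundary("batch")
```

Callers only ever see `boundary`, so they cannot skip the fixes or apply them in the wrong order.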
Theoretical Basis
The Arrow C Data Interface provides zero-copy data sharing between language runtimes. However, the specification leaves certain edge cases implementation-defined:
- Empty structs: The spec does not mandate that implementations handle zero-field structs, so a placeholder field is needed.
- Slice offsets: An array slice stores an offset into its underlying buffers. Some implementations fail to propagate this offset to nested child arrays during FFI export, so the offset must be applied to the children explicitly by flattening.
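The slice-offset problem can be illustrated with a small pure-Python model of the layout. This is a sketch under simplifying assumptions (no validity bitmaps, primitive children only); `Primitive`, `Struct`, `lossy_export`, and `flatten` are hypothetical names and not part of PyArrow or arrow2.

```python
from dataclasses import dataclass


@dataclass
class Primitive:
    values: list       # shared underlying buffer
    offset: int = 0
    length: int = 0


@dataclass
class Struct:
    children: dict     # field name -> Primitive
    offset: int = 0
    length: int = 0

    def slice(self, offset: int, length: int) -> "Struct":
        # Slicing only records a new parent offset; children are untouched.
        return Struct(self.children, self.offset + offset, length)


def lossy_export(s: Struct) -> dict:
    """Model of an exporter that ignores the parent's slice offset."""
    return {
        name: c.values[c.offset : c.offset + s.length]
        for name, c in s.children.items()
    }


def flatten(s: Struct) -> Struct:
    """Push the parent offset into each child; buffers are shared, not copied."""
    children = {
        name: Primitive(c.values, c.offset + s.offset, s.length)
        for name, c in s.children.items()
    }
    return Struct(children, 0, s.length)


x = Primitive([10, 20, 30, 40], 0, 4)
sliced = Struct({"x": x}, 0, 4).slice(1, 2)   # logical rows: 20, 30

wrong = lossy_export(sliced)            # reads from the start of the buffer
right = lossy_export(flatten(sliced))   # offset now lives on the child
```

After flattening, the parent offset is zero, so an exporter that drops it loses nothing, and the child still points at the original buffer.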