Implementation:Lance format Lance JNI Fragment
| Knowledge Sources | |
|---|---|
| Domains | Java_Bindings, JNI |
| Last Updated | 2026-02-08 19:33 GMT |
Overview
JNI Fragment is the Rust-side JNI binding that exposes Lance fragment operations to Java, including counting rows within a fragment, creating new fragments from Arrow data, and performing fragment-level merge and update operations.
Description
This module provides JNI entry points for the Java Fragment class to interact with Lance data fragments. A fragment is a partition of a Lance dataset that contains a subset of the data. The module supports:
Read operations:
- Java_org_lance_Fragment_countRowsNative - Counts the number of rows in a specific fragment by fragment ID, delegating to the async fragment.count_rows() method on the Tokio runtime.
Write operations:
- Java_org_lance_Fragment_createWithFfiArray - Creates new fragments from Arrow data passed via FFI pointers (FFI_ArrowArray and FFI_ArrowSchema). Reconstructs a RecordBatch from the FFI data, then writes it as a new fragment.
- Java_org_lance_Fragment_createWithFfiStream - Creates new fragments from an Arrow record batch stream passed via FFI_ArrowArrayStream.
Both creation methods accept extensive write parameters including max rows per file, max rows per group, max bytes per file, write mode, stable row IDs flag, data storage version, and storage options (including an optional storage options provider).
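Both creation entry points receive their Arrow structures as jlong values, that is, as raw memory addresses carried in a 64-bit integer. The following is a minimal sketch of that address hand-off using only the standard library. FakeFfiArray, export_addr, and import_addr are hypothetical names invented for illustration; the real module works with FFI_ArrowArray and FFI_ArrowSchema, whose lifetime is governed by the Arrow C Data Interface release callback rather than by the Box ownership used here.

```rust
// Hypothetical stand-in for an FFI struct such as FFI_ArrowArray.
// It exists only to demonstrate passing an address through an i64 (a JNI jlong).
#[repr(C)]
#[derive(Debug, PartialEq)]
pub struct FakeFfiArray {
    pub length: i64,
    pub null_count: i64,
}

/// Producer side: move the value to the heap and expose its address as an i64.
pub fn export_addr(value: FakeFfiArray) -> i64 {
    Box::into_raw(Box::new(value)) as i64
}

/// Consumer side: rebuild the pointer from the address and take ownership back.
///
/// # Safety
/// `addr` must be 0 or a value returned by `export_addr`, and a non-zero
/// address must not be imported more than once.
pub unsafe fn import_addr(addr: i64) -> Option<FakeFfiArray> {
    let ptr = addr as *mut FakeFfiArray;
    if ptr.is_null() {
        // A zero address is treated as "nothing was passed".
        return None;
    }
    // SAFETY: the caller guarantees the pointer is valid and uniquely owned.
    let boxed = unsafe { Box::from_raw(ptr) };
    Some(*boxed)
}
```

Rejecting a null address before dereferencing is the one check a receiver can always make; whether an address is otherwise valid remains the caller's responsibility.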
Helper types:
- FragmentMergeResult - Holds a merged fragment and its resulting schema.
- FragmentUpdateResult - Holds an updated fragment and the list of modified field IDs.
The module relies on the extract_write_params utility to convert Java write parameters into Rust WriteParams, and uses FileFragment::create, together with StreamingWriteSource from the lance-datafusion crate, to perform the actual fragment writes.
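To illustrate the kind of conversion that extract_write_params performs, here is a rough sketch that folds optional values into a parameter struct and falls back to defaults for anything left unset. It is not the actual utility: SketchWriteParams, SketchWriteMode, parse_mode, and build_params are hypothetical names, and the default values are placeholders rather than Lance's real defaults.

```rust
/// Hypothetical write mode enum for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SketchWriteMode {
    Create,
    Append,
    Overwrite,
}

/// Hypothetical parameter struct; fields mirror the options named in the text.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchWriteParams {
    pub max_rows_per_file: usize,
    pub max_rows_per_group: usize,
    pub max_bytes_per_file: u64,
    pub mode: SketchWriteMode,
    pub enable_stable_row_ids: bool,
}

impl Default for SketchWriteParams {
    fn default() -> Self {
        // Placeholder defaults chosen for the sketch, not taken from Lance.
        Self {
            max_rows_per_file: 1_000_000,
            max_rows_per_group: 1_000,
            max_bytes_per_file: 1 << 30,
            mode: SketchWriteMode::Create,
            enable_stable_row_ids: false,
        }
    }
}

/// Parse a case-insensitive mode string, rejecting unknown values.
pub fn parse_mode(s: &str) -> Result<SketchWriteMode, String> {
    match s.to_ascii_lowercase().as_str() {
        "create" => Ok(SketchWriteMode::Create),
        "append" => Ok(SketchWriteMode::Append),
        "overwrite" => Ok(SketchWriteMode::Overwrite),
        other => Err(format!("unknown write mode: {other}")),
    }
}

/// Start from the defaults and override only the values that were supplied.
pub fn build_params(
    max_rows_per_file: Option<usize>,
    max_rows_per_group: Option<usize>,
    max_bytes_per_file: Option<u64>,
    mode: Option<&str>,
    enable_stable_row_ids: Option<bool>,
) -> Result<SketchWriteParams, String> {
    let mut params = SketchWriteParams::default();
    if let Some(v) = max_rows_per_file {
        params.max_rows_per_file = v;
    }
    if let Some(v) = max_rows_per_group {
        params.max_rows_per_group = v;
    }
    if let Some(v) = max_bytes_per_file {
        params.max_bytes_per_file = v;
    }
    if let Some(m) = mode {
        params.mode = parse_mode(m)?;
    }
    if let Some(v) = enable_stable_row_ids {
        params.enable_stable_row_ids = v;
    }
    Ok(params)
}
```

Starting from a default value and overriding field by field keeps every optional parameter independent, so an absent option on the Java side never disturbs the others.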
Usage
Use this module when implementing or extending fragment-level operations in the Java SDK. Fragment creation is used during dataset writes and when building transactions that include new data files.
Code Reference
Source Location
java/lance-jni/src/fragment.rs
Signature
pub(crate) struct FragmentMergeResult {
    fragment: Fragment,
    schema: Schema,
}
pub(crate) struct FragmentUpdateResult {
    updated_fragment: Fragment,
    fields_modified: Vec<u32>,
}
// JNI entry points
pub extern "system" fn Java_org_lance_Fragment_countRowsNative(
    mut env: JNIEnv, _jfragment: JObject,
    jdataset: JObject, fragment_id: jlong,
) -> jint;
pub extern "system" fn Java_org_lance_Fragment_createWithFfiArray<'local>(
    mut env: JNIEnv<'local>, _obj: JObject,
    dataset_uri: JString, arrow_array_addr: jlong, arrow_schema_addr: jlong,
    // ... write parameter objects
) -> JObject<'local>;
pub extern "system" fn Java_org_lance_Fragment_createWithFfiStream<'local>(
    mut env: JNIEnv<'local>, _obj: JObject,
    dataset_uri: JString, arrow_stream_addr: jlong,
    // ... write parameter objects
) -> JObject<'local>;
Import
use crate::fragment::{FragmentMergeResult, FragmentUpdateResult};
I/O Contract
| Direction | Type | Description |
|---|---|---|
| Input | JObject (Java Dataset) | Dataset containing the fragment |
| Input | jlong (fragment_id) | ID of the fragment to operate on |
| Input | JString (dataset_uri) | URI for creating new fragments |
| Input | jlong (arrow_array_addr) | Memory address of an FFI_ArrowArray for write operations |
| Input | jlong (arrow_schema_addr) | Memory address of an FFI_ArrowSchema for write operations |
| Input | JObject (write params) | Optional write parameters (max rows, mode, storage version, etc.) |
| Output | jint | Row count for count operations |
| Output | JObject (Java Fragment list) | List of created fragment metadata objects |
Usage Examples
// Java side: creating fragments from data
import java.util.List;
import java.util.Optional;
import org.lance.Fragment;
List<Fragment> fragments = Fragment.create(
    "s3://bucket/my-dataset",
    arrowData,
    Optional.of(1024),      // max rows per file
    Optional.empty(),       // max rows per group
    Optional.empty(),       // max bytes per file
    Optional.of("append"),  // write mode
    Optional.empty(),       // enable stable row ids
    Optional.empty(),       // data storage version
    storageOptions
);
// Rust JNI side: count rows in a fragment
fn inner_count_rows_native(
    env: &mut JNIEnv,
    jdataset: JObject,
    fragment_id: jlong,
) -> Result<usize> {
    let dataset = unsafe {
        env.get_rust_field::<_, _, BlockingDataset>(jdataset, NATIVE_DATASET)
    }?;
    let fragment = dataset.inner.get_fragment(fragment_id as usize)
        .ok_or_else(|| Error::input_error(format!("Fragment not found: {fragment_id}")))?;
    let res = RT.block_on(fragment.count_rows(None))?;
    Ok(res)
}
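The inner function above takes a signed jlong fragment ID and returns a usize, while the JNI entry point returns a 32-bit jint, so an integer conversion happens on each side of the call. How the actual module handles those conversions is not shown here. The sketch below is one possible, deliberately defensive approach using checked conversions; fragment_id_to_usize and row_count_to_jint are hypothetical helper names, and plain i64 and i32 stand in for jlong and jint.

```rust
use std::convert::TryFrom;

/// Convert a signed 64-bit fragment ID into a usize, rejecting negative
/// values instead of letting an `as` cast wrap them into huge indices.
pub fn fragment_id_to_usize(fragment_id: i64) -> Result<usize, String> {
    usize::try_from(fragment_id)
        .map_err(|_| format!("invalid fragment id: {fragment_id}"))
}

/// Narrow a row count to a 32-bit signed integer, reporting an error
/// when the count does not fit rather than silently truncating it.
pub fn row_count_to_jint(count: usize) -> Result<i32, String> {
    i32::try_from(count)
        .map_err(|_| format!("row count {count} does not fit in a 32-bit jint"))
}
```

With checked conversions, an out-of-range ID or an oversized count surfaces as an error that the binding can turn into a Java exception, instead of producing a wrapped or truncated number.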
Related Pages
- Lance_format_Lance_JNI_BlockingDataset - Dataset that contains and manages fragments
- Lance_format_Lance_JNI_Transaction - Transactions reference fragments in their operations
- Lance_format_Lance_JNI_Utils - Write parameter extraction used by fragment creation
- Lance_format_Lance_JNI_Traits - Type conversion traits for fragment metadata objects