Implementation:Lance format Lance JNI Optimize
| Knowledge Sources | |
|---|---|
| Domains | Java_Bindings, JNI |
| Last Updated | 2026-02-08 19:33 GMT |
Overview
JNI Optimize is the Rust-side JNI binding that exposes Lance dataset compaction planning and commitment to Java, enabling multi-step compaction workflows where planning, execution, and commit can be performed independently.
Description
This module provides JNI entry points for the Java Compaction class to perform dataset optimization through compaction. It exposes two entry points, one for planning and one for committing; executing the planned tasks happens between the two calls:
Plan compaction:
Java_org_lance_compaction_Compaction_nativePlanCompaction - Analyzes the dataset and produces a CompactionPlan describing which fragments should be rewritten. Accepts configurable options including target rows per fragment, max rows per group, max bytes per file, deletion materialization settings, thread count, batch size, and index remap deferral.
Commit compaction:
Java_org_lance_compaction_Compaction_nativeCommitCompaction - Takes a list of RewriteResult objects (produced by executing the compaction plan) and commits them to the dataset. Uses the same compaction options as the planning phase and returns CompactionMetrics with statistics about the operation.
Both entry points:
- Extract the BlockingDataset from the Java dataset object via the native handle.
- Build CompactionOptions from Java Optional parameters using build_compaction_options.
- Execute the operation on the Tokio runtime via RT.block_on.
- Convert results to Java objects via IntoJava.
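The option-building step in the list above can be sketched in plain Rust. This is a minimal illustration, not the code in optimize.rs: the struct SketchCompactionOptions, its default values, and the function build_options_sketch are hypothetical stand-ins, and each Java Optional is modelled as a Rust Option that overrides a default only when a value is present.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the real options type; the defaults are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchCompactionOptions {
    pub target_rows_per_fragment: usize,
    pub max_rows_per_group: usize,
    pub max_bytes_per_file: Option<usize>,
    pub materialize_deletions: bool,
    pub materialize_deletions_threshold: f32,
    pub num_threads: Option<usize>,
    pub batch_size: Option<usize>,
    pub defer_index_remap: bool,
}

impl Default for SketchCompactionOptions {
    fn default() -> Self {
        Self {
            target_rows_per_fragment: 1_048_576,
            max_rows_per_group: 1024,
            max_bytes_per_file: None,
            materialize_deletions: true,
            materialize_deletions_threshold: 0.1,
            num_threads: None,
            batch_size: None,
            defer_index_remap: false,
        }
    }
}

/// Start from the defaults and override only the values the caller supplied.
pub fn build_options_sketch(
    target_rows_per_fragment: Option<i64>,
    max_rows_per_group: Option<i64>,
    max_bytes_per_file: Option<i64>,
    materialize_deletions: Option<bool>,
    materialize_deletions_threshold: Option<f32>,
    num_threads: Option<i64>,
    batch_size: Option<i64>,
    defer_index_remap: Option<bool>,
) -> Result<SketchCompactionOptions, String> {
    // Java longs are signed, so negative values are rejected before conversion.
    fn to_usize(name: &str, v: i64) -> Result<usize, String> {
        usize::try_from(v).map_err(|_| format!("{name} must be non-negative, got {v}"))
    }

    let mut opts = SketchCompactionOptions::default();
    if let Some(v) = target_rows_per_fragment {
        opts.target_rows_per_fragment = to_usize("target_rows_per_fragment", v)?;
    }
    if let Some(v) = max_rows_per_group {
        opts.max_rows_per_group = to_usize("max_rows_per_group", v)?;
    }
    if let Some(v) = max_bytes_per_file {
        opts.max_bytes_per_file = Some(to_usize("max_bytes_per_file", v)?);
    }
    if let Some(v) = materialize_deletions {
        opts.materialize_deletions = v;
    }
    if let Some(v) = materialize_deletions_threshold {
        opts.materialize_deletions_threshold = v;
    }
    if let Some(v) = num_threads {
        opts.num_threads = Some(to_usize("num_threads", v)?);
    }
    if let Some(v) = batch_size {
        opts.batch_size = Some(to_usize("batch_size", v)?);
    }
    if let Some(v) = defer_index_remap {
        opts.defer_index_remap = v;
    }
    Ok(opts)
}
```

Returning a Result rather than panicking matters at a JNI boundary, since an error can then be surfaced to the caller as a Java exception instead of aborting the JVM.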
The commit phase additionally imports RewriteResult objects from Java, which contain the mapping between old and new fragments, and optionally remaps indices.
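To make the index remapping step concrete, here is a small self-contained sketch. It assumes a row address that packs the fragment id into the high 32 bits and the row offset into the low 32 bits, and a map from old to new addresses in which a missing target marks a deleted row. Both the layout and the names row_address and remap_index_sketch are assumptions made for illustration, not the types used by the binding.

```rust
use std::collections::HashMap;

/// Pack a fragment id and a row offset into one 64-bit address.
/// Assumed layout: fragment id in the high 32 bits, offset in the low 32 bits.
pub fn row_address(fragment_id: u32, offset: u32) -> u64 {
    ((fragment_id as u64) << 32) | offset as u64
}

/// Rewrite the row addresses stored in an index after compaction.
/// - `Some(new)` in the map: the row moved to a new address.
/// - `None` in the map: the row was deleted and is dropped from the index.
/// - Address absent from the map: the row's fragment was not rewritten.
pub fn remap_index_sketch(indexed: &[u64], row_map: &HashMap<u64, Option<u64>>) -> Vec<u64> {
    indexed
        .iter()
        .filter_map(|addr| match row_map.get(addr) {
            Some(Some(new_addr)) => Some(*new_addr),
            Some(None) => None,
            None => Some(*addr),
        })
        .collect()
}
```

Deferring the remap, as the defer index remap option suggests, would mean keeping such a map around and applying it later instead of at commit time.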
Usage
Use this module when implementing or extending the Java compaction API. The two-phase design allows Java applications to plan compaction, optionally execute tasks in parallel across distributed workers, and then commit the results back to the dataset.
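The plan, execute, commit flow described above can be modelled without any Lance dependency. The sketch below is a toy: SketchTask, SketchRewriteResult, and the three functions are hypothetical, the greedy grouping rule is a simplification chosen for the example, and scoped threads stand in for distributed workers.

```rust
use std::thread;

/// Hypothetical task: a group of fragment ids to merge into one fragment.
#[derive(Debug, Clone)]
pub struct SketchTask {
    pub fragment_ids: Vec<u64>,
}

/// Hypothetical result: the fragments consumed and the id of their replacement.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchRewriteResult {
    pub old_fragment_ids: Vec<u64>,
    pub new_fragment_id: u64,
}

/// Emit the current bin as a task if it would merge at least two fragments.
fn flush(bin: &mut Vec<u64>, tasks: &mut Vec<SketchTask>) {
    let ids = std::mem::take(bin);
    if ids.len() > 1 {
        tasks.push(SketchTask { fragment_ids: ids });
    }
}

/// Plan: greedily group adjacent undersized fragments, given (id, row count) pairs.
pub fn plan_sketch(fragments: &[(u64, usize)], target_rows: usize) -> Vec<SketchTask> {
    let mut tasks = Vec::new();
    let mut bin = Vec::new();
    let mut bin_rows = 0usize;
    for &(id, rows) in fragments {
        if rows >= target_rows {
            // A large fragment is left alone and closes the current group.
            flush(&mut bin, &mut tasks);
            bin_rows = 0;
            continue;
        }
        bin.push(id);
        bin_rows += rows;
        if bin_rows >= target_rows {
            flush(&mut bin, &mut tasks);
            bin_rows = 0;
        }
    }
    flush(&mut bin, &mut tasks);
    tasks
}

/// Execute: run every task on its own thread and collect results in task order.
pub fn execute_sketch(tasks: &[SketchTask], first_new_id: u64) -> Vec<SketchRewriteResult> {
    thread::scope(|s| {
        let handles: Vec<_> = tasks
            .iter()
            .enumerate()
            .map(|(i, task)| {
                s.spawn(move || SketchRewriteResult {
                    old_fragment_ids: task.fragment_ids.clone(),
                    new_fragment_id: first_new_id + i as u64,
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}

/// Commit: swap old fragments for new ones; returns (fragments removed, fragments added).
pub fn commit_sketch(fragments: &mut Vec<u64>, results: &[SketchRewriteResult]) -> (usize, usize) {
    let (mut removed, mut added) = (0, 0);
    for r in results {
        let before = fragments.len();
        fragments.retain(|id| !r.old_fragment_ids.contains(id));
        removed += before - fragments.len();
        fragments.push(r.new_fragment_id);
        added += 1;
    }
    (removed, added)
}
```

Because the commit step only needs the collected results, the execute step can run anywhere, which is the property the two entry points are designed to preserve.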
Code Reference
Source Location
java/lance-jni/src/optimize.rs
Signature
// Plan compaction
#[no_mangle]
pub extern "system" fn Java_org_lance_compaction_Compaction_nativePlanCompaction<'local>(
    mut env: JNIEnv<'local>,
    _obj: JObject,
    java_dataset: JObject,
    target_rows_per_fragment: JObject,        // Optional<Long>
    max_rows_per_group: JObject,              // Optional<Long>
    max_bytes_per_file: JObject,              // Optional<Long>
    materialize_deletions: JObject,           // Optional<Boolean>
    materialize_deletions_threshold: JObject, // Optional<Float>
    num_threads: JObject,                     // Optional<Long>
    batch_size: JObject,                      // Optional<Long>
    defer_index_remap: JObject,               // Optional<Boolean>
) -> JObject<'local>;

// Commit compaction
#[no_mangle]
pub extern "system" fn Java_org_lance_compaction_Compaction_nativeCommitCompaction<'local>(
    mut env: JNIEnv<'local>,
    _obj: JObject,
    java_dataset: JObject,
    rewrite_results: JObject, // List<RewriteResult>
    // ... same options as plan
) -> JObject<'local>;
Import
// This module is called directly via JNI; no crate-level import needed
I/O Contract
| Direction | Type | Description |
|---|---|---|
| Input | JObject (Java Dataset) | Dataset to compact, carrying a native BlockingDataset handle |
| Input | JObject (Optional Long) | Target rows per fragment, max rows per group, max bytes per file, thread count, batch size |
| Input | JObject (Optional Boolean) | Materialize deletions flag, defer index remap flag |
| Input | JObject (Optional Float) | Materialize deletions threshold |
| Input | JObject (List of RewriteResult) | Results from executing compaction tasks (commit phase only) |
| Output | JObject (Java CompactionPlan) | Plan describing tasks to execute (plan phase) |
| Output | JObject (Java CompactionMetrics) | Statistics about the compaction operation (commit phase) |
Usage Examples
// Java side: plan, execute, commit
import java.util.List;
import java.util.Optional;
import org.lance.Dataset;
import org.lance.compaction.Compaction;

Dataset dataset = Dataset.open("/path/to/dataset");

// Step 1: Plan
CompactionPlan plan = Compaction.planCompaction(dataset,
    Optional.of(1_000_000L), // target rows per fragment
    Optional.empty(),        // max rows per group
    Optional.empty(),        // max bytes per file
    Optional.of(true),       // materialize deletions
    Optional.of(0.1f),       // materialize deletions threshold
    Optional.of(4L),         // num threads
    Optional.empty(),        // batch size
    Optional.of(false)       // defer index remap
);

// Step 2: Execute tasks and collect results
// (executeTasks is an application-defined helper, not shown here)
List<RewriteResult> results = executeTasks(plan.getTasks());

// Step 3: Commit, passing the same options used for planning
CompactionMetrics metrics = Compaction.commitCompaction(dataset, results, /* options */);
Related Pages
- Lance_format_Lance_JNI_BlockingDataset - Dataset that is being compacted
- Lance_format_Lance_JNI_Utils - build_compaction_options utility used to parse compaction parameters
- Lance_format_Lance_JNI_Traits - Type conversion traits for compaction result objects
- Lance_format_Lance_JNI_Transaction - Compaction results may interact with transaction operations