Implementation:Lance format Lance JNI Optimize

From Leeroopedia


Knowledge Sources
Domains: Java_Bindings, JNI
Last Updated: 2026-02-08 19:33 GMT

Overview

JNI Optimize is the Rust-side JNI binding that exposes Lance dataset compaction planning and commitment to Java, enabling multi-step compaction workflows where planning, execution, and commit can be performed independently.

Description

This module provides the JNI entry points that the Java Compaction class uses to optimize a dataset through compaction. It exposes two entry points, one for planning and one for committing; the planned tasks are executed between the two calls:

Plan compaction:

  • Java_org_lance_compaction_Compaction_nativePlanCompaction - Analyzes the dataset and produces a CompactionPlan describing which fragments should be rewritten. Accepts configurable options including target rows per fragment, max rows per group, max bytes per file, deletion materialization settings, thread count, batch size, and index remap deferral.

Commit compaction:

  • Java_org_lance_compaction_Compaction_nativeCommitCompaction - Takes a list of RewriteResult objects (produced by executing the compaction plan) and commits them to the dataset. Uses the same compaction options as the planning phase and returns CompactionMetrics with statistics about the operation.

Both entry points:

  • Extract the BlockingDataset from the Java dataset object via the native handle.
  • Build CompactionOptions from Java Optional parameters using build_compaction_options.
  • Execute the operation on the Tokio runtime via RT.block_on.
  • Convert results to Java objects via IntoJava.

The commit phase additionally imports RewriteResult objects from Java, which contain the mapping between old and new fragments, and optionally remaps indices.
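As a rough illustration of the build_compaction_options step, the sketch below starts from default settings and overrides only the values that were supplied. It is not the code from optimize.rs: SketchOptions, build_options_sketch, the subset of fields, and the default values are all assumptions made for this example, and plain Rust Option values stand in for the Java Optional objects that the real function reads through JNI.

```rust
use std::convert::TryFrom;

// Hypothetical stand-in for the real options struct. Field names follow the
// JNI parameter list; the default values are placeholders, not Lance's.
#[derive(Debug, Clone, PartialEq)]
struct SketchOptions {
    target_rows_per_fragment: usize,
    materialize_deletions: bool,
    materialize_deletions_threshold: f32,
    num_threads: Option<usize>,
}

impl Default for SketchOptions {
    fn default() -> Self {
        Self {
            target_rows_per_fragment: 1_000_000,
            materialize_deletions: true,
            materialize_deletions_threshold: 0.1,
            num_threads: None,
        }
    }
}

// Start from the defaults and overwrite only the fields the caller supplied,
// so an empty optional leaves the corresponding setting untouched.
fn build_options_sketch(
    target_rows_per_fragment: Option<i64>,
    materialize_deletions: Option<bool>,
    materialize_deletions_threshold: Option<f32>,
    num_threads: Option<i64>,
) -> SketchOptions {
    let mut opts = SketchOptions::default();
    // Java longs are signed; negative values are ignored in this sketch.
    if let Some(rows) = target_rows_per_fragment.and_then(|v| usize::try_from(v).ok()) {
        opts.target_rows_per_fragment = rows;
    }
    if let Some(flag) = materialize_deletions {
        opts.materialize_deletions = flag;
    }
    if let Some(threshold) = materialize_deletions_threshold {
        opts.materialize_deletions_threshold = threshold;
    }
    if let Some(threads) = num_threads.and_then(|v| usize::try_from(v).ok()) {
        opts.num_threads = Some(threads);
    }
    opts
}
```

Because both entry points build their options the same way, passing identical optional values to the plan and commit calls should yield identical settings in both phases.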

Usage

Use this module when implementing or extending the Java compaction API. The two-phase design allows Java applications to plan compaction, optionally execute tasks in parallel across distributed workers, and then commit the results back to the dataset.
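The plan, execute, commit shape described above can be sketched with stand-in types. Nothing here is a Lance type or API: SketchTask, SketchRewriteResult, and the three functions are hypothetical, the "dataset" is just a list of fragment ids, and scoped threads stand in for whatever worker pool or distributed executor a Java application would actually use.

```rust
use std::thread;

// Hypothetical stand-ins for a planned task and its rewrite result.
#[derive(Debug, Clone)]
struct SketchTask {
    fragment_ids: Vec<u64>,
}

#[derive(Debug, Clone, PartialEq)]
struct SketchRewriteResult {
    old_fragments: Vec<u64>,
    new_fragment: u64,
}

// Plan: group existing fragments into independent tasks.
fn plan_sketch(fragment_ids: &[u64], fragments_per_task: usize) -> Vec<SketchTask> {
    fragment_ids
        .chunks(fragments_per_task.max(1)) // chunks(0) would panic
        .map(|chunk| SketchTask { fragment_ids: chunk.to_vec() })
        .collect()
}

// Execute: each task runs on its own thread and reports which old fragments
// it replaced and the id of the fragment it wrote.
fn run_tasks_in_parallel(tasks: &[SketchTask], first_new_id: u64) -> Vec<SketchRewriteResult> {
    thread::scope(|s| {
        let handles: Vec<_> = tasks
            .iter()
            .enumerate()
            .map(|(i, task)| {
                s.spawn(move || SketchRewriteResult {
                    old_fragments: task.fragment_ids.clone(),
                    new_fragment: first_new_id + i as u64,
                })
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

// Commit: apply all results in one step, swapping old fragments for new ones.
// Returns the number of old fragments removed.
fn commit_sketch(fragments: &mut Vec<u64>, results: &[SketchRewriteResult]) -> usize {
    let mut removed = 0;
    for result in results {
        let before = fragments.len();
        fragments.retain(|id| !result.old_fragments.contains(id));
        removed += before - fragments.len();
        fragments.push(result.new_fragment);
    }
    removed
}
```

Keeping the commit as a separate step means the workers only need to return their results; a single caller collects them and applies them to the dataset.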

Code Reference

Source Location

java/lance-jni/src/optimize.rs

Signature

// Plan compaction
#[no_mangle]
pub extern "system" fn Java_org_lance_compaction_Compaction_nativePlanCompaction<'local>(
    mut env: JNIEnv<'local>,
    _obj: JObject,
    java_dataset: JObject,
    target_rows_per_fragment: JObject,        // Optional<Long>
    max_rows_per_group: JObject,              // Optional<Long>
    max_bytes_per_file: JObject,              // Optional<Long>
    materialize_deletions: JObject,           // Optional<Boolean>
    materialize_deletions_threshold: JObject, // Optional<Float>
    num_threads: JObject,                     // Optional<Long>
    batch_size: JObject,                      // Optional<Long>
    defer_index_remap: JObject,               // Optional<Boolean>
) -> JObject<'local>;

// Commit compaction
#[no_mangle]
pub extern "system" fn Java_org_lance_compaction_Compaction_nativeCommitCompaction<'local>(
    mut env: JNIEnv<'local>,
    _obj: JObject,
    java_dataset: JObject,
    rewrite_results: JObject,                 // List<RewriteResult>
    // ... same options as plan
) -> JObject<'local>;
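The long function names follow the JNI convention for resolving native methods: the prefix Java_, then the fully qualified class name with separators turned into underscores, then the method name. The helper below is a simplified sketch of that rule, written only to show how the two symbols above relate to the Java class org.lance.compaction.Compaction. The function name jni_symbol_sketch is made up for this example, and it handles only ASCII identifiers; the escapes for non-ASCII characters, nested classes, and overloaded methods are left out.

```rust
// Simplified JNI symbol naming: "." or "/" becomes "_", and a literal "_"
// in a name is escaped as "_1". Other escape rules are not handled here.
fn jni_symbol_sketch(class_name: &str, method_name: &str) -> String {
    fn mangle(part: &str) -> String {
        let mut out = String::new();
        for ch in part.chars() {
            match ch {
                '.' | '/' => out.push('_'),
                '_' => out.push_str("_1"),
                other => out.push(other),
            }
        }
        out
    }
    format!("Java_{}_{}", mangle(class_name), mangle(method_name))
}
```

This is why renaming the Java class, its package, or either native method requires renaming the matching Rust function as well.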

Import

// This module is called directly via JNI; no crate-level import needed

I/O Contract

Direction | Type | Description
Input | JObject (Java Dataset) | Dataset to compact, carrying a native BlockingDataset handle
Input | JObject (Optional Long) | Target rows per fragment, max rows per group, max bytes per file, thread count, batch size
Input | JObject (Optional Boolean) | Materialize deletions flag, defer index remap flag
Input | JObject (Optional Float) | Materialize deletions threshold
Input | JObject (List of RewriteResult) | Results from executing compaction tasks (commit phase only)
Output | JObject (Java CompactionPlan) | Plan describing tasks to execute (plan phase)
Output | JObject (Java CompactionMetrics) | Statistics about the compaction operation (commit phase)

Usage Examples

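// Java side: plan, execute, and commit a compaction
import java.util.List;
import java.util.Optional;

import org.lance.Dataset;
import org.lance.compaction.Compaction;
// CompactionPlan, RewriteResult, and CompactionMetrics also need imports;
// their package is not shown on this page.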

Dataset dataset = Dataset.open("/path/to/dataset");

// Phase 1: Plan
CompactionPlan plan = Compaction.planCompaction(dataset,
    Optional.of(1_000_000L),  // target rows per fragment
    Optional.empty(),          // max rows per group
    Optional.empty(),          // max bytes per file
    Optional.of(true),         // materialize deletions
    Optional.of(0.1f),         // materialize deletions threshold
    Optional.of(4L),           // num threads
    Optional.empty(),          // batch size
    Optional.of(false)         // defer index remap
);

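// Phase 2: Execute tasks and collect results.
// executeTasks is a placeholder for caller-supplied code that runs each task
// (locally or on distributed workers); it is not defined by this module.
List<RewriteResult> results = executeTasks(plan.getTasks());

// Phase 3: Commit, passing the same options that were used for planning
CompactionMetrics metrics = Compaction.commitCompaction(dataset, results, /* options */);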
