Implementation:Lance format Lance Java MergeInsertParams
| Knowledge Sources | |
|---|---|
| Domains | Java_SDK, Dataset_Management |
| Last Updated | 2026-02-08 19:33 GMT |
Overview
The MergeInsertParams class configures the behavior of merge-insert (upsert) operations on a Lance dataset, defining how matched rows, unmatched source rows, and unmatched target rows are handled.
Description
MergeInsertParams is a mutable configuration object that specifies the semantics of a merge-insert operation performed via Dataset.mergeInsert(). It implements a fluent API pattern for configuring three behavioral axes:
WhenMatched (source row matches target row):
- UpdateAll: Replace the target row with the source row (upsert)
- DoNothing: Keep the target row unchanged (find-or-create)
- UpdateIf: Update only when a condition expression evaluates to true
- Delete: Delete the matched target row
- Fail: Abort the operation if any match is found
WhenNotMatched (source row has no match in target):
- InsertAll: Insert the new row into the target
- DoNothing: Ignore the unmatched source row
WhenNotMatchedBySource (target row has no match in source):
- Keep: Retain the target row
- Delete: Delete all unmatched target rows
- DeleteIf: Delete unmatched target rows where a condition is true (SQL or Substrait expression)
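The three axes above can be modeled with plain Java maps to see how they combine. This is only an in-memory sketch under stated assumptions, not how Lance executes a merge: the class `MergeSemanticsSketch`, its enums, and the `merge` helper are hypothetical names, rows are reduced to a key and a single string value, and the conditional variants (UpdateIf, DeleteIf) are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory model of the three merge-insert axes; not the Lance implementation. */
final class MergeSemanticsSketch {
    enum WhenMatched { UPDATE_ALL, DO_NOTHING, DELETE, FAIL }
    enum WhenNotMatched { INSERT_ALL, DO_NOTHING }
    enum WhenNotMatchedBySource { KEEP, DELETE }

    static Map<Integer, String> merge(
            Map<Integer, String> target,
            Map<Integer, String> source,
            WhenMatched matched,
            WhenNotMatched notMatched,
            WhenNotMatchedBySource notMatchedBySource) {
        Map<Integer, String> result = new LinkedHashMap<>();
        // Walk the target: each row is either matched by a source row or not.
        for (Map.Entry<Integer, String> row : target.entrySet()) {
            Integer key = row.getKey();
            if (source.containsKey(key)) {
                switch (matched) {
                    case UPDATE_ALL: result.put(key, source.get(key)); break;
                    case DO_NOTHING: result.put(key, row.getValue()); break;
                    case DELETE: break; // drop the matched target row
                    case FAIL: throw new IllegalStateException("match found for key " + key);
                }
            } else if (notMatchedBySource == WhenNotMatchedBySource.KEEP) {
                result.put(key, row.getValue());
            }
            // Otherwise the unmatched target row is deleted (not copied).
        }
        // Walk the source: rows with no target match are inserted or ignored.
        if (notMatched == WhenNotMatched.INSERT_ALL) {
            for (Map.Entry<Integer, String> row : source.entrySet()) {
                if (!target.containsKey(row.getKey())) {
                    result.put(row.getKey(), row.getValue());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> target = new LinkedHashMap<>();
        target.put(1, "a");
        target.put(2, "b");
        Map<Integer, String> source = new LinkedHashMap<>();
        source.put(2, "B");
        source.put(3, "C");

        // Upsert: prints {1=a, 2=B, 3=C}
        System.out.println(merge(target, source,
                WhenMatched.UPDATE_ALL, WhenNotMatched.INSERT_ALL, WhenNotMatchedBySource.KEEP));
        // Find-or-create: prints {1=a, 2=b, 3=C}
        System.out.println(merge(target, source,
                WhenMatched.DO_NOTHING, WhenNotMatched.INSERT_ALL, WhenNotMatchedBySource.KEEP));
    }
}
```

Switching the last argument to `WhenNotMatchedBySource.DELETE` in the upsert call drops key 1, which turns the operation into a full replacement of the target by the source.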
Additional operational parameters:
- conflictRetries: Number of retry attempts on contention (default: 10)
- retryTimeoutMs: Maximum time for retries in milliseconds (default: 30000)
- skipAutoCleanup: Skip automatic cleanup during commits for improved write performance
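How a retry count and a retry timeout can bound a contended commit together is sketched below. This is a generic illustration, not Lance's retry logic: `RetryBudgetSketch`, `ConflictException`, and `commitWithRetries` are hypothetical names, and the sketch assumes that the count covers retries after the first attempt and that whichever budget runs out first ends the loop.

```java
import java.util.function.Supplier;

/** Hypothetical retry loop bounded by both a retry count and a time budget. */
final class RetryBudgetSketch {
    /** Stand-in for the error raised when a concurrent writer wins the commit race. */
    static final class ConflictException extends RuntimeException {
        ConflictException(String message) {
            super(message);
        }
    }

    static <T> T commitWithRetries(Supplier<T> attempt, int conflictRetries, long retryTimeoutMs) {
        long deadline = System.nanoTime() + retryTimeoutMs * 1_000_000L;
        int retriesUsed = 0;
        while (true) {
            try {
                return attempt.get();
            } catch (ConflictException e) {
                boolean outOfRetries = retriesUsed >= conflictRetries;
                boolean outOfTime = System.nanoTime() - deadline >= 0; // overflow-safe comparison
                if (outOfRetries || outOfTime) {
                    throw e; // give up and surface the last conflict
                }
                retriesUsed++;
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Simulated commit that loses the race twice before succeeding.
        String outcome = commitWithRetries(() -> {
            attempts[0]++;
            if (attempts[0] < 3) {
                throw new ConflictException("concurrent commit");
            }
            return "committed";
        }, 10, 30_000L);
        System.out.println(outcome + " after " + attempts[0] + " attempts");
    }
}
```

Under these assumptions a retry count of 0 means a single attempt with no retries, and raising both limits trades latency for a higher chance of eventually committing under heavy contention.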
Usage
Use MergeInsertParams when you need to perform upsert, find-or-create, or conditional merge operations against a Lance dataset. This is the primary mechanism for synchronizing external data with existing dataset content.
Code Reference
Source Location
| Property | Value |
|---|---|
| File | java/src/main/java/org/lance/merge/MergeInsertParams.java |
| Package | org.lance.merge |
| Lines | 354 |
Signature
public class MergeInsertParams
Import
import org.lance.merge.MergeInsertParams;
I/O Contract
Constructor
| Constructor | Input | Description |
|---|---|---|
| MergeInsertParams(List<String>) | on columns | Column names used to match source rows against target rows |
Configuration Methods (Input)
| Method | Parameter | Description |
|---|---|---|
| withMatchedUpdateAll() | none | Replace target row with source on match (upsert) |
| withMatchedDoNothing() | none | Keep target row unchanged on match (find-or-create) |
| withMatchedDelete() | none | Delete target row on match |
| withMatchedUpdateIf(String) | SQL expression | Conditional update on match |
| withMatchedFail() | none | Fail operation on any match |
| withNotMatched(WhenNotMatched) | enum value | Action for unmatched source rows |
| withNotMatchedBySourceKeep() | none | Keep unmatched target rows |
| withNotMatchedBySourceDelete() | none | Delete all unmatched target rows |
| withNotMatchedBySourceDeleteIf(String) | SQL expression | Conditionally delete unmatched target rows |
| withNotMatchedBySourceDeleteSubstraitIf(ByteBuffer) | Substrait expression | Conditional delete via Substrait |
| withConflictRetries(int) | retry count | Number of contention retries (default: 10) |
| withRetryTimeoutMs(long) | timeout in ms | Max retry duration (default: 30000) |
| withSkipAutoCleanup(boolean) | flag | Skip auto cleanup for high-frequency writes |
Accessor Methods (Output)
| Method | Return Type | Description |
|---|---|---|
| on() | List<String> | Join key columns |
| whenMatched() | WhenMatched | Current matched behavior |
| whenNotMatched() | WhenNotMatched | Current not-matched behavior |
| whenNotMatchedBySource() | WhenNotMatchedBySource | Current not-matched-by-source behavior |
| conflictRetries() | int | Configured retry count |
| retryTimeoutMs() | long | Configured retry timeout |
| skipAutoCleanup() | boolean | Whether auto cleanup is skipped |
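The pairing of with* setters and same-named accessors follows the usual mutable fluent pattern: each setter changes the object in place and returns it, so calls can be chained and a later call overrides an earlier one. A minimal sketch of that pattern follows; `FluentParamsSketch` is a hypothetical stand-in covering only three of the fields, and the `false` default for the cleanup flag is an assumption rather than something the page states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in showing the mutable fluent pattern; not the real MergeInsertParams. */
final class FluentParamsSketch {
    private final List<String> on;
    private int conflictRetries = 10;        // default given in the parameter list
    private long retryTimeoutMs = 30_000L;   // default given in the parameter list
    private boolean skipAutoCleanup = false; // assumed default

    FluentParamsSketch(List<String> on) {
        // Defensive copy so later changes to the caller's list do not leak in.
        this.on = Collections.unmodifiableList(new ArrayList<>(on));
    }

    // Each setter mutates this instance and returns it to allow chaining.
    FluentParamsSketch withConflictRetries(int retries) {
        this.conflictRetries = retries;
        return this;
    }

    FluentParamsSketch withRetryTimeoutMs(long timeoutMs) {
        this.retryTimeoutMs = timeoutMs;
        return this;
    }

    FluentParamsSketch withSkipAutoCleanup(boolean skip) {
        this.skipAutoCleanup = skip;
        return this;
    }

    // Accessors mirror the field names, as in the accessor table.
    List<String> on() { return on; }
    int conflictRetries() { return conflictRetries; }
    long retryTimeoutMs() { return retryTimeoutMs; }
    boolean skipAutoCleanup() { return skipAutoCleanup; }

    public static void main(String[] args) {
        FluentParamsSketch params = new FluentParamsSketch(Arrays.asList("id"))
                .withConflictRetries(20)
                .withConflictRetries(5) // the later call wins
                .withSkipAutoCleanup(true);
        System.out.println(params.on() + " retries=" + params.conflictRetries()
                + " timeoutMs=" + params.retryTimeoutMs()
                + " skipAutoCleanup=" + params.skipAutoCleanup());
    }
}
```

Because the object is mutable, sharing one instance across concurrent merges is risky if any caller keeps reconfiguring it; building one instance per operation avoids that.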
Usage Examples
Upsert (Update or Insert)
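import org.lance.merge.MergeInsertParams;
import org.lance.merge.MergeInsertResult; // assumed to sit in the same package as MergeInsertParams
import java.util.Arrays;
// Match on "id": overwrite rows that already exist, insert rows that do not.
MergeInsertParams params = new MergeInsertParams(Arrays.asList("id"))
    .withMatchedUpdateAll()
    .withNotMatched(MergeInsertParams.WhenNotMatched.InsertAll);
// dataset (an open Dataset) and sourceStream (the incoming rows) are assumed to exist.
MergeInsertResult result = dataset.mergeInsert(params, sourceStream);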
Find or Create
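import org.lance.merge.MergeInsertParams;
import org.lance.merge.MergeInsertResult; // assumed to sit in the same package as MergeInsertParams
import java.util.Arrays;
// Match on "email": leave existing rows untouched, insert only new ones.
MergeInsertParams params = new MergeInsertParams(Arrays.asList("email"))
    .withMatchedDoNothing()
    .withNotMatched(MergeInsertParams.WhenNotMatched.InsertAll);
// dataset and sourceStream are assumed to exist.
MergeInsertResult result = dataset.mergeInsert(params, sourceStream);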
Conditional Update with Region Replace
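import org.lance.merge.MergeInsertParams;
import org.lance.merge.MergeInsertResult; // assumed to sit in the same package as MergeInsertParams
import java.util.Arrays;
MergeInsertParams params = new MergeInsertParams(Arrays.asList("id"))
    // Update a matched row only when the source copy is newer.
    .withMatchedUpdateIf("source.updated_at > target.updated_at")
    .withNotMatched(MergeInsertParams.WhenNotMatched.InsertAll)
    // Remove target rows in the region that are absent from the source.
    .withNotMatchedBySourceDeleteIf("target.region = 'us-east-1'");
// dataset and sourceStream are assumed to exist.
MergeInsertResult result = dataset.mergeInsert(params, sourceStream);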
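The effect of this combination can be traced with a small in-memory model in which the two SQL conditions become Java predicates. This is a sketch for illustration only: `ConditionalMergeSketch`, `Row`, and `merge` are hypothetical names, rows carry just an `updatedAt` value and a `region`, and nothing here reflects how Lance evaluates the expressions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Predicate;

/** Hypothetical model of UpdateIf + InsertAll + DeleteIf; not the Lance implementation. */
final class ConditionalMergeSketch {
    static final class Row {
        final long updatedAt;
        final String region;

        Row(long updatedAt, String region) {
            this.updatedAt = updatedAt;
            this.region = region;
        }

        @Override
        public String toString() {
            return region + "@" + updatedAt;
        }
    }

    static Map<Integer, Row> merge(
            Map<Integer, Row> target,
            Map<Integer, Row> source,
            BiPredicate<Row, Row> updateIf, // arguments are (source, target)
            Predicate<Row> deleteIf) {      // applied to unmatched target rows
        Map<Integer, Row> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Row> entry : target.entrySet()) {
            Row targetRow = entry.getValue();
            Row sourceRow = source.get(entry.getKey());
            if (sourceRow != null) {
                // Matched: take the source row only when the condition holds.
                result.put(entry.getKey(), updateIf.test(sourceRow, targetRow) ? sourceRow : targetRow);
            } else if (!deleteIf.test(targetRow)) {
                // Unmatched by source: keep the row unless the delete condition holds.
                result.put(entry.getKey(), targetRow);
            }
        }
        for (Map.Entry<Integer, Row> entry : source.entrySet()) {
            if (!target.containsKey(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue()); // InsertAll
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Row> target = new LinkedHashMap<>();
        target.put(1, new Row(100, "us-east-1"));
        target.put(2, new Row(200, "us-east-1"));
        target.put(3, new Row(50, "us-east-1"));
        target.put(4, new Row(50, "eu-west-1"));
        Map<Integer, Row> source = new LinkedHashMap<>();
        source.put(1, new Row(150, "us-east-1")); // newer than target: updates
        source.put(2, new Row(100, "us-east-1")); // older than target: ignored
        source.put(5, new Row(10, "us-east-1"));  // new key: inserted

        Map<Integer, Row> merged = merge(target, source,
                (s, t) -> s.updatedAt > t.updatedAt,
                t -> t.region.equals("us-east-1"));
        // prints {1=us-east-1@150, 2=us-east-1@200, 4=eu-west-1@50, 5=us-east-1@10}
        System.out.println(merged);
    }
}
```

Key 3 disappears because it is in the region and missing from the source, while key 4 survives because the delete condition does not match its region. That is the "region replace" effect: the source is treated as the full truth for one region only.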
High-Frequency Write Optimization
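import org.lance.merge.MergeInsertParams;
import java.util.Arrays;
MergeInsertParams params = new MergeInsertParams(Arrays.asList("id"))
    .withMatchedUpdateAll()
    .withNotMatched(MergeInsertParams.WhenNotMatched.InsertAll)
    // Allow more retries and a longer retry window than the defaults (10 retries, 30000 ms).
    .withConflictRetries(20)
    .withRetryTimeoutMs(60000)
    // Skip automatic cleanup on each commit to reduce per-write overhead.
    .withSkipAutoCleanup(true);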
Related Pages
- Lance_format_Lance_Java_Dataset - Dataset class that executes merge-insert operations
- Lance_format_Lance_Java_WriteDatasetBuilder - Alternative write approach for full dataset creation
- Lance_format_Lance_Java_ScanOptions - Scan options for reading data before/after merges