Implementation:Lance format Lance Java MergeInsertParams

From Leeroopedia


Knowledge Sources
Domains Java_SDK, Dataset_Management
Last Updated 2026-02-08 19:33 GMT

Overview

The MergeInsertParams class configures the behavior of merge-insert (upsert) operations on a Lance dataset. It defines how three kinds of rows are handled: matched rows, source rows with no match in the target, and target rows with no match in the source.

Description

MergeInsertParams is a mutable configuration object that specifies the semantics of a merge-insert operation performed via Dataset.mergeInsert(). It implements a fluent API pattern for configuring three behavioral axes:

WhenMatched (source row matches target row):

  • UpdateAll: Replace the target row with the source row (upsert)
  • DoNothing: Keep the target row unchanged (find-or-create)
  • UpdateIf: Update only when a condition expression evaluates to true
  • Delete: Delete the matched target row
  • Fail: Abort the operation if any match is found

WhenNotMatched (source row has no match in target):

  • InsertAll: Insert the new row into the target
  • DoNothing: Ignore the unmatched source row

WhenNotMatchedBySource (target row has no match in source):

  • Keep: Retain the target row
  • Delete: Delete all unmatched target rows
  • DeleteIf: Delete unmatched target rows where a condition is true (SQL or Substrait expression)
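
To make the three axes concrete, the following is a minimal in-memory sketch, not Lance's implementation. It models the target and source as maps from join key to row value. The class MergeSemanticsSketch, its enums, and the merge method are hypothetical names introduced only for illustration, and the conditional variants (UpdateIf, DeleteIf) are left out to keep the model small.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Conceptual model only: hypothetical names, not part of the Lance API. */
final class MergeSemanticsSketch {
    enum Matched { UPDATE_ALL, DO_NOTHING, DELETE, FAIL }
    enum NotMatched { INSERT_ALL, DO_NOTHING }
    enum NotMatchedBySource { KEEP, DELETE }

    /** Returns the merged rows without modifying either input map. */
    static Map<String, String> merge(
            Map<String, String> target,
            Map<String, String> source,
            Matched matched,
            NotMatched notMatched,
            NotMatchedBySource notMatchedBySource) {
        Map<String, String> result = new LinkedHashMap<>();

        // Pass 1: decide the fate of every existing target row.
        for (Map.Entry<String, String> row : target.entrySet()) {
            String key = row.getKey();
            if (source.containsKey(key)) {
                // WhenMatched: the key exists on both sides.
                switch (matched) {
                    case UPDATE_ALL:
                        result.put(key, source.get(key));
                        break;
                    case DO_NOTHING:
                        result.put(key, row.getValue());
                        break;
                    case DELETE:
                        break; // drop the matched target row
                    case FAIL:
                        throw new IllegalStateException("match found for key " + key);
                }
            } else if (notMatchedBySource == NotMatchedBySource.KEEP) {
                // WhenNotMatchedBySource: target row absent from the source.
                result.put(key, row.getValue());
            }
        }

        // Pass 2: WhenNotMatched, i.e. source rows with no target match.
        if (notMatched == NotMatched.INSERT_ALL) {
            for (Map.Entry<String, String> row : source.entrySet()) {
                if (!target.containsKey(row.getKey())) {
                    result.put(row.getKey(), row.getValue());
                }
            }
        }
        return result;
    }
}
```

In this model, with a target of {1=a, 2=b} and a source of {2=B, 3=C}, the upsert combination (UPDATE_ALL, INSERT_ALL, KEEP) gives {1=a, 2=B, 3=C}, find-or-create (DO_NOTHING, INSERT_ALL, KEEP) gives {1=a, 2=b, 3=C}, and switching the last axis to DELETE removes key 1 because the source does not contain it.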

Additional operational parameters:

  • conflictRetries: Number of retry attempts on contention (default: 10)
  • retryTimeoutMs: Maximum time for retries in milliseconds (default: 30000)
  • skipAutoCleanup: Skip automatic cleanup during commits for improved write performance
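
One way to read the two retry settings together is as a combined budget: an operation is retried until it succeeds, the retry count is used up, or the time limit passes. The sketch below illustrates that reading only. It assumes that the initial attempt is not counted as a retry and that either limit ends the loop; Lance's actual retry and backoff behavior may differ. RetryBudgetSketch and runWithRetries are hypothetical names.

```java
import java.util.function.BooleanSupplier;

/** Illustrative retry budget; hypothetical, not Lance's retry logic. */
final class RetryBudgetSketch {
    /**
     * Runs `attempt` until it returns true or the budget is exhausted.
     * Returns the number of attempts made on success, or -1 on failure.
     */
    static int runWithRetries(BooleanSupplier attempt, int conflictRetries, long retryTimeoutMs) {
        long deadline = System.nanoTime() + retryTimeoutMs * 1_000_000L;

        // Assumption: one initial attempt plus up to `conflictRetries` retries.
        for (int tries = 1; tries <= conflictRetries + 1; tries++) {
            if (attempt.getAsBoolean()) {
                return tries;
            }
            // Stop retrying once the time budget is spent (overflow-safe comparison).
            if (System.nanoTime() - deadline >= 0) {
                break;
            }
        }
        return -1;
    }
}
```

Under these assumptions, a caller would pass the commit step as the attempt, for example runWithRetries(() -> tryCommit(), 10, 30000L), where tryCommit is a placeholder for whatever operation may hit contention.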

Usage

Use MergeInsertParams when you need to perform upsert, find-or-create, or conditional merge operations against a Lance dataset. This is the primary mechanism for synchronizing external data with existing dataset content.

Code Reference

Source Location

Property | Value
File | java/src/main/java/org/lance/merge/MergeInsertParams.java
Package | org.lance.merge
Lines | 354

Signature

public class MergeInsertParams

Import

import org.lance.merge.MergeInsertParams;

I/O Contract

Constructor

Constructor | Input | Description
MergeInsertParams(List<String>) | on columns | Column names used to match source and target rows

Configuration Methods (Input)

Method | Parameter | Description
withMatchedUpdateAll() | none | Replace target row with source on match (upsert)
withMatchedDoNothing() | none | Keep target row unchanged on match (find-or-create)
withMatchedDelete() | none | Delete target row on match
withMatchedUpdateIf(String) | SQL expression | Conditional update on match
withMatchedFail() | none | Fail operation on any match
withNotMatched(WhenNotMatched) | enum value | Action for unmatched source rows
withNotMatchedBySourceKeep() | none | Keep unmatched target rows
withNotMatchedBySourceDelete() | none | Delete all unmatched target rows
withNotMatchedBySourceDeleteIf(String) | SQL expression | Conditionally delete unmatched target rows
withNotMatchedBySourceDeleteSubstraitIf(ByteBuffer) | Substrait expression | Conditional delete via Substrait
withConflictRetries(int) | retry count | Number of contention retries (default: 10)
withRetryTimeoutMs(long) | timeout in ms | Max retry duration (default: 30000)
withSkipAutoCleanup(boolean) | flag | Skip auto cleanup for high-frequency writes

Accessor Methods (Output)

Method | Return Type | Description
on() | List<String> | Join key columns
whenMatched() | WhenMatched | Current matched behavior
whenNotMatched() | WhenNotMatched | Current not-matched behavior
whenNotMatchedBySource() | WhenNotMatchedBySource | Current not-matched-by-source behavior
conflictRetries() | int | Configured retry count
retryTimeoutMs() | long | Configured retry timeout
skipAutoCleanup() | boolean | Whether auto cleanup is skipped
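
The configuration and accessor methods listed above follow the usual fluent pattern for a mutable object: each with... method changes a field and returns the same instance so calls can be chained. The sketch below shows that pattern on a reduced, hypothetical class (FluentParamsSketch). It is not the source of MergeInsertParams; the defaults of 10 and 30000 come from this page, while the false default for skipAutoCleanup and the defensive copy of the key list are assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Reduced, hypothetical stand-in that illustrates the fluent pattern only. */
final class FluentParamsSketch {
    private final List<String> on;
    private int conflictRetries = 10;        // default documented on this page
    private long retryTimeoutMs = 30000L;    // default documented on this page
    private boolean skipAutoCleanup = false; // assumed default, not stated on this page

    FluentParamsSketch(List<String> on) {
        // Assumption: copy the key list so later caller changes do not leak in.
        this.on = Collections.unmodifiableList(new ArrayList<>(on));
    }

    // Each setter mutates this object and returns it to allow chaining.
    FluentParamsSketch withConflictRetries(int retries) {
        this.conflictRetries = retries;
        return this;
    }

    FluentParamsSketch withRetryTimeoutMs(long timeoutMs) {
        this.retryTimeoutMs = timeoutMs;
        return this;
    }

    FluentParamsSketch withSkipAutoCleanup(boolean skip) {
        this.skipAutoCleanup = skip;
        return this;
    }

    // Accessors expose the current configuration.
    List<String> on() { return on; }
    int conflictRetries() { return conflictRetries; }
    long retryTimeoutMs() { return retryTimeoutMs; }
    boolean skipAutoCleanup() { return skipAutoCleanup; }
}
```

Because the object is mutable, a chained call changes the original instance rather than producing a copy, so a params object shared between two operations carries any later changes into both.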

Usage Examples

Upsert (Update or Insert)

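import org.lance.merge.MergeInsertParams;
import org.lance.merge.MergeInsertResult; // assumed to share the org.lance.merge package
import java.util.Arrays;

// Match rows on "id": overwrite matched target rows, insert unmatched source rows.
MergeInsertParams params = new MergeInsertParams(Arrays.asList("id"))
    .withMatchedUpdateAll()
    .withNotMatched(MergeInsertParams.WhenNotMatched.InsertAll);

// `dataset` and `sourceStream` are assumed to have been created earlier.
MergeInsertResult result = dataset.mergeInsert(params, sourceStream);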

Find or Create

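import org.lance.merge.MergeInsertParams;
import org.lance.merge.MergeInsertResult; // assumed to share the org.lance.merge package
import java.util.Arrays;

// Match rows on "email": leave existing rows untouched, insert only new ones.
MergeInsertParams params = new MergeInsertParams(Arrays.asList("email"))
    .withMatchedDoNothing()
    .withNotMatched(MergeInsertParams.WhenNotMatched.InsertAll);

// `dataset` and `sourceStream` are assumed to have been created earlier.
MergeInsertResult result = dataset.mergeInsert(params, sourceStream);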

Conditional Update with Region Replace

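import org.lance.merge.MergeInsertParams;
import org.lance.merge.MergeInsertResult; // assumed to share the org.lance.merge package
import java.util.Arrays;

MergeInsertParams params = new MergeInsertParams(Arrays.asList("id"))
    // Update a matched row only when the source copy is newer.
    .withMatchedUpdateIf("source.updated_at > target.updated_at")
    .withNotMatched(MergeInsertParams.WhenNotMatched.InsertAll)
    // Remove us-east-1 target rows that are absent from the source.
    .withNotMatchedBySourceDeleteIf("target.region = 'us-east-1'");

// `dataset` and `sourceStream` are assumed to have been created earlier.
MergeInsertResult result = dataset.mergeInsert(params, sourceStream);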

High-Frequency Write Optimization

import org.lance.merge.MergeInsertParams;
import java.util.Arrays;

// Upsert with a larger retry budget than the defaults (10 retries, 30000 ms)
// and with automatic cleanup skipped on commit.
MergeInsertParams params = new MergeInsertParams(Arrays.asList("id"))
    .withMatchedUpdateAll()
    .withNotMatched(MergeInsertParams.WhenNotMatched.InsertAll)
    .withConflictRetries(20)
    .withRetryTimeoutMs(60000)
    .withSkipAutoCleanup(true);
