Implementation:Lance format Lance Java MergeOp


Knowledge Sources
Domains Java_SDK, Dataset_Management
Last Updated 2026-02-08 19:33 GMT

Overview

Description

The Merge class is an immutable operation that combines new data fragments with a specified schema to enable schema evolution and column modifications. It extends SchemaOperation (which implements Operation), inheriting schema management capabilities including Arrow C Data Interface export for JNI communication with the Rust backend.

The operation carries both a list of FragmentMetadata objects (representing the new or updated data) and an Arrow Schema describing the target schema after the merge. This dual payload enables adding new columns to an existing dataset by providing fragments containing the new column data alongside the updated schema.
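The shape described above (an immutable operation that carries fragments plus a schema and is created through a builder) can be modeled with JDK types only. This is a minimal sketch, not the Lance source: `MergeSketch` is a hypothetical class, fragment metadata is reduced to integer ids, the Arrow schema is reduced to a list of column names, and the defensive copies and null checks are assumptions about how immutability might be achieved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class MergeSketch {
    /** Stand-in for the Operation interface named on this page. */
    interface Operation {
        String name();
    }

    /** Stand-in for SchemaOperation: holds the schema (here, just column names). */
    abstract static class SchemaOperation implements Operation {
        private final List<String> schema;

        protected SchemaOperation(List<String> schema) {
            // Assumption: copy defensively so later caller changes are not visible.
            this.schema = List.copyOf(Objects.requireNonNull(schema, "schema"));
        }

        public List<String> schema() {
            return schema;
        }
    }

    /** Stand-in for Merge: fragments (here, integer ids) plus the inherited schema. */
    static final class Merge extends SchemaOperation {
        private final List<Integer> fragments;

        private Merge(List<Integer> fragments, List<String> schema) {
            super(schema);
            this.fragments = List.copyOf(Objects.requireNonNull(fragments, "fragments"));
        }

        public static Builder builder() {
            return new Builder();
        }

        public List<Integer> fragments() {
            return fragments;
        }

        @Override
        public String name() {
            return "Merge";
        }

        static final class Builder {
            private List<Integer> fragments;
            private List<String> schema;

            public Builder fragments(List<Integer> fragments) {
                this.fragments = fragments;
                return this;
            }

            public Builder schema(List<String> schema) {
                this.schema = schema;
                return this;
            }

            public Merge build() {
                return new Merge(fragments, schema);
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>(List.of(1, 2));
        Merge op = Merge.builder()
            .fragments(ids)
            .schema(List.of("id", "text", "embedding"))
            .build();
        ids.add(3); // does not affect the already-built operation
        System.out.println(op.name() + " " + op.fragments() + " " + op.schema());
    }
}
```

In this model, both payloads are fixed at `build()` time, so the operation can be passed around without callers being able to change what it will apply.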

Usage

Use Merge for schema evolution scenarios such as adding new columns to an existing dataset. The fragments contain the new column data, and the schema describes the complete post-merge schema. This is the primary mechanism for column-level modifications in the Lance Java SDK.
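Because the schema passed to Merge describes the complete post-merge schema rather than only the added columns, the caller has to assemble it from the existing columns and the new ones. The following sketch illustrates that step with plain column names in place of Arrow fields. `PostMergeSchemaSketch` and `completeSchema` are hypothetical names, and the ordering policy (existing columns keep their position, new names are appended) is an assumption made for the example, not documented Lance behavior.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PostMergeSchemaSketch {
    /**
     * Builds the full post-merge column list: every existing column in its
     * original position, followed by any column name not already present.
     */
    static List<String> completeSchema(List<String> existing, List<String> added) {
        Set<String> ordered = new LinkedHashSet<>(existing);
        ordered.addAll(added); // names already present keep their original position
        return List.copyOf(ordered);
    }

    public static void main(String[] args) {
        List<String> merged = completeSchema(List.of("id", "text"), List.of("embedding"));
        System.out.println(merged); // prints [id, text, embedding]
    }
}
```

With real Arrow types, the same idea applies to a list of `Field` objects that is then passed to the `Schema` constructor.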

Code Reference

Source Location

java/src/main/java/org/lance/operation/Merge.java

Signature

public class Merge extends SchemaOperation {
    public static Builder builder();
    public List<FragmentMetadata> fragments();
    public Schema schema();       // inherited from SchemaOperation
    public long exportSchema(BufferAllocator allocator); // inherited
    public String name();         // returns "Merge"
}

Import

import org.lance.operation.Merge;

I/O Contract

Inputs

Parameter | Type | Required | Description
fragments | List<FragmentMetadata> | Yes | Fragment metadata for the new or updated data
schema | org.apache.arrow.vector.types.pojo.Schema | Yes | The target Arrow schema after the merge

Outputs

Return | Type | Description
fragments() | List<FragmentMetadata> | The fragment metadata for the merged data
schema() | Schema | The target schema
exportSchema(allocator) | long | Memory address of the exported Arrow C schema for JNI
name() | String | Returns "Merge" for JNI dispatch

Usage Examples

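import java.util.List;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.FieldType;
import org.apache.arrow.vector.types.pojo.Schema;
import org.lance.operation.Merge;

// Add a new "embedding" column to an existing dataset.
// existingField1, existingField2, and itemField are Field objects defined elsewhere;
// itemField is the child (element) field of the fixed-size list.
Schema updatedSchema = new Schema(List.of(
    existingField1, existingField2,
    new Field("embedding", FieldType.nullable(new ArrowType.FixedSizeList(128)), List.of(itemField))
));

// writeEmbeddingColumn is a placeholder for application code that writes the
// new column data and returns the resulting fragment metadata.
List<FragmentMetadata> newColumnFragments = writeEmbeddingColumn(data);

// Build the operation from the new fragments and the complete post-merge schema.
Merge mergeOp = Merge.builder()
    .fragments(newColumnFragments)
    .schema(updatedSchema)
    .build();

String opName = mergeOp.name(); // "Merge"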
