Implementation:Apache Paimon KeyValue

From Leeroopedia
Revision as of 14:21, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Apache_Paimon_KeyValue.md)


Knowledge Sources
Domains: Primary Key Tables, LSM Tree
Last Updated: 2026-02-08 00:00 GMT

Overview

KeyValue represents a single record in a primary key table: the user key, sequence number, value kind, and value data.

Description

The KeyValue class represents key-value records in Apache Paimon's LSM-tree storage. It wraps one flat tuple laid out as the key fields, the sequence number, the value kind byte, and then the value fields, and exposes an accessor property for each component.
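
The tuple layout described above can be sketched with plain slicing. This is only an illustration of the index arithmetic, not the library's code; `split_row_tuple` is a hypothetical helper name that does not exist in pypaimon.

```python
from typing import Tuple


def split_row_tuple(
    row_tuple: tuple, key_arity: int, value_arity: int
) -> Tuple[tuple, int, int, tuple]:
    """Hypothetical helper: slice a flat row tuple into its four parts."""
    expected = key_arity + 2 + value_arity  # +2 for sequence number and kind byte
    if len(row_tuple) != expected:
        raise ValueError(f"expected {expected} fields, got {len(row_tuple)}")
    key = row_tuple[:key_arity]
    sequence_number = row_tuple[key_arity]
    kind_byte = row_tuple[key_arity + 1]
    value = row_tuple[key_arity + 2:]
    return key, sequence_number, kind_byte, value


parts = split_row_tuple(
    (1, "partition1", 100, 0, "Alice", 30, "USA"), key_arity=2, value_arity=3
)
print(parts)  # ((1, 'partition1'), 100, 0, ('Alice', 30, 'USA'))
```

Unlike this sketch, which builds new tuples on every call, KeyValue is described as exposing the same positions through reusable views.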

The class uses reusable OffsetRow objects to access key and value portions of the underlying tuple without creating new objects for each access. This optimization reduces memory allocations and improves performance in high-throughput scenarios.
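
A minimal sketch of the reuse idea, assuming a view object only needs an offset, an arity, and a reference to the current tuple. `OffsetView` is a hypothetical stand-in written for this example; it is not pypaimon's OffsetRow and may differ from it in detail.

```python
class OffsetView:
    """Hypothetical stand-in for OffsetRow: a fixed window over a shared tuple."""

    def __init__(self, offset: int, arity: int):
        self.offset = offset
        self.arity = arity
        self._row = None

    def replace(self, row: tuple) -> "OffsetView":
        # Rebind to a new tuple; the window (offset, arity) stays the same.
        self._row = row
        return self

    def get_field(self, pos: int):
        if not 0 <= pos < self.arity:
            raise IndexError(f"field {pos} out of range for arity {self.arity}")
        return self._row[self.offset + pos]


key_arity, value_arity = 2, 3
key_view = OffsetView(offset=0, arity=key_arity)
# Value fields start after the key fields, the sequence number, and the kind byte.
value_view = OffsetView(offset=key_arity + 2, arity=value_arity)

rows = [
    (1, "partition1", 100, 0, "Alice", 30, "USA"),
    (2, "partition2", 101, 0, "Bob", 25, "UK"),
]
names = []
for row in rows:
    # The same two view objects serve every row; nothing is allocated per record.
    key_view.replace(row)
    value_view.replace(row)
    names.append((key_view.get_field(0), value_view.get_field(0)))
print(names)  # [(1, 'Alice'), (2, 'Bob')]
```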

KeyValue supports checking whether a record is an addition (INSERT or UPDATE_AFTER) through the is_add method, which operates directly on the value kind byte without constructing RowKind enum instances. The replace method swaps in a new tuple and returns the same instance, so one KeyValue can be reused across many records.
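
The byte-level check can be sketched as a plain integer comparison. The values 0 and 2 come from the I/O contract on this page; 1 for UPDATE_BEFORE and 3 for DELETE are assumed from Paimon's RowKind numbering, and `is_add_byte` here is an illustrative function rather than the library's implementation.

```python
INSERT = 0
UPDATE_BEFORE = 1  # assumed from Paimon's RowKind numbering
UPDATE_AFTER = 2
DELETE = 3  # assumed from Paimon's RowKind numbering


def is_add_byte(kind_byte: int) -> bool:
    """Return True for INSERT or UPDATE_AFTER without building an enum."""
    return kind_byte in (INSERT, UPDATE_AFTER)


kinds = [is_add_byte(b) for b in (INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE)]
print(kinds)  # [True, False, True, False]
```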

Usage

Use KeyValue when working with primary key tables, implementing merge operations, reading LSM-tree data files, or processing changelog streams where key-value semantics are required.
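
To show how the key, sequence number, and value kind might work together in a merge, here is a simplified deduplicating merge over raw row tuples. It is a sketch under stated assumptions, not Paimon's merge engine: `merge_latest` is a hypothetical function, the newest sequence number is assumed to win per key, and kind bytes 0 and 2 are treated as adds.

```python
from typing import Dict, Iterable


def merge_latest(rows: Iterable[tuple], key_arity: int) -> Dict[tuple, tuple]:
    """Hypothetical merge: keep the newest record per key, drop non-adds."""
    latest: Dict[tuple, tuple] = {}
    for row in rows:
        key = row[:key_arity]
        seq = row[key_arity]
        # On equal sequence numbers the later record in the input wins.
        if key not in latest or seq >= latest[key][key_arity]:
            latest[key] = row
    # A key whose newest record is not an add (for example a delete) is omitted.
    return {
        key: row[key_arity + 2:]
        for key, row in latest.items()
        if row[key_arity + 1] in (0, 2)
    }


merged = merge_latest(
    [
        (1, "a", 100, 0, "Alice"),
        (1, "a", 101, 2, "Alicia"),
        (2, "b", 100, 0, "Bob"),
        (2, "b", 102, 3, "Bob"),
    ],
    key_arity=2,
)
print(merged)  # {(1, 'a'): ('Alicia',)}
```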

Code Reference

Source Location

Signature

class KeyValue:
    """A key value, including user key, sequence number, value kind and value."""

    def __init__(self, key_arity: int, value_arity: int):
        """Initialize with key and value field counts."""

    def replace(self, row_tuple: tuple):
        """Replace underlying tuple data and return self."""

    def is_add(self) -> bool:
        """Check if this is an add operation (INSERT or UPDATE_AFTER)."""

    @property
    def key(self) -> OffsetRow:
        """Get key portion as OffsetRow."""

    @property
    def value(self) -> OffsetRow:
        """Get value portion as OffsetRow."""

    @property
    def sequence_number(self) -> int:
        """Get sequence number."""

    @property
    def value_row_kind_byte(self) -> int:
        """Get value kind as byte."""

Import

from pypaimon.table.row.key_value import KeyValue

I/O Contract

Inputs

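Name | Type | Required | Description
key_arity | int | Yes | Number of key fields (constructor argument)
value_arity | int | Yes | Number of value fields (constructor argument)
row_tuple | tuple | Yes | Flat row tuple passed to replace(): key fields, sequence number, value kind byte, value fields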

Outputs

Name | Type | Description
key | OffsetRow | Key portion of the record
value | OffsetRow | Value portion of the record
sequence_number | int | Sequence number for ordering
value_row_kind_byte | int | Value kind as byte (0=INSERT, 2=UPDATE_AFTER, etc.)
is_add() | bool | True if the value kind is INSERT or UPDATE_AFTER

Usage Examples

from pypaimon.table.row.key_value import KeyValue

# Create KeyValue for a table with 2 key fields and 3 value fields
kv = KeyValue(key_arity=2, value_arity=3)

# Tuple structure: (key_field_0, key_field_1, sequence_number, value_kind_byte,
#                    value_field_0, value_field_1, value_field_2)
row_tuple = (1, "partition1", 100, 0, "Alice", 30, "USA")

# Replace tuple data
kv.replace(row_tuple)

# Access key
key = kv.key
print(f"Key field 0: {key.get_field(0)}")  # 1
print(f"Key field 1: {key.get_field(1)}")  # "partition1"

# Access value
value = kv.value
print(f"Value field 0: {value.get_field(0)}")  # "Alice"
print(f"Value field 1: {value.get_field(1)}")  # 30
print(f"Value field 2: {value.get_field(2)}")  # "USA"

# Get sequence number
seq_num = kv.sequence_number
print(f"Sequence: {seq_num}")  # 100

# Check if add operation
if kv.is_add():
    print("This is an insert or update_after")

# Get raw value kind byte
kind_byte = kv.value_row_kind_byte
print(f"Kind byte: {kind_byte}")  # 0 for INSERT

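# Reuse the same KeyValue instance for the next record
# (kind byte 3 is DELETE in Paimon's RowKind, so is_add() is now False)
new_tuple = (2, "partition2", 101, 3, "Bob", 25, "UK")
kv.replace(new_tuple)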

# Process stream of KeyValue records
def merge_records(records: list):
    kv = KeyValue(key_arity=2, value_arity=3)
    for row_tuple in records:
        kv.replace(row_tuple)
        if kv.is_add():
            # Process addition (INSERT or UPDATE_AFTER)
            print(f"Add key {kv.key} with value {kv.value}")
        else:
            # Process retraction (any other kind, e.g. DELETE)
            print(f"Delete key {kv.key}")
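
One consequence of reusing a single instance in a loop like this: if the key and value accessors return the same reused objects on every call, as the Description states, any reference kept across replace calls will show the latest record. The sketch below illustrates that aliasing with a hypothetical `ReusedRow` stand-in. Both the class and its `to_tuple` helper are assumptions made for this example and are not part of pypaimon.

```python
class ReusedRow:
    """Hypothetical stand-in for a reused row view such as OffsetRow."""

    def __init__(self, offset: int, arity: int):
        self.offset = offset
        self.arity = arity
        self._row = None

    def replace(self, row: tuple) -> "ReusedRow":
        self._row = row
        return self

    def get_field(self, pos: int):
        return self._row[self.offset + pos]

    def to_tuple(self) -> tuple:
        # Hypothetical helper: materialize the window as an independent tuple.
        return tuple(self._row[self.offset:self.offset + self.arity])


view = ReusedRow(offset=0, arity=2)
rows = [(1, "a", 100, 0, "Alice"), (2, "b", 101, 0, "Bob")]

kept_views = []
kept_copies = []
for row in rows:
    view.replace(row)
    kept_views.append(view)  # stores the same object twice
    kept_copies.append(view.to_tuple())  # stores an independent snapshot

print([v.get_field(0) for v in kept_views])  # [2, 2]
print(kept_copies)  # [(1, 'a'), (2, 'b')]
```

Copying the needed fields out before the next replace call avoids the problem when records must outlive the loop iteration.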
