Implementation:Apache Paimon KeyValue
| Knowledge Sources | |
|---|---|
| Domains | Primary Key Tables, LSM Tree |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
KeyValue represents one record of a primary key table: the user key, a sequence number, a value kind, and the value data.
Description
The KeyValue class provides an efficient representation for key-value records in Apache Paimon's LSM-tree storage. It wraps a tuple containing key fields, sequence number, value kind byte, and value fields, providing convenient accessor properties for each component.
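The tuple layout described above can be sketched with plain slicing. This is an illustration only, not pypaimon code: `split_row_tuple` is a hypothetical helper, and the layout (key fields, sequence number, value kind byte, value fields) is taken from the usage example on this page.

```python
from typing import Any, Tuple


def split_row_tuple(row_tuple: Tuple[Any, ...], key_arity: int, value_arity: int):
    """Split a flat tuple into (key, sequence_number, kind_byte, value).

    Hypothetical helper for illustration; assumes the layout
    (key fields..., sequence_number, value_kind_byte, value fields...).
    """
    expected = key_arity + 2 + value_arity
    if len(row_tuple) != expected:
        raise ValueError(f"expected {expected} fields, got {len(row_tuple)}")
    key = row_tuple[:key_arity]               # first key_arity fields
    sequence_number = row_tuple[key_arity]    # one slot after the key
    kind_byte = row_tuple[key_arity + 1]      # one slot after the sequence number
    value = row_tuple[key_arity + 2:]         # remaining value_arity fields
    return key, sequence_number, kind_byte, value


parts = split_row_tuple((1, "partition1", 100, 0, "Alice", 30, "USA"), 2, 3)
```

Unlike this sketch, which copies slices, KeyValue is described as exposing the key and value portions without allocating new objects per access.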
The class uses reusable OffsetRow objects to access key and value portions of the underlying tuple without creating new objects for each access. This optimization reduces memory allocations and improves performance in high-throughput scenarios.
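A minimal sketch of the reuse idea, under the assumption that a row view only stores an offset and a length into the shared tuple. `TupleWindow` is a hypothetical stand-in and is not pypaimon's OffsetRow; it is here to show why a reused view reflects whatever tuple was set most recently.

```python
class TupleWindow:
    """Hypothetical offset-based view over a tuple (not pypaimon's OffsetRow)."""

    def __init__(self, offset: int, arity: int):
        self.offset = offset
        self.arity = arity
        self.row_tuple = None

    def replace(self, row_tuple: tuple) -> "TupleWindow":
        # Point the same view object at new data; nothing is copied.
        self.row_tuple = row_tuple
        return self

    def get_field(self, pos: int):
        if not 0 <= pos < self.arity:
            raise IndexError(pos)
        return self.row_tuple[self.offset + pos]


# Value fields start after 2 key fields, the sequence number, and the kind byte.
value_view = TupleWindow(offset=4, arity=3)

value_view.replace((1, "partition1", 100, 0, "Alice", 30, "USA"))
first = value_view.get_field(0)   # copy the field out before the view is reused

value_view.replace((2, "partition2", 101, 3, "Bob", 25, "UK"))
second = value_view.get_field(0)  # the same view object now reads the new tuple
```

With this design, a caller that needs data to outlive the next `replace` call has to copy the fields out first, as `first` does here.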
KeyValue supports checking whether a record is an addition (INSERT or UPDATE_AFTER) through the is_add method, which operates directly on the value kind byte without constructing RowKind enum instances. The replace method allows reusing a KeyValue instance with different tuple data.
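The byte-level check can be sketched as a set membership test. The constants follow the row kind byte values 0=INSERT, 1=UPDATE_BEFORE, 2=UPDATE_AFTER, 3=DELETE; the function name `is_add_kind` is hypothetical and this is not the pypaimon implementation.

```python
# Row kind byte values; names defined here for illustration only.
INSERT = 0
UPDATE_BEFORE = 1
UPDATE_AFTER = 2
DELETE = 3

_ADD_KINDS = frozenset((INSERT, UPDATE_AFTER))


def is_add_kind(kind_byte: int) -> bool:
    """Return True for INSERT or UPDATE_AFTER without building an enum instance."""
    return kind_byte in _ADD_KINDS
```

Comparing the raw byte avoids an enum lookup per record, which is the motivation given above for `is_add`.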
Usage
Use KeyValue when working with primary key tables, implementing merge operations, reading LSM-tree data files, or processing changelog streams where key-value semantics are required.
Code Reference
Source Location
- Repository: Apache_Paimon
- File: paimon-python/pypaimon/table/row/key_value.py
Signature
class KeyValue:
    """A key value, including user key, sequence number, value kind and value."""
    def __init__(self, key_arity: int, value_arity: int):
        """Initialize with key and value field counts."""
    def replace(self, row_tuple: tuple):
        """Replace underlying tuple data and return self."""
    def is_add(self) -> bool:
        """Check if this is an add operation (INSERT or UPDATE_AFTER)."""
    @property
    def key(self) -> OffsetRow:
        """Get key portion as OffsetRow."""
    @property
    def value(self) -> OffsetRow:
        """Get value portion as OffsetRow."""
    @property
    def sequence_number(self) -> int:
        """Get sequence number."""
    @property
    def value_row_kind_byte(self) -> int:
        """Get value kind as byte."""
Import
from pypaimon.table.row.key_value import KeyValue
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| key_arity | int | Yes | Number of key fields |
| value_arity | int | Yes | Number of value fields |
| row_tuple | tuple | Yes | Row data tuple |
Outputs
| Name | Type | Description |
|---|---|---|
| key | OffsetRow | Key portion of the record |
| value | OffsetRow | Value portion of the record |
| sequence_number | int | Sequence number for ordering |
| value_row_kind_byte | int | Value kind as byte (0=INSERT, 1=UPDATE_BEFORE, 2=UPDATE_AFTER, 3=DELETE) |
| is_add() | bool | True if the value kind is INSERT or UPDATE_AFTER |
Usage Examples
from pypaimon.table.row.key_value import KeyValue
# Create KeyValue for a table with 2 key fields and 3 value fields
kv = KeyValue(key_arity=2, value_arity=3)
# Tuple structure: (key_field_0, key_field_1, sequence_number, value_kind_byte,
#                   value_field_0, value_field_1, value_field_2)
row_tuple = (1, "partition1", 100, 0, "Alice", 30, "USA")
# Replace tuple data
kv.replace(row_tuple)
# Access key
key = kv.key
print(f"Key field 0: {key.get_field(0)}")  # 1
print(f"Key field 1: {key.get_field(1)}")  # "partition1"
# Access value
value = kv.value
print(f"Value field 0: {value.get_field(0)}")  # "Alice"
print(f"Value field 1: {value.get_field(1)}")  # 30
print(f"Value field 2: {value.get_field(2)}")  # "USA"
# Get sequence number
seq_num = kv.sequence_number
print(f"Sequence: {seq_num}")  # 100
# Check if add operation
if kv.is_add():
    print("This is an insert or update_after")
# Get raw value kind byte
kind_byte = kv.value_row_kind_byte
print(f"Kind byte: {kind_byte}")  # 0 for INSERT
# Reuse the same KeyValue instance; kind byte 3 is DELETE, so is_add() is now False
new_tuple = (2, "partition2", 101, 3, "Bob", 25, "UK")
kv.replace(new_tuple)
# Process stream of KeyValue records
def merge_records(records: list):
    kv = KeyValue(key_arity=2, value_arity=3)
    for row_tuple in records:
        kv.replace(row_tuple)
        if kv.is_add():
            # Process addition
            print(f"Add key {kv.key} with value {kv.value}")
        else:
            # Process deletion
            print(f"Delete key {kv.key}")
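The loop above only prints each record. As a rough sketch of what a merge step might do with the same tuple layout, the function below keeps the record with the highest sequence number per key and drops keys whose latest record is not an add. This last-write-wins rule is a simplifying assumption for illustration, not Paimon's merge implementation, and `latest_per_key` is a hypothetical name. It works on plain tuples so that it runs without pypaimon.

```python
from typing import Any, Dict, Iterable, List, Tuple

ADD_KINDS = (0, 2)  # INSERT, UPDATE_AFTER


def latest_per_key(records: Iterable[Tuple[Any, ...]], key_arity: int) -> List[tuple]:
    """Keep the highest-sequence record per key, then drop non-add results.

    Assumes the layout (key fields..., sequence_number, kind_byte, value fields...).
    """
    latest: Dict[tuple, tuple] = {}
    for row in records:
        key = row[:key_arity]      # copy of the key fields, safe to use as a dict key
        seq = row[key_arity]
        current = latest.get(key)
        if current is None or seq >= current[key_arity]:
            latest[key] = row
    # A key whose newest record is UPDATE_BEFORE or DELETE produces no output row.
    return [row for row in latest.values() if row[key_arity + 1] in ADD_KINDS]


records = [
    (1, "a", 100, 0, "Alice"),   # insert
    (1, "a", 101, 3, "Alice"),   # delete supersedes the insert
    (2, "b", 102, 0, "Bob"),     # insert
    (2, "b", 103, 2, "Bobby"),   # update_after supersedes the insert
]
result = latest_per_key(records, key_arity=2)
```

The key is copied into a tuple before being stored, since a reused view over the current record would not be a stable dictionary key.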