Implementation:Apache Paimon InternalRow
| Knowledge Sources | |
|---|---|
| Domains | Row Representation, Data Abstraction |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
InternalRow is the abstract base class that defines the contract for Paimon's internal row data structures: positional field access, a field count, and a RowKind for changelog semantics.
Description
The InternalRow class provides the fundamental abstraction for row data in Apache Paimon's internal representation. It defines the interface that all row implementations must follow, ensuring consistent access patterns regardless of the underlying storage format.
Implementations must provide positional field access (get_field), the row's change kind for changelog semantics (get_row_kind), and the number of fields (__len__). The field count deliberately excludes RowKind, because RowKind is metadata about the row's change type rather than row data.
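To illustrate how the change type is kept apart from the field count, here is a minimal, self-contained sketch. The `RowKind` and `InternalRow` classes below are stand-ins that mirror the contract described on this page so the snippet runs without pypaimon installed; the enum's numeric values and the `ListRow` class are assumptions made for illustration only.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, List


class RowKind(Enum):
    """Stand-in for pypaimon's RowKind; numeric values are arbitrary."""
    INSERT = 0
    UPDATE_BEFORE = 1
    UPDATE_AFTER = 2
    DELETE = 3


class InternalRow(ABC):
    """Stand-in mirroring the three abstract methods of the contract."""

    @abstractmethod
    def get_field(self, pos: int) -> Any: ...

    @abstractmethod
    def get_row_kind(self) -> RowKind: ...

    @abstractmethod
    def __len__(self) -> int: ...


class ListRow(InternalRow):
    """Hypothetical row backed by a plain Python list."""

    def __init__(self, values: List[Any], kind: RowKind):
        self._values = values
        self._kind = kind  # stored next to the values, not among them

    def get_field(self, pos: int) -> Any:
        return self._values[pos]

    def get_row_kind(self) -> RowKind:
        return self._kind

    def __len__(self) -> int:
        # Counts data fields only; the row kind is not included.
        return len(self._values)


inserted = ListRow([1, "Alice", 30], RowKind.INSERT)
deleted = ListRow([1, "Alice", 30], RowKind.DELETE)
```

Both rows report three fields even though they describe different changes, so code that iterates `range(len(row))` never encounters the row kind as a field.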
InternalRow implementations include BinaryRow for compact binary storage, GenericRow for in-memory Python object storage, and OffsetRow for efficient access to portions of larger row structures. The base class provides a default string representation that concatenates field values.
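The idea of exposing a portion of a larger row, together with a string representation built from field values, can be sketched as follows. This is not Paimon's OffsetRow: `SimpleRow` and `OffsetView` are hypothetical classes, the row kind is omitted for brevity, and the space-separated string format is an assumption inferred from the example output shown later on this page.

```python
from typing import Any, Sequence


class SimpleRow:
    """Hypothetical row covering only the field-access part of the contract."""

    def __init__(self, values: Sequence[Any]):
        self._values = list(values)

    def get_field(self, pos: int) -> Any:
        return self._values[pos]

    def __len__(self) -> int:
        return len(self._values)

    def __str__(self) -> str:
        # Assumed format: field values joined by single spaces.
        return " ".join(str(self.get_field(i)) for i in range(len(self)))


class OffsetView(SimpleRow):
    """Hypothetical view exposing `arity` fields of `row` starting at `offset`."""

    def __init__(self, row: SimpleRow, offset: int, arity: int):
        self._row = row
        self._offset = offset
        self._arity = arity

    def get_field(self, pos: int) -> Any:
        if not 0 <= pos < self._arity:
            raise IndexError(pos)
        # Translate the view-relative position into the wider row's position.
        return self._row.get_field(self._offset + pos)

    def __len__(self) -> int:
        return self._arity


# A wide row holding two leading fields followed by three value fields.
wide = SimpleRow(["k1", 7, 1, "Alice", 30])
value_part = OffsetView(wide, offset=2, arity=3)
```

Because the view only translates positions, no field values are copied, and the inherited `__str__` works unchanged since it relies solely on `get_field` and `__len__`.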
Usage
Use InternalRow as the interface type when writing code that works with different row representations, or extend it when implementing custom row storage formats that need to integrate with Paimon's row processing pipeline.
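As a rough sketch of what programming against the interface looks like, the helper below touches only `len(row)` and `row.get_field(pos)`, so it accepts any storage layout. The helper `row_to_dict` and the two row classes are hypothetical and duck-typed so the snippet runs without pypaimon; a real implementation would subclass InternalRow and also provide `get_row_kind`.

```python
from typing import Any, Dict, List


def row_to_dict(row, field_names: List[str]) -> Dict[str, Any]:
    """Hypothetical helper that depends only on len(row) and get_field(pos)."""
    if len(row) != len(field_names):
        raise ValueError("field_names must match the row's field count")
    return {name: row.get_field(i) for i, name in enumerate(field_names)}


class TupleRow:
    """Hypothetical row that stores its values in a tuple."""

    def __init__(self, *values: Any):
        self._values = values

    def get_field(self, pos: int) -> Any:
        return self._values[pos]

    def __len__(self) -> int:
        return len(self._values)


class ColumnarRow:
    """Hypothetical row that reads one record out of column-oriented storage."""

    def __init__(self, columns: List[List[Any]], index: int):
        self._columns = columns
        self._index = index

    def get_field(self, pos: int) -> Any:
        return self._columns[pos][self._index]

    def __len__(self) -> int:
        return len(self._columns)


names = ["id", "name", "age"]
columns = [[1, 2], ["Alice", "Bob"], [30, 25]]

from_tuple = row_to_dict(TupleRow(1, "Alice", 30), names)
from_columns = row_to_dict(ColumnarRow(columns, 1), names)
```

The same helper serves both layouts, which is the main benefit of typing parameters as the interface rather than as a concrete row class.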
Code Reference
Source Location
- Repository: Apache_Paimon
- File: paimon-python/pypaimon/table/row/internal_row.py
Signature
class InternalRow(ABC):
    """Base interface for an internal data structure representing data of RowType."""

    @abstractmethod
    def get_field(self, pos: int) -> Any:
        """Returns the value at the given position."""

    @abstractmethod
    def get_row_kind(self) -> RowKind:
        """Returns the kind of change that this row describes in a changelog."""

    @abstractmethod
    def __len__(self) -> int:
        """Returns the number of fields in this row.

        The number does not include RowKind. It is kept separately.
        """

    def __str__(self) -> str:
        """String representation of the row."""
Import
from pypaimon.table.row.internal_row import InternalRow
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| pos | int | Yes | Zero-based position of the field to read, passed to get_field |
Outputs
| Name | Type | Description |
|---|---|---|
| field_value | Any | Value at the specified position (returned by get_field) |
| row_kind | RowKind | Change type: INSERT, UPDATE_BEFORE, UPDATE_AFTER, or DELETE (returned by get_row_kind) |
| length | int | Number of fields in the row, excluding RowKind (returned by __len__) |
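To make the four change types more concrete, the sketch below replays a small changelog into a keyed state, treating INSERT and UPDATE_AFTER as additions and UPDATE_BEFORE and DELETE as retractions. Everything here is illustrative: the `RowKind` enum is a stand-in with arbitrary values, and `ChangeRow` and `apply_changelog` are hypothetical names, not part of pypaimon.

```python
from enum import Enum
from typing import Any, Dict, Iterable, List


class RowKind(Enum):
    """Stand-in for pypaimon's RowKind; numeric values are arbitrary."""
    INSERT = 0
    UPDATE_BEFORE = 1
    UPDATE_AFTER = 2
    DELETE = 3


class ChangeRow:
    """Hypothetical row exposing the three methods of the contract."""

    def __init__(self, kind: RowKind, *values: Any):
        self._kind = kind
        self._values = values

    def get_field(self, pos: int) -> Any:
        return self._values[pos]

    def get_row_kind(self) -> RowKind:
        return self._kind

    def __len__(self) -> int:
        return len(self._values)


def apply_changelog(rows: Iterable[ChangeRow], key_pos: int = 0) -> Dict[Any, List[Any]]:
    """Replay rows in order, keeping the latest values per key."""
    state: Dict[Any, List[Any]] = {}
    for row in rows:
        values = [row.get_field(i) for i in range(len(row))]
        key = values[key_pos]
        if row.get_row_kind() in (RowKind.INSERT, RowKind.UPDATE_AFTER):
            state[key] = values   # addition: insert or overwrite
        else:
            state.pop(key, None)  # retraction: UPDATE_BEFORE or DELETE
    return state


changelog = [
    ChangeRow(RowKind.INSERT, 1, "Alice", 30),
    ChangeRow(RowKind.UPDATE_BEFORE, 1, "Alice", 30),
    ChangeRow(RowKind.UPDATE_AFTER, 1, "Alice", 31),
    ChangeRow(RowKind.INSERT, 2, "Bob", 25),
    ChangeRow(RowKind.DELETE, 2, "Bob", 25),
]
final_state = apply_changelog(changelog)
```

After the replay only key 1 remains, carrying the values from its UPDATE_AFTER row, while key 2 has been removed by the DELETE.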
Usage Examples
from pypaimon.table.row.internal_row import InternalRow
from pypaimon.table.row.row_kind import RowKind
# Using InternalRow interface with different implementations
def process_row(row: InternalRow):
    """Process any InternalRow implementation."""
    row_kind = row.get_row_kind()
    if row_kind.is_add():
        # Process insert or update_after
        for i in range(len(row)):
            value = row.get_field(i)
            print(f"Field {i}: {value}")
    else:
        # Handle delete or update_before
        print(f"Delete/Update before: {row}")
# NOTE: binary_data and fields are placeholders that are not defined in this
# snippet; the caller must supply them.

# Works with BinaryRow
from pypaimon.table.row.binary_row import BinaryRow
binary_row = BinaryRow(binary_data, fields)
process_row(binary_row)

# Works with GenericRow
from pypaimon.table.row.generic_row import GenericRow
generic_row = GenericRow([1, "test", 42], fields, RowKind.INSERT)
process_row(generic_row)
# Implement custom row type
class CustomRow(InternalRow):
    def __init__(self, data):
        self._data = data

    def get_field(self, pos: int):
        return self._data[pos]

    def get_row_kind(self) -> RowKind:
        return RowKind.INSERT

    def __len__(self) -> int:
        return len(self._data)

custom = CustomRow([1, 2, 3])
process_row(custom)
# String representation
row = GenericRow([1, "Alice", 30], fields, RowKind.INSERT)
print(str(row)) # "1 Alice 30"