
Implementation:Apache Paimon InternalRow

From Leeroopedia
Revision as of 14:21, 16 February 2026 by Admin (Auto-imported from implementations/Apache_Paimon_InternalRow.md)


Knowledge Sources
Domains: Row Representation, Data Abstraction
Last Updated: 2026-02-08 00:00 GMT

Overview

InternalRow is the abstract base class that defines the contract for Apache Paimon's internal row data structures, including access to each row's RowKind.

Description

The InternalRow class provides the fundamental abstraction for row data in Apache Paimon's internal representation. It defines the interface that all row implementations must follow, ensuring consistent access patterns regardless of the underlying storage format.

The interface requires implementations to provide field access by position (get_field), the row kind used for changelog semantics (get_row_kind), and the field count (__len__). The field count deliberately excludes the RowKind, because RowKind is metadata about the row's change type rather than row data.
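The separation between field count and row kind can be illustrated with a small, self-contained sketch. The RowKind enum and InternalRow class below are local stand-ins that mirror the signature documented on this page (the enum values are illustrative), and ListRow is a hypothetical implementation that is not part of pypaimon.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, List


class RowKind(Enum):
    """Stand-in for pypaimon's RowKind; member values are illustrative."""
    INSERT = 0
    UPDATE_BEFORE = 1
    UPDATE_AFTER = 2
    DELETE = 3


class InternalRow(ABC):
    """Stand-in mirroring the three abstract methods described above."""

    @abstractmethod
    def get_field(self, pos: int) -> Any: ...

    @abstractmethod
    def get_row_kind(self) -> RowKind: ...

    @abstractmethod
    def __len__(self) -> int: ...


class ListRow(InternalRow):
    """Hypothetical row backed by a Python list."""

    def __init__(self, values: List[Any], kind: RowKind = RowKind.INSERT):
        self._values = values
        self._kind = kind  # stored next to the data, not inside it

    def get_field(self, pos: int) -> Any:
        return self._values[pos]

    def get_row_kind(self) -> RowKind:
        return self._kind

    def __len__(self) -> int:
        return len(self._values)  # RowKind is not counted as a field


row = ListRow([7, "bob", 3.5], RowKind.DELETE)
field_count = len(row)                                 # counts data fields only
kind = row.get_row_kind()                              # change type, kept separately
values = [row.get_field(i) for i in range(len(row))]   # positional access
```

Because the three methods are abstract, the stand-in base class cannot be instantiated directly; only a subclass that supplies all of them can.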

InternalRow implementations include BinaryRow for compact binary storage, GenericRow for in-memory Python object storage, and OffsetRow for efficient access to portions of larger row structures. The base class provides a default string representation that concatenates field values.
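As a rough illustration of the "portion of a larger row" idea and of a concatenating string representation, the sketch below defines a hypothetical WindowRow. It is not pypaimon's OffsetRow, whose constructor and internals are not shown on this page; the space separator in __str__ is an assumption, and get_row_kind is omitted for brevity.

```python
from typing import Any, Sequence


class WindowRow:
    """Hypothetical view over `arity` fields of a wider row, starting at `offset`.

    It duck-types get_field and __len__ from the InternalRow contract;
    it is not pypaimon's OffsetRow.
    """

    def __init__(self, source: Sequence[Any], offset: int, arity: int):
        if offset < 0 or arity < 0 or offset + arity > len(source):
            raise ValueError("window falls outside the source row")
        self._source = source
        self._offset = offset
        self._arity = arity

    def get_field(self, pos: int) -> Any:
        if not 0 <= pos < self._arity:
            raise IndexError(f"position {pos} out of range")
        # Translate the local position into the wider row; nothing is copied.
        return self._source[self._offset + pos]

    def __len__(self) -> int:
        return self._arity

    def __str__(self) -> str:
        # Assumed format: field values joined by single spaces.
        return " ".join(str(self.get_field(i)) for i in range(len(self)))


wide = ["k1", "k2", 10, "Alice", 30]
value_part = WindowRow(wide, offset=2, arity=3)  # skips the two key fields
text = str(value_part)
```

The view only stores an offset and a length, so several windows can share one underlying row without duplicating its data.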

Usage

Use InternalRow as the interface type when writing code that works with different row representations, or extend it when implementing custom row storage formats that need to integrate with Paimon's row processing pipeline.
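One way to picture code written against the interface rather than a concrete row class is a small changelog fold. In the sketch below, RowKind is a local stand-in with illustrative values, TupleRow is a hypothetical row type, and treating field 0 as the key is an assumption made only for this example.

```python
from enum import Enum
from typing import Any, Dict, Iterable, Tuple


class RowKind(Enum):
    """Stand-in for pypaimon's RowKind; member values are illustrative."""
    INSERT = 0
    UPDATE_BEFORE = 1
    UPDATE_AFTER = 2
    DELETE = 3


class TupleRow:
    """Hypothetical row exposing the three InternalRow methods."""

    def __init__(self, kind: RowKind, *values: Any):
        self._kind = kind
        self._values = values

    def get_field(self, pos: int) -> Any:
        return self._values[pos]

    def get_row_kind(self) -> RowKind:
        return self._kind

    def __len__(self) -> int:
        return len(self._values)


def apply_changelog(rows: Iterable[Any]) -> Dict[Any, Tuple[Any, ...]]:
    """Fold changelog rows into a key -> latest-values mapping.

    Assumes field 0 is the key (an illustrative convention only).
    """
    state: Dict[Any, Tuple[Any, ...]] = {}
    for row in rows:
        key = row.get_field(0)
        kind = row.get_row_kind()
        if kind in (RowKind.INSERT, RowKind.UPDATE_AFTER):
            state[key] = tuple(row.get_field(i) for i in range(len(row)))
        else:
            # UPDATE_BEFORE and DELETE both retract the previous image.
            state.pop(key, None)
    return state


log = [
    TupleRow(RowKind.INSERT, 1, "Alice"),
    TupleRow(RowKind.INSERT, 2, "Bob"),
    TupleRow(RowKind.UPDATE_BEFORE, 1, "Alice"),
    TupleRow(RowKind.UPDATE_AFTER, 1, "Alicia"),
    TupleRow(RowKind.DELETE, 2, "Bob"),
]
final_state = apply_changelog(log)
```

apply_changelog touches rows only through get_field, get_row_kind, and len, so any row type that honors the contract could be passed in unchanged.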

Code Reference

Source Location

Signature

class InternalRow(ABC):
    """Base interface for an internal data structure representing data of RowType."""

    @abstractmethod
    def get_field(self, pos: int) -> Any:
        """Returns the value at the given position."""

    @abstractmethod
    def get_row_kind(self) -> RowKind:
        """Returns the kind of change that this row describes in a changelog."""

    @abstractmethod
    def __len__(self) -> int:
        """Returns the number of fields in this row.
        The number does not include RowKind. It is kept separately.
        """

    def __str__(self) -> str:
        """String representation of the row."""

Import

from pypaimon.table.row.internal_row import InternalRow

I/O Contract

Inputs

Name | Type | Required | Description
pos | int | Yes | Zero-based field position passed to get_field

Outputs

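Name | Type | Description
field_value | Any | Value at the specified position (returned by get_field)
row_kind | RowKind | Change type: INSERT, UPDATE_BEFORE, UPDATE_AFTER, or DELETE (returned by get_row_kind)
length | int | Number of fields in the row, excluding RowKind (returned by __len__)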

Usage Examples

from pypaimon.table.row.internal_row import InternalRow
from pypaimon.table.row.row_kind import RowKind

# Using InternalRow interface with different implementations
def process_row(row: InternalRow):
    """Process any InternalRow implementation."""
    row_kind = row.get_row_kind()

    if row_kind.is_add():
        # Process insert or update_after
        for i in range(len(row)):
            value = row.get_field(i)
            print(f"Field {i}: {value}")
    else:
        # Handle delete or update_before
        print(f"Delete/Update before: {row}")

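# Works with BinaryRow.
# NOTE: `binary_data` and `fields` are placeholders that this snippet does not
# define; the caller must supply them before running it.
from pypaimon.table.row.binary_row import BinaryRow
binary_row = BinaryRow(binary_data, fields)
process_row(binary_row)

# Works with GenericRow (reuses the `fields` placeholder from above)
from pypaimon.table.row.generic_row import GenericRow
generic_row = GenericRow([1, "test", 42], fields, RowKind.INSERT)
process_row(generic_row)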

# Implement custom row type
class CustomRow(InternalRow):
    def __init__(self, data):
        self._data = data

    def get_field(self, pos: int):
        return self._data[pos]

    def get_row_kind(self) -> RowKind:
        return RowKind.INSERT

    def __len__(self) -> int:
        return len(self._data)

custom = CustomRow([1, 2, 3])
process_row(custom)

# String representation
row = GenericRow([1, "Alice", 30], fields, RowKind.INSERT)
print(str(row))  # "1 Alice 30"
