Implementation:Apache Paimon RowType

From Leeroopedia


Knowledge Sources
Domains: Type System, Schema Definition
Last Updated: 2026-02-08 00:00 GMT

Overview

RowType defines the row data type for Paimon's type system, representing a structured sequence of named and typed fields that corresponds to a table's schema.

Description

RowType is the central composite type in Paimon's type system: it directly represents table schemas and nested structures. It extends DataType with the type root DataTypeRoot.ROW and holds an immutable list of DataField instances, each carrying a field ID, name, type, optional description, and optional default value. The class has been annotated @Public since version 0.4.0, which marks it as a stable part of the Paimon public API.

The implementation keeps lazily initialized lookup maps for field access by name (nameToField, nameToIndex) and by field ID (fieldIdToField, fieldIdToIndex). Each map is computed on first access and cached for later queries. Fields are validated during construction: every field name must contain at least one non-whitespace character, field names must be unique, and field IDs must not be duplicated.
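
The two ideas in this paragraph, validation at construction time and a lookup map built on first use, can be sketched with JDK classes only. This is a minimal illustration, not Paimon's code: LazyRowSketch, its fields, and lookupMapBuilt() are hypothetical names, and the sketch is not thread-safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a row type; not Paimon's actual class. */
final class LazyRowSketch {
    private final List<String> names;
    private final int[] ids;
    // Stays null until the first lookup by name, then is reused.
    private Map<String, Integer> nameToIndex;

    LazyRowSketch(List<String> names, int[] ids) {
        if (names.size() != ids.length) {
            throw new IllegalArgumentException("names and ids must have the same length");
        }
        validate(names, ids);
        this.names = new ArrayList<>(names);
        this.ids = ids.clone();
    }

    /** Rejects blank names, duplicate names, and duplicate IDs. */
    private static void validate(List<String> names, int[] ids) {
        Set<String> seenNames = new HashSet<>();
        Set<Integer> seenIds = new HashSet<>();
        for (int i = 0; i < ids.length; i++) {
            String name = names.get(i);
            if (name == null || name.trim().isEmpty()) {
                throw new IllegalArgumentException("Field name needs a non-whitespace character");
            }
            if (!seenNames.add(name)) {
                throw new IllegalArgumentException("Duplicate field name: " + name);
            }
            if (!seenIds.add(ids[i])) {
                throw new IllegalArgumentException("Duplicate field ID: " + ids[i]);
            }
        }
    }

    /** Returns the position of the named field, or -1 if it is absent. */
    int getFieldIndex(String name) {
        if (nameToIndex == null) {
            Map<String, Integer> map = new HashMap<>();
            for (int i = 0; i < names.size(); i++) {
                map.put(names.get(i), i);
            }
            nameToIndex = map;
        }
        Integer index = nameToIndex.get(name);
        return index == null ? -1 : index;
    }

    /** Exposed only so the demo can show when the map is built. */
    boolean lookupMapBuilt() {
        return nameToIndex != null;
    }

    public static void main(String[] args) {
        LazyRowSketch row = new LazyRowSketch(Arrays.asList("id", "name"), new int[] {0, 1});
        System.out.println(row.lookupMapBuilt());       // false
        System.out.println(row.getFieldIndex("name"));  // 1
        System.out.println(row.getFieldIndex("email")); // -1
        System.out.println(row.lookupMapBuilt());       // true
    }
}
```

Validating in the constructor means an invalid instance can never be observed, and deferring the map keeps construction cheap for row types that are never queried by name.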

RowType supports field projection (extracting a subset of fields), JSON serialization, SQL string rendering, equality checks (with or without field ID comparison), and a Builder for constructing instances with auto-incrementing field IDs. It also provides utility methods for finding the highest field ID in a schema (used during schema evolution) and for creating RowType instances from various input formats.

Usage

Use RowType when defining table schemas, working with nested structures, implementing schema evolution logic, or when type-safe field access is required. The Builder pattern is preferred for constructing new instances with automatic field ID management.

Code Reference

Source Location

Signature

@Public
public final class RowType extends DataType {
    public RowType(boolean isNullable, List<DataField> fields);
    public RowType(List<DataField> fields);

    public List<DataField> getFields();
    public List<String> getFieldNames();
    public List<DataType> getFieldTypes();
    public DataType getTypeAt(int i);
    public int getFieldCount();
    public int getFieldIndex(String fieldName);
    public boolean containsField(String fieldName);
    public boolean containsField(int fieldId);
    public DataField getField(String fieldName);
    public DataField getField(int fieldId);

    public RowType project(int[] mapping);
    public RowType project(List<String> names);
    public RowType project(String... names);

    public static RowType.Builder builder();
    public static RowType.Builder builder(AtomicInteger fieldId);
    public static int currentHighestFieldId(List<DataField> fields);
}

Import

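import org.apache.paimon.types.DataField;
import org.apache.paimon.types.DataType;
import org.apache.paimon.types.DataTypes;
import org.apache.paimon.types.RowType;

// JDK classes used by the usage examples
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;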

I/O Contract

Inputs

Name | Type | Required | Description
fields | List<DataField> | yes | List of fields defining the row structure
isNullable | boolean | no | Whether the row type itself can be null (default: true)

Outputs

Name | Type | Description
rowType | RowType | Immutable row type with validated fields
projectedType | RowType | New row type with a subset of fields
fieldIndex | int | Index of field by name (-1 if not found)
field | DataField | Field definition by name or ID

Usage Examples

Creating RowType with Builder

// Using builder for automatic field ID assignment
RowType userType = RowType.builder()
    .field("user_id", DataTypes.BIGINT())
    .field("name", DataTypes.STRING(), "User full name")
    .field("email", DataTypes.STRING(), "Email address", "unknown@example.com")
    .field("age", DataTypes.INT())
    .build();

// Builder with explicit field ID counter
AtomicInteger idCounter = new AtomicInteger(10);
RowType customType = RowType.builder(idCounter)
    .field("id", DataTypes.BIGINT())
    .field("value", DataTypes.DOUBLE())
    .build();
// Fields will have IDs 11, 12
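
The IDs 11 and 12 in the example are what a pre-incrementing counter produces when it starts at 10. The JDK-only sketch below shows that arithmetic. It assumes the builder calls incrementAndGet() once per field, which matches the IDs stated above but is not taken from Paimon's source. IdAssigningBuilderSketch is a hypothetical class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical builder that only records "id:name" pairs. */
final class IdAssigningBuilderSketch {
    private final AtomicInteger counter;
    private final List<String> assigned = new ArrayList<>();

    IdAssigningBuilderSketch(AtomicInteger counter) {
        this.counter = counter;
    }

    IdAssigningBuilderSketch field(String name) {
        // Pre-increment: the counter's starting value is never used as an ID.
        assigned.add(counter.incrementAndGet() + ":" + name);
        return this;
    }

    List<String> build() {
        return new ArrayList<>(assigned);
    }

    public static void main(String[] args) {
        AtomicInteger idCounter = new AtomicInteger(10);
        List<String> first = new IdAssigningBuilderSketch(idCounter)
                .field("id")
                .field("value")
                .build();
        System.out.println(first);           // prints [11:id, 12:value]
        System.out.println(idCounter.get()); // prints 12

        // A second builder sharing the counter continues where the first stopped.
        List<String> second = new IdAssigningBuilderSketch(idCounter).field("extra").build();
        System.out.println(second);          // prints [13:extra]
    }
}
```

Because the counter is passed in and not owned by the builder, several builders can share one counter and still hand out distinct IDs.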

Creating RowType from DataFields

// Create fields explicitly
List<DataField> fields = Arrays.asList(
    new DataField(0, "id", DataTypes.BIGINT()),
    new DataField(1, "name", DataTypes.STRING()),
    new DataField(2, "created_at", DataTypes.TIMESTAMP(3))
);

// Create row type
RowType rowType = new RowType(fields);

// Create NOT NULL row type
RowType notNullType = new RowType(false, fields);

Field Access Operations

RowType rowType = RowType.builder()
    .field("id", DataTypes.BIGINT())
    .field("name", DataTypes.STRING())
    .field("status", DataTypes.INT())
    .build();

// Access by name
int nameIndex = rowType.getFieldIndex("name");  // Returns: 1
DataField nameField = rowType.getField("name");
boolean hasEmail = rowType.containsField("email");  // Returns: false

// Access by field ID
DataField fieldById = rowType.getField(0);  // Get "id" field
int indexById = rowType.getFieldIndexByFieldId(1);  // Get index of "name"

// Get field properties
List<String> fieldNames = rowType.getFieldNames();
List<DataType> fieldTypes = rowType.getFieldTypes();
int fieldCount = rowType.getFieldCount();
DataType statusType = rowType.getTypeAt(2);

Field Projection

RowType fullType = RowType.builder()
    .field("id", DataTypes.BIGINT())
    .field("name", DataTypes.STRING())
    .field("age", DataTypes.INT())
    .field("email", DataTypes.STRING())
    .build();

// Project by field names
RowType projected = fullType.project("id", "name");
// Result: RowType with only id and name fields

// Project by field name list
List<String> selectFields = Arrays.asList("name", "email");
RowType projected2 = fullType.project(selectFields);

// Project by indexes
int[] indexes = {0, 2};  // id and age
RowType projected3 = fullType.project(indexes);

// Get projection indexes
int[] projIndexes = fullType.projectIndexes(Arrays.asList("email", "id"));
// Returns: [3, 0]
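
Projection by name reduces to two steps: translate each requested name into its position in the full schema, then pick the fields at those positions in the requested order. The sketch below shows this on plain name lists. ProjectionSketch and its methods are hypothetical, and throwing on an unknown name is a choice made for this sketch, not documented Paimon behaviour.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative projection over a list of field names; not Paimon's code. */
final class ProjectionSketch {

    /** Maps each selected name to its index in the schema, keeping the selection order. */
    static int[] projectIndexes(List<String> schema, List<String> selected) {
        int[] result = new int[selected.size()];
        for (int i = 0; i < result.length; i++) {
            int index = schema.indexOf(selected.get(i));
            if (index < 0) {
                // Assumption of this sketch: unknown names are an error.
                throw new IllegalArgumentException("Unknown field: " + selected.get(i));
            }
            result[i] = index;
        }
        return result;
    }

    /** Builds the projected schema by reading the schema at each mapped index. */
    static List<String> project(List<String> schema, int[] mapping) {
        String[] out = new String[mapping.length];
        for (int i = 0; i < mapping.length; i++) {
            out[i] = schema.get(mapping[i]);
        }
        return Arrays.asList(out);
    }

    public static void main(String[] args) {
        List<String> schema = Arrays.asList("id", "name", "age", "email");
        int[] indexes = projectIndexes(schema, Arrays.asList("email", "id"));
        System.out.println(Arrays.toString(indexes));     // prints [3, 0]
        System.out.println(project(schema, indexes));     // prints [email, id]
        System.out.println(project(schema, new int[] {0, 2})); // prints [id, age]
    }
}
```

The result follows the order of the selection, not the order of the original schema, which is why the example yields [3, 0] and not [0, 3].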

Schema Validation and Comparison

// Validate schema
try {
    List<DataField> invalidFields = Arrays.asList(
        new DataField(0, "id", DataTypes.BIGINT()),
        new DataField(1, "id", DataTypes.STRING())  // Duplicate name
    );
    RowType invalid = new RowType(invalidFields);
} catch (IllegalArgumentException e) {
    System.err.println("Validation error: " + e.getMessage());
}

// Compare row types
RowType type1 = RowType.builder()
    .field("id", DataTypes.BIGINT())
    .field("name", DataTypes.STRING())
    .build();

RowType type2 = RowType.builder()
    .field("id", DataTypes.BIGINT())
    .field("name", DataTypes.STRING())
    .build();

// Exact equality (including field IDs)
boolean equal = type1.equals(type2);

// Equality ignoring field IDs
boolean structuralEqual = type1.equalsIgnoreFieldId(type2);

// Check if pruned from another type
boolean isPruned = type1.isPrunedFrom(type2);
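
The difference between the two equality checks only shows up when field IDs differ. The JDK-only sketch below uses deliberately different IDs to make that visible. EqualitySketch and its nested Field class are hypothetical stand-ins, and types are reduced to plain strings.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: compares field lists with and without field IDs. */
final class EqualitySketch {

    /** Hypothetical stand-in for a field: ID, name, and a type name. */
    static final class Field {
        final int id;
        final String name;
        final String type;

        Field(int id, String name, String type) {
            this.id = id;
            this.name = name;
            this.type = type;
        }

        /** Strict comparison: the ID takes part. */
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Field)) {
                return false;
            }
            Field other = (Field) o;
            return id == other.id && sameShape(other);
        }

        @Override
        public int hashCode() {
            return 31 * (31 * id + name.hashCode()) + type.hashCode();
        }

        /** Relaxed comparison: name and type only. */
        boolean sameShape(Field other) {
            return name.equals(other.name) && type.equals(other.type);
        }
    }

    /** True when both lists have the same names and types in the same order. */
    static boolean equalsIgnoreFieldId(List<Field> a, List<Field> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            if (!a.get(i).sameShape(b.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Field> a = Arrays.asList(new Field(0, "id", "BIGINT"), new Field(1, "name", "STRING"));
        List<Field> b = Arrays.asList(new Field(5, "id", "BIGINT"), new Field(6, "name", "STRING"));
        System.out.println(a.equals(b));               // prints false (IDs differ)
        System.out.println(equalsIgnoreFieldId(a, b)); // prints true (names and types match)
    }
}
```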

Schema Evolution Support

// Get highest field ID for evolution
List<DataField> currentFields = Arrays.asList(
    new DataField(0, "id", DataTypes.BIGINT()),
    new DataField(1, "name", DataTypes.STRING()),
    new DataField(5, "added_field", DataTypes.INT())
);
RowType currentType = new RowType(currentFields);

int highestId = RowType.currentHighestFieldId(currentFields);
// Returns: 5

// Add new field with next ID
int nextId = highestId + 1;  // 6
DataField newField = new DataField(nextId, "new_column", DataTypes.DOUBLE());
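
The ID arithmetic in this example can be checked in isolation. The sketch below works on a flat array of IDs. FieldIdSketch is a hypothetical class, nested fields are ignored, and returning -1 for an empty schema is an assumption of the sketch, not a statement about Paimon.

```java
/** Illustrative only: next-ID calculation over a flat list of field IDs. */
final class FieldIdSketch {

    /** Returns the largest ID, or -1 for an empty schema (assumption of this sketch). */
    static int currentHighestFieldId(int[] fieldIds) {
        int highest = -1;
        for (int id : fieldIds) {
            highest = Math.max(highest, id);
        }
        return highest;
    }

    /** The next free ID is one past the current maximum. */
    static int nextFieldId(int[] fieldIds) {
        return currentHighestFieldId(fieldIds) + 1;
    }

    public static void main(String[] args) {
        int[] ids = {0, 1, 5}; // IDs need not be contiguous
        System.out.println(currentHighestFieldId(ids)); // prints 5
        System.out.println(nextFieldId(ids));           // prints 6
        System.out.println(nextFieldId(new int[0]));    // prints 0
    }
}
```

Since IDs can have gaps, the next ID is derived from the maximum and not from the number of fields.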

Serialization and Display

RowType rowType = RowType.builder()
    .field("id", DataTypes.BIGINT())
    .field("data", DataTypes.STRING())
    .build();

// Get SQL string representation (field names are rendered backtick-quoted)
String sqlString = rowType.asSQLString();
// Returns: "ROW<`id` BIGINT, `data` STRING>"

// Get NOT NULL SQL string
String notNullSql = rowType.notNull().asSQLString();
// Returns: "ROW<`id` BIGINT, `data` STRING> NOT NULL"

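// Serialize to JSON. serializeJson writes to a Jackson JsonGenerator and does not
// return a String; JsonSerdeUtil (org.apache.paimon.utils) wraps that step.
// Verify the helper against the Paimon version in use.
String json = JsonSerdeUtil.toJson(rowType);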
