Implementation:Apache Paimon SpecialFields

From Leeroopedia


Knowledge Sources
Domains Schema Management, System Fields
Last Updated 2026-02-08 00:00 GMT

Overview

SpecialFields defines system fields and structured type field ID allocation schemes used internally by Apache Paimon for metadata and nested type management.

Description

SpecialFields is a utility class that manages two categories of special fields in Paimon's type system. First, it defines system fields with reserved field IDs starting from Integer.MAX_VALUE/2, including _SEQUENCE_NUMBER (for versioning), _VALUE_KIND (for change types), _LEVEL (for LSM tree levels), _KEY_* fields (for primary keys), rowkind (for audit logs), and _ROW_ID (for row tracking). These fields carry internal metadata and are handled specially by the storage and query engines.
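The system field checks can be illustrated with a small self-contained sketch. It does not use the Paimon classes: the class name SystemFieldCheckSketch and both method bodies are assumptions reconstructed from the constants and field names listed on this page (IDs at or above Integer.MAX_VALUE / 2, the _KEY_ prefix, and the named system fields), not a copy of the Paimon source.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the checks in SpecialFields; not the Paimon implementation.
final class SystemFieldCheckSketch {

    // Mirrors SYSTEM_FIELD_ID_START = Integer.MAX_VALUE / 2 (1073741823).
    static final int SYSTEM_FIELD_ID_START = Integer.MAX_VALUE / 2;
    static final String KEY_FIELD_PREFIX = "_KEY_";

    // Field names taken from the description on this page.
    static final Set<String> SYSTEM_FIELD_NAMES =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList(
                    "_SEQUENCE_NUMBER", "_VALUE_KIND", "_LEVEL", "rowkind", "_ROW_ID")));

    // Assumption: every ID in the reserved upper range counts as a system field.
    static boolean isSystemField(int fieldId) {
        return fieldId >= SYSTEM_FIELD_ID_START;
    }

    // Assumption: a name is system-managed if it carries the key prefix
    // or matches one of the known system field names.
    static boolean isSystemField(String name) {
        return name.startsWith(KEY_FIELD_PREFIX) || SYSTEM_FIELD_NAMES.contains(name);
    }

    public static void main(String[] args) {
        System.out.println(isSystemField(Integer.MAX_VALUE - 1)); // reserved range
        System.out.println(isSystemField(42));                    // ordinary user field ID
        System.out.println(isSystemField("_KEY_id"));             // key-prefixed name
        System.out.println(isSystemField("user_name"));           // ordinary user field name
    }
}
```

Keeping user field IDs in the lower half of the int range and system field IDs in the upper half means the ID check needs only a single comparison.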

Second, it provides ID allocation functions for structured type fields (array elements, map keys, and map values) used primarily in Parquet file schemas. These IDs are computed deterministically based on a base value (Integer.MAX_VALUE/4), the parent field ID, and nesting depth, allowing compute engines to read nested fields directly by ID without schema traversal. The allocation supports up to 1024 depth levels per field, sufficient for deeply nested structures.
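The arithmetic behind this allocation can be sketched without the Paimon dependency. The class name StructuredFieldIdSketch and its method names are hypothetical; the formulas restate the ones given in the Usage Examples on this page (base 536870911, 1024 slots per parent field), so treat this as an illustration of the scheme rather than the library code.

```java
// Hypothetical re-statement of the structured type field ID formulas; not the Paimon source.
final class StructuredFieldIdSketch {

    static final int BASE = Integer.MAX_VALUE / 4; // 536870911
    static final int DEPTH_LIMIT = 1 << 10;        // 1024 depth slots per parent field

    // Array elements grow upward from the base.
    static int arrayElementFieldId(int arrayFieldId, int depth) {
        return BASE + arrayFieldId * DEPTH_LIMIT + depth;
    }

    // Map keys grow downward from the base.
    static int mapKeyFieldId(int mapFieldId, int depth) {
        return BASE - (mapFieldId * DEPTH_LIMIT + depth);
    }

    // Map values grow upward from the base, like array elements.
    static int mapValueFieldId(int mapFieldId, int depth) {
        return BASE + mapFieldId * DEPTH_LIMIT + depth;
    }

    public static void main(String[] args) {
        System.out.println(arrayElementFieldId(10, 1)); // 536881152
        System.out.println(mapKeyFieldId(5, 1));        // 536865790
        System.out.println(mapValueFieldId(5, 1));      // 536876032
    }
}
```

Because map keys are placed below the base and map values above it, the key and value IDs of the same map at the same depth cannot coincide. Array elements and map values share one formula, which is safe because a given field at a given depth is either an array or a map, never both.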

The class also provides utility methods. isSystemField() checks whether a field ID or field name belongs to a system field, and rowTypeWithRowTracking() augments a row type with the row tracking fields (_ROW_ID and _SEQUENCE_NUMBER). The augmentation helpers handle field name conflicts and nullability settings, both of which are needed to enable row-level tracking. Because a structured type field ID depends only on the parent field ID and the nesting depth, the allocation is deterministic, and the IDs stay stable across schema evolution as long as the parent field ID is unchanged and the nesting depth stays below the depth limit.
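A rough model of the row tracking augmentation, using plain field names instead of Paimon's RowType, might look like the sketch below. The class RowTrackingSketch and its method are hypothetical, and rejecting a conflicting name with an exception is an assumption about how conflicts are handled; this page does not state the exact behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of row tracking augmentation over plain field names;
// the real helpers operate on RowType and DataField.
final class RowTrackingSketch {

    static final String ROW_ID = "_ROW_ID";
    static final String SEQUENCE_NUMBER = "_SEQUENCE_NUMBER";

    // Assumption: a user field that reuses a tracking field name is rejected.
    static List<String> withRowTracking(List<String> fieldNames) {
        for (String name : fieldNames) {
            if (ROW_ID.equals(name) || SEQUENCE_NUMBER.equals(name)) {
                throw new IllegalArgumentException(
                        "Row tracking field name '" + name + "' conflicts with an existing field");
            }
        }
        // Copy the input so the caller's list is left untouched, then append the tracking fields.
        List<String> result = new ArrayList<>(fieldNames);
        result.add(ROW_ID);
        result.add(SEQUENCE_NUMBER);
        return result;
    }

    public static void main(String[] args) {
        // prints [id, name, _ROW_ID, _SEQUENCE_NUMBER]
        System.out.println(withRowTracking(Arrays.asList("id", "name")));
    }
}
```

Appending the tracking fields after the user fields keeps the positions of the existing fields unchanged, so readers that address columns by position are not affected by the augmentation in this model.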

Usage

Use SpecialFields when implementing storage formats requiring stable field IDs (like Parquet), when checking if fields are system-managed, when adding row tracking capabilities to tables, or when working with nested type field ID resolution.

Code Reference

Source Location

Signature

public class SpecialFields {

    // System field constants
    public static final int SYSTEM_FIELD_ID_START = Integer.MAX_VALUE / 2;
    public static final String KEY_FIELD_PREFIX = "_KEY_";
    public static final int KEY_FIELD_ID_START = SYSTEM_FIELD_ID_START;
    public static final DataField SEQUENCE_NUMBER;
    public static final DataField VALUE_KIND;
    public static final DataField LEVEL;
    public static final DataField ROW_KIND;
    public static final DataField ROW_ID;
    public static final Set<String> SYSTEM_FIELD_NAMES;

    // Structured type field constants
    public static final int STRUCTURED_TYPE_FIELD_ID_BASE = Integer.MAX_VALUE / 4;
    public static final int STRUCTURED_TYPE_FIELD_DEPTH_LIMIT = 1 << 10;

    // System field checks
    public static boolean isSystemField(int fieldId)
    public static boolean isSystemField(String field)
    public static boolean isKeyField(String field)

    // Structured type field ID allocation
    public static int getArrayElementFieldId(int arrayFieldId, int depth)
    public static int getMapKeyFieldId(int mapFieldId, int depth)
    public static int getMapValueFieldId(int mapFieldId, int depth)

    // Row type augmentation
    public static RowType rowTypeWithRowTracking(RowType rowType)
    public static RowType rowTypeWithRowTracking(RowType rowType,
                                                  boolean rowIdNullable,
                                                  boolean sequenceNumberNullable)
    public static RowType rowTypeWithRowId(RowType rowType)
}

Import

import org.apache.paimon.table.SpecialFields;

I/O Contract

Inputs

Name | Type | Required | Description
fieldId | int | Context-dependent | Field ID to check or use in calculations
field | String | Context-dependent | Field name to check
rowType | RowType | For augmentation | Row type to add tracking fields to
depth | int | For nested types | Nesting depth for field ID calculation

Outputs

Name | Type | Description
Is system field | boolean | Whether the field is system-managed
Field ID | int | Calculated field ID for nested types
Augmented row type | RowType | Row type with added tracking fields

Usage Examples

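// Imports assumed by the snippets in this section
import java.util.Set;
import org.apache.paimon.table.SpecialFields;
import org.apache.paimon.types.DataField;
import org.apache.paimon.types.DataTypes;
import org.apache.paimon.types.RowType;

// Check if field ID is system field
int fieldId = 1073741823; // SYSTEM_FIELD_ID_START (Integer.MAX_VALUE / 2)
boolean isSystem = SpecialFields.isSystemField(fieldId); // true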

// Check if field name is system field
boolean isKeyField = SpecialFields.isSystemField("_KEY_id"); // true
boolean isSeqNum = SpecialFields.isSystemField("_SEQUENCE_NUMBER"); // true
boolean isRegular = SpecialFields.isSystemField("user_name"); // false

// Check if field is a key field
boolean isKey = SpecialFields.isKeyField("_KEY_user_id"); // true

// Calculate array element field ID
int arrayFieldId = 10;
int elementId = SpecialFields.getArrayElementFieldId(arrayFieldId, 1);
// Returns: 536870911 + 1024 * 10 + 1 = 536881152

// Calculate map key and value field IDs
int mapFieldId = 5;
int keyId = SpecialFields.getMapKeyFieldId(mapFieldId, 1);
// Returns: 536870911 - 1024 * 5 - 1 = 536865790
int valueId = SpecialFields.getMapValueFieldId(mapFieldId, 1);
// Returns: 536870911 + 1024 * 5 + 1 = 536876032

// Add row tracking fields to a row type
RowType originalType = DataTypes.ROW(
    DataTypes.FIELD(0, "id", DataTypes.INT()),
    DataTypes.FIELD(1, "name", DataTypes.STRING())
);

RowType withTracking = SpecialFields.rowTypeWithRowTracking(originalType);
// Now includes _ROW_ID and _SEQUENCE_NUMBER fields

// Add row tracking with nullable configuration
RowType withNullableTracking = SpecialFields.rowTypeWithRowTracking(
    originalType,
    true,  // rowIdNullable
    true   // sequenceNumberNullable
);

// Add only row ID field
RowType withRowId = SpecialFields.rowTypeWithRowId(originalType);

// Access system field constants
DataField seqNumber = SpecialFields.SEQUENCE_NUMBER;
int seqId = seqNumber.id(); // Integer.MAX_VALUE - 1
String seqName = seqNumber.name(); // "_SEQUENCE_NUMBER"

DataField valueKind = SpecialFields.VALUE_KIND;
DataField level = SpecialFields.LEVEL;
DataField rowKind = SpecialFields.ROW_KIND;

// Check system field names set
Set<String> sysFields = SpecialFields.SYSTEM_FIELD_NAMES;
boolean contains = sysFields.contains("_SEQUENCE_NUMBER"); // true

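// Nested type field ID calculation
// For ARRAY(MAP(INT, ARRAY(INT))) with outer array field ID 10, every nested ID
// is derived from the top-level field ID (10) and the nesting depth:
int outerArrayId = 10;
// Depth 1: the MAP that is the element of the outer array
int mapElementId = SpecialFields.getArrayElementFieldId(outerArrayId, 1);
// Depth 2: the INT key of the map
int mapKeyId = SpecialFields.getMapKeyFieldId(outerArrayId, 2);
// Depth 2: the ARRAY(INT) value of the map
int innerArrayValueId = SpecialFields.getMapValueFieldId(outerArrayId, 2);
// Depth 3: the INT element of the inner array
int innerElementId = SpecialFields.getArrayElementFieldId(outerArrayId, 3);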
