Implementation:Apache Paimon DataTypeJsonParser

From Leeroopedia
Revision as of 14:20, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Apache_Paimon_DataTypeJsonParser.md)


Knowledge Sources
Domains: Type System, Serialization
Last Updated: 2026-02-08 00:00 GMT

Overview

DataTypeJsonParser parses DataType and DataField instances from JSON representations and SQL-style type strings.

Description

DataTypeJsonParser is the parsing counterpart to DataType.serializeJson: it converts string representations back into DataType and DataField objects. It supports two input formats: JSON nodes (as produced by DataType.serializeJson) and SQL-style type strings (e.g., "VARCHAR(100)", "ROW<id INT, name STRING>"). It is used when reading table schemas from metadata files, processing DDL statements, and handling type definitions received through REST APIs.

The parser handles complex nested types recursively, including ARRAY, MAP, ROW, MULTISET, and VECTOR types. For JSON input, it distinguishes between simple textual type names and complex object structures with nested elements. The SQL string parser uses a tokenizer that recognizes keywords, identifiers, parameters, and structural characters, then builds a parse tree using recursive descent.
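The tokenizer-plus-recursive-descent approach described above can be illustrated with a small standalone sketch. This is not Paimon's implementation: the class TypeStringSketch, its Node tree, and the simplified grammar (a name, optional numeric parameters in parentheses, optional nested types in angle brackets) are assumptions made for illustration, and it deliberately ignores ROW field names and NOT NULL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative sketch only; names and grammar are hypothetical, not Paimon's. */
final class TypeStringSketch {

    /** Parse tree node: a type name, optional numeric parameters, optional nested types. */
    static final class Node {
        final String name;
        final List<String> params = new ArrayList<>();
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder(name);
            if (!params.isEmpty()) {
                sb.append('(').append(String.join(", ", params)).append(')');
            }
            if (!children.isEmpty()) {
                sb.append('<');
                for (int i = 0; i < children.size(); i++) {
                    if (i > 0) sb.append(", ");
                    sb.append(children.get(i));
                }
                sb.append('>');
            }
            return sb.toString();
        }
    }

    private final List<String> tokens;
    private int pos = 0;

    private TypeStringSketch(List<String> tokens) { this.tokens = tokens; }

    /** Splits the input into words (names or numbers) and the structural characters < > ( ) , */
    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if ("<>(),".indexOf(c) >= 0) {
                out.add(String.valueOf(c));
                i++;
            } else if (Character.isLetterOrDigit(c) || c == '_') {
                int start = i;
                while (i < input.length()
                        && (Character.isLetterOrDigit(input.charAt(i)) || input.charAt(i) == '_')) {
                    i++;
                }
                out.add(input.substring(start, i));
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at " + i);
            }
        }
        return out;
    }

    /** Entry point: tokenizes, parses one type, and rejects trailing tokens. */
    static Node parse(String input) {
        TypeStringSketch p = new TypeStringSketch(tokenize(input));
        Node node = p.parseType();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("Trailing token: " + p.tokens.get(p.pos));
        }
        return node;
    }

    // type := NAME [ '(' NUMBER (',' NUMBER)* ')' ] [ '<' type (',' type)* '>' ]
    private Node parseType() {
        String name = next();
        if (!Character.isLetter(name.charAt(0))) {
            throw new IllegalArgumentException("Expected a type name but got " + name);
        }
        Node node = new Node(name.toUpperCase(Locale.ROOT));
        if (peekIs("(")) {
            next();
            node.params.add(number());
            while (peekIs(",")) { next(); node.params.add(number()); }
            expect(")");
        }
        if (peekIs("<")) {
            next();
            node.children.add(parseType()); // recursion handles arbitrary nesting
            while (peekIs(",")) { next(); node.children.add(parseType()); }
            expect(">");
        }
        return node;
    }

    private String number() {
        String t = next();
        if (!t.chars().allMatch(Character::isDigit)) {
            throw new IllegalArgumentException("Expected a number but got " + t);
        }
        return t;
    }

    private boolean peekIs(String t) { return pos < tokens.size() && tokens.get(pos).equals(t); }

    private String next() {
        if (pos >= tokens.size()) throw new IllegalArgumentException("Unexpected end of input");
        return tokens.get(pos++);
    }

    private void expect(String t) {
        String got = next();
        if (!got.equals(t)) throw new IllegalArgumentException("Expected " + t + " but got " + got);
    }

    public static void main(String[] args) {
        System.out.println(parse("MAP<STRING, ARRAY<VARCHAR(100)>>"));
    }
}
```

Each call to parseType consumes exactly one complete type and calls itself for every nested type, which is what lets one small method cover any nesting depth. A production parser would additionally map each name to a concrete DataType and validate parameter counts per type.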

For JSON inputs that lack explicit field IDs, the parser assigns them automatically. It accepts an AtomicInteger fieldId counter that is incremented each time a field without an ID is parsed, and it rejects input that mixes explicit IDs with auto-assigned ones. This lets the same code path read both legacy schemas (without field IDs) and current schemas (with explicit IDs).
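The ID assignment rule can be sketched in isolation as follows. This is a minimal illustration, not Paimon's code: the class FieldIdSketch and the method resolveIds are hypothetical, a null entry stands in for a field whose JSON has no "id", and two details are assumptions: that the counter is incremented before use (so a counter starting at -1 yields 0 first) and that a mix of explicit and missing IDs raises IllegalStateException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; not Paimon's implementation. */
final class FieldIdSketch {

    /**
     * Resolves one ID per field. A null entry means the input carried no explicit ID,
     * so the next value is drawn from the shared counter.
     */
    static List<Integer> resolveIds(List<Integer> explicitIds, AtomicInteger counter) {
        boolean sawExplicit = false;
        boolean sawMissing = false;
        List<Integer> resolved = new ArrayList<>();
        for (Integer explicit : explicitIds) {
            if (explicit != null) {
                sawExplicit = true;
                resolved.add(explicit);
            } else {
                sawMissing = true;
                // Assumption: increment first, then use the new value.
                resolved.add(counter.incrementAndGet());
            }
            if (sawExplicit && sawMissing) {
                // Assumption: mixing is reported as an IllegalStateException.
                throw new IllegalStateException("Partial field id is not allowed");
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Legacy schema: no IDs at all, so every field draws from the counter.
        System.out.println(resolveIds(Arrays.asList(null, null, null), new AtomicInteger(-1)));
        // Current schema: every field has an ID, so the counter is never touched.
        System.out.println(resolveIds(Arrays.asList(5, 7), new AtomicInteger(-1)));
    }
}
```

Passing the counter in, rather than keeping it inside the parser, means nested ROW types parsed in the same call can draw from one sequence, while separate parses can each start from a fresh counter.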

Usage

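Use DataTypeJsonParser when deserializing table schemas from storage, parsing DDL type specifications, or implementing REST API endpoints that accept type definitions. All entry points are static and the parser keeps no state between calls, so it is safe for concurrent use. The only mutable state involved is the caller-supplied fieldId counter, so use a separate counter for each independent parse.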

Code Reference

Source Location

Signature

public final class DataTypeJsonParser {
    public static DataField parseDataField(JsonNode json);

    public static DataType parseDataType(JsonNode json);

    public static DataType parseDataType(
        JsonNode json,
        AtomicInteger fieldId
    );

    public static DataType parseAtomicTypeSQLString(String string);
}

Import

import org.apache.paimon.types.DataTypeJsonParser;

I/O Contract

Inputs

Name     Type           Required  Description
json     JsonNode       yes       JSON node containing type definition
string   String         yes       SQL-style type string (e.g., "VARCHAR(100)")
fieldId  AtomicInteger  no        Counter for auto-assigning field IDs

Outputs

Name       Type       Description
dataType   DataType   Parsed data type instance
dataField  DataField  Parsed field with ID, name, type, and metadata

Usage Examples

Parsing SQL Type Strings

// Parse simple atomic types
DataType intType = DataTypeJsonParser.parseAtomicTypeSQLString("INT");
DataType varcharType = DataTypeJsonParser.parseAtomicTypeSQLString("VARCHAR(100)");
DataType decimalType = DataTypeJsonParser.parseAtomicTypeSQLString("DECIMAL(18,2)");

// Parse timestamp with local time zone
DataType timestampType = DataTypeJsonParser.parseAtomicTypeSQLString(
    "TIMESTAMP(3) WITH LOCAL TIME ZONE"
);

// Parse with NOT NULL constraint
DataType notNullType = DataTypeJsonParser.parseAtomicTypeSQLString(
    "VARCHAR(50) NOT NULL"
);

Parsing JSON Type Definitions

// Parse an atomic type from a JSON node; simple types are represented as plain JSON strings
ObjectMapper mapper = new ObjectMapper();
JsonNode typeNode = mapper.readTree("\"VARCHAR(100)\"");
DataType parsedType = DataTypeJsonParser.parseDataType(typeNode);

// Parse complex nested type
String complexJson = """
{
  "type": "ROW",
  "fields": [
    {"id": 0, "name": "id", "type": "BIGINT"},
    {"id": 1, "name": "name", "type": "STRING"},
    {"id": 2, "name": "tags", "type": {"type": "ARRAY", "element": "STRING"}}
  ]
}
""";
JsonNode complexNode = mapper.readTree(complexJson);
DataType rowType = DataTypeJsonParser.parseDataType(complexNode);

Parsing DataFields with Auto ID Assignment

// Parse field with explicit ID
String fieldJson = """
{
  "id": 5,
  "name": "user_id",
  "type": "BIGINT",
  "description": "User identifier"
}
""";
JsonNode fieldNode = mapper.readTree(fieldJson);
DataField field = DataTypeJsonParser.parseDataField(fieldNode);

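// Parse a ROW whose fields carry no explicit IDs; the counter supplies them
AtomicInteger idCounter = new AtomicInteger(0);
String rowWithoutIds = """
{
  "type": "ROW",
  "fields": [
    {"name": "email", "type": "STRING", "description": "Email address"}
  ]
}
""";
JsonNode rowNode = mapper.readTree(rowWithoutIds);
// Pass the whole ROW node: a bare "STRING" type node contains no field to assign an ID to
DataType rowWithAutoIds = DataTypeJsonParser.parseDataType(rowNode, idCounter);
// Assuming the counter is incremented before use, the "email" field is assigned ID 1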

Parsing Nested Collection Types

// Parse ARRAY type
DataType arrayType = DataTypeJsonParser.parseAtomicTypeSQLString(
    "ARRAY<STRING>"
);

// Parse MAP type
DataType mapType = DataTypeJsonParser.parseAtomicTypeSQLString(
    "MAP<STRING, INT>"
);

// Parse VECTOR type
DataType vectorType = DataTypeJsonParser.parseAtomicTypeSQLString(
    "VECTOR<FLOAT, 128>"
);

// Parse nested ROW type
String nestedRowSql = "ROW<id BIGINT, address ROW<street STRING, city STRING>>";
DataType nestedType = DataTypeJsonParser.parseAtomicTypeSQLString(nestedRowSql);

Error Handling

try {
    // Invalid type string
    DataType invalid = DataTypeJsonParser.parseAtomicTypeSQLString(
        "INVALID_TYPE(100)"
    );
} catch (IllegalArgumentException e) {
    // Handle parsing error
    System.err.println("Parse error: " + e.getMessage());
}

// Validate mixed field ID usage
try {
    String mixedIdJson = """
    {
      "type": "ROW",
      "fields": [
        {"id": 0, "name": "id", "type": "INT"},
        {"name": "name", "type": "STRING"}
      ]
    }
    """;
    AtomicInteger counter = new AtomicInteger(-1);
    JsonNode node = mapper.readTree(mixedIdJson);
    DataType type = DataTypeJsonParser.parseDataType(node, counter);
    // Throws: "Partial field id is not allowed"
} catch (IllegalStateException e) {
    System.err.println("Mixed ID error: " + e.getMessage());
}
