Implementation:Apache Hudi SchemaChangeUtils IsTypeUpdateAllow

From Leeroopedia


Knowledge Sources
Domains Data_Lake, Schema_Management
Last Updated 2026-02-08 00:00 GMT

Overview

Concrete tool, provided by Apache Hudi, for validating column type promotions and detecting schema changes between InternalSchema versions.

Description

SchemaChangeUtils.isTypeUpdateAllow is the gatekeeper method that decides whether a column type promotion is permitted. It enforces a hard-coded type lattice:

  • INT can widen to LONG, FLOAT, DOUBLE, STRING, or DECIMAL.
  • LONG can widen to FLOAT, DOUBLE, STRING, or DECIMAL.
  • FLOAT can widen to DOUBLE, STRING, or DECIMAL.
  • DOUBLE can widen to STRING or DECIMAL.
  • STRING can convert to DATE, DECIMAL, or BINARY.
  • DATE and BINARY can convert to STRING.
  • DECIMAL can widen to another DECIMAL (if precision and scale are compatible) or to STRING.

Nested types are rejected outright.
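The promotion rules described above can be pictured as a lookup table from source type to the set of permitted target types. The following is a minimal, self-contained sketch of that idea, not Hudi's implementation: the class TypeLatticeSketch, the enum Kind, and the method isAllowed are hypothetical names introduced for illustration, and the DECIMAL precision/scale check and the handling of identical or nested types are deliberately left out.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

final class TypeLatticeSketch {
    // Hypothetical stand-in for Hudi's primitive type IDs (not the real Type.TypeID enum).
    enum Kind { INT, LONG, FLOAT, DOUBLE, STRING, DATE, BINARY, DECIMAL }

    // Source kind -> allowed target kinds, transcribed from the description above.
    private static final Map<Kind, Set<Kind>> ALLOWED = new EnumMap<>(Kind.class);

    static {
        ALLOWED.put(Kind.INT, EnumSet.of(Kind.LONG, Kind.FLOAT, Kind.DOUBLE, Kind.STRING, Kind.DECIMAL));
        ALLOWED.put(Kind.LONG, EnumSet.of(Kind.FLOAT, Kind.DOUBLE, Kind.STRING, Kind.DECIMAL));
        ALLOWED.put(Kind.FLOAT, EnumSet.of(Kind.DOUBLE, Kind.STRING, Kind.DECIMAL));
        ALLOWED.put(Kind.DOUBLE, EnumSet.of(Kind.STRING, Kind.DECIMAL));
        ALLOWED.put(Kind.STRING, EnumSet.of(Kind.DATE, Kind.DECIMAL, Kind.BINARY));
        ALLOWED.put(Kind.DATE, EnumSet.of(Kind.STRING));
        ALLOWED.put(Kind.BINARY, EnumSet.of(Kind.STRING));
        // DECIMAL -> DECIMAL also depends on precision and scale, which this sketch does not model.
        ALLOWED.put(Kind.DECIMAL, EnumSet.of(Kind.DECIMAL, Kind.STRING));
    }

    // Returns true when the table lists dst as a permitted target for src.
    // Identical non-DECIMAL kinds are not listed, so they return false here.
    static boolean isAllowed(Kind src, Kind dst) {
        return ALLOWED.getOrDefault(src, EnumSet.noneOf(Kind.class)).contains(dst);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed(Kind.INT, Kind.LONG));   // true
        System.out.println(isAllowed(Kind.STRING, Kind.INT)); // false
        System.out.println(isAllowed(Kind.LONG, Kind.INT));   // false: narrowing is not in the table
    }
}
```

A table-driven form keeps each rule on one line, which makes it easy to compare against the list above; the real method may be structured differently.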

InternalSchemaUtils.collectTypeChangedCols compares two InternalSchema versions and returns a map keyed by the top-level field position, where each value is a pair of (new type, old type). It iterates over the intersection of field IDs present in both schemas, checks for type inequality, and maps the result back to the position of the top-level parent field in the new schema's record.
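The comparison described above can be sketched without any Hudi classes by flattening each schema into plain maps. In this illustrative sketch, TypeChangeSketch and collectTypeChanged are hypothetical names, type names are plain strings, and the mapping from field ID to top-level position is passed in as an assumption rather than derived from a real InternalSchema.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class TypeChangeSketch {
    /**
     * newTypes / oldTypes: field ID -> type name.
     * newTopLevelPos: field ID -> position of its top-level parent field in the new schema.
     * Returns: top-level position -> (new type, old type) for fields whose type differs.
     */
    static Map<Integer, Map.Entry<String, String>> collectTypeChanged(
            Map<Integer, String> newTypes,
            Map<Integer, String> oldTypes,
            Map<Integer, Integer> newTopLevelPos) {
        Map<Integer, Map.Entry<String, String>> result = new TreeMap<>();
        for (Map.Entry<Integer, String> e : newTypes.entrySet()) {
            Integer id = e.getKey();
            String oldType = oldTypes.get(id);
            // Only IDs present in both schemas are compared; unchanged types are skipped.
            if (oldType == null || oldType.equals(e.getValue())) {
                continue;
            }
            result.put(newTopLevelPos.get(id), new SimpleImmutableEntry<>(e.getValue(), oldType));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> oldTypes = new HashMap<>();
        oldTypes.put(1, "string");
        oldTypes.put(2, "int");

        Map<Integer, String> newTypes = new HashMap<>();
        newTypes.put(1, "string");
        newTypes.put(2, "long");   // type changed
        newTypes.put(3, "double"); // added column, absent from the old schema

        Map<Integer, Integer> positions = new HashMap<>();
        positions.put(1, 0);
        positions.put(2, 1);
        positions.put(3, 2);

        System.out.println(collectTypeChanged(newTypes, oldTypes, positions)); // {1=long=int}
    }
}
```

Note the argument order mirrors the signature on this page: the new schema comes first, the old schema second, and each value pairs the new type with the old type.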

InternalSchemaUtils.collectRenameCols detects columns that share the same field ID across two schema versions but have different fully-qualified names. It returns a map from the new name to the last segment of the old name, which downstream readers use to locate the column in files written under the old schema.
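Rename detection as described above reduces to joining the two schemas on field ID and keeping the entries whose fully-qualified names differ. The sketch below illustrates that under simplifying assumptions: RenameSketch and collectRenames are hypothetical names, each schema is reduced to a map from field ID to fully-qualified name, and names are compared case-sensitively, which may not match Hudi's actual comparison.

```java
import java.util.HashMap;
import java.util.Map;

final class RenameSketch {
    /**
     * oldNames / newNames: field ID -> fully-qualified column name.
     * Returns: new fully-qualified name -> last segment of the old name.
     */
    static Map<String, String> collectRenames(
            Map<Integer, String> oldNames,
            Map<Integer, String> newNames) {
        Map<String, String> renames = new HashMap<>();
        for (Map.Entry<Integer, String> e : newNames.entrySet()) {
            String oldFull = oldNames.get(e.getKey());
            String newFull = e.getValue();
            // Same ID in both schemas but a different name means the column was renamed.
            if (oldFull != null && !oldFull.equals(newFull)) {
                // lastIndexOf returns -1 when there is no dot, so substring(0) keeps the whole name.
                String oldLeaf = oldFull.substring(oldFull.lastIndexOf('.') + 1);
                renames.put(newFull, oldLeaf);
            }
        }
        return renames;
    }

    public static void main(String[] args) {
        Map<Integer, String> v0 = new HashMap<>();
        v0.put(5, "user_name");
        Map<Integer, String> v1 = new HashMap<>();
        v1.put(5, "full_name");
        System.out.println(collectRenames(v0, v1)); // {full_name=user_name}
    }
}
```

Here the old schema is the first argument and the new schema the second, matching the collectRenameCols signature on this page.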

Usage

Use these methods when:

  • Validating an ALTER TABLE type change before committing it to the Hudi timeline.
  • Building a CastMap or projection that must reconcile file-era types with query-era types.
  • Detecting renamed columns so that Parquet readers can locate data under the old column name.

Code Reference

Source Location

  • Repository: Apache Hudi
  • File: hudi-common/src/main/java/org/apache/hudi/internal/schema/utils/SchemaChangeUtils.java
  • Lines: 55-63
  • File: hudi-common/src/main/java/org/apache/hudi/internal/schema/utils/InternalSchemaUtils.java
  • Lines: 206-226 (collectTypeChangedCols), 277-287 (collectRenameCols)

Signature

public static boolean isTypeUpdateAllow(Type src, Type dst)
public static Map<Integer, Pair<Type, Type>> collectTypeChangedCols(
    InternalSchema schema, InternalSchema oldSchema)
public static Map<String, String> collectRenameCols(
    InternalSchema oldSchema, InternalSchema newSchema)

Import

import org.apache.hudi.internal.schema.utils.SchemaChangeUtils;
import org.apache.hudi.internal.schema.utils.InternalSchemaUtils;
import org.apache.hudi.internal.schema.InternalSchema;
import org.apache.hudi.internal.schema.Type;
import org.apache.hudi.internal.schema.Types;
import org.apache.hudi.common.util.collection.Pair;
import org.apache.hudi.internal.schema.action.TableChanges;

I/O Contract

Inputs

Name | Type | Required | Description
src | Type | Yes | The original column type (must be a primitive type, not nested)
dst | Type | Yes | The target column type to promote to (must be a primitive type, not nested)
schema | InternalSchema | Yes | The new (type-changed) InternalSchema version
oldSchema | InternalSchema | Yes | The previous InternalSchema version for comparison
newSchema | InternalSchema | Yes | The modified InternalSchema version (for rename detection)

Outputs

Name | Type | Description
isTypeUpdateAllow result | boolean | True if the type promotion from src to dst is allowed by the type lattice; false otherwise
collectTypeChangedCols result | Map<Integer, Pair<Type, Type>> | Map from top-level field position to (newType, oldType) pair for every column whose type changed
collectRenameCols result | Map<String, String> | Map from new fully-qualified column name to the last segment of the old column name

Usage Examples

// Validate a type promotion from INT to LONG
boolean allowed = SchemaChangeUtils.isTypeUpdateAllow(
    Types.IntType.get(),
    Types.LongType.get());
// allowed == true

// Validate an illegal promotion from STRING to INT
boolean illegal = SchemaChangeUtils.isTypeUpdateAllow(
    Types.StringType.get(),
    Types.IntType.get());
// illegal == false

// Detect type changes between two schema versions
InternalSchema oldSchema = ... ; // version 0: column "age" is INT
InternalSchema newSchema = ... ; // version 1: column "age" is LONG
Map<Integer, Pair<Type, Type>> changed =
    InternalSchemaUtils.collectTypeChangedCols(newSchema, oldSchema);
// changed contains: {positionOfAge -> (LongType, IntType)}

// Detect renamed columns between two schema versions
InternalSchema schemaV0 = ... ; // column id=5 named "user_name"
InternalSchema schemaV1 = ... ; // column id=5 named "full_name"
Map<String, String> renames =
    InternalSchemaUtils.collectRenameCols(schemaV0, schemaV1);
// renames contains: {"full_name" -> "user_name"}

Related Pages

Implements Principle
