
Implementation:Apache Hudi HoodieSchemaCompatibility CheckSchemaCompatible

From Leeroopedia


Knowledge Sources
Domains Data_Lake, Schema_Management
Last Updated 2026-02-08 00:00 GMT

Overview

A concrete tool, provided by Apache Hudi, for verifying that a writer schema is backward-compatible with the table schema before any records are written.

Description

HoodieSchemaCompatibility.checkSchemaCompatible is the primary guard that write paths invoke before producing records. It accepts the table schema, the writer schema, and control flags for validation and projection. The method performs two sequential checks:

  1. Missing field detection -- When projection is not allowed, it calls HoodieSchemaUtils.findMissingFields to compute every table schema field that the writer schema lacks (ignoring partition columns). If any are found, a MissingSchemaFieldException is thrown with the list of missing field names.
  2. Reader/writer compatibility analysis -- When validation is enabled and no partition columns are being dropped, it delegates to HoodieSchemaCompatibilityChecker.checkReaderWriterCompatibility. This method uses a stack-based tree walker (borrowed from Avro 1.10 with Hudi-specific relaxations) that compares each field pair for type compatibility. The result is a SchemaPairCompatibility containing a SchemaCompatibilityType (COMPATIBLE or INCOMPATIBLE) and a human-readable message.

HoodieSchemaCompatibilityChecker.checkReaderWriterCompatibility is the lower-level method that performs the actual tree walk. It constructs a ReaderWriterCompatibilityChecker with an optional naming override flag, runs the compatibility analysis, and wraps the result with context about both schemas.
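The missing-field phase (step 1 above) can be illustrated with a self-contained sketch. The helper below is a hypothetical, flattened stand-in for HoodieSchemaUtils.findMissingFields, which in Hudi also recurses into nested record types; only the contract is shown here: every non-partition table field must appear in the writer schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Simplified sketch of the missing-field check. The real
// HoodieSchemaUtils.findMissingFields walks nested records; this flat
// version only illustrates the rule that every non-partition table
// field must be present in the writer schema.
public class MissingFieldSketch {
    static List<String> findMissingFields(List<String> tableFields,
                                          Set<String> writerFields,
                                          Set<String> partitionColumns) {
        List<String> missing = new ArrayList<>();
        for (String field : tableFields) {
            // Partition columns are ignored; any other absent field is reported
            if (!writerFields.contains(field) && !partitionColumns.contains(field)) {
                missing.add(field);
            }
        }
        return missing; // non-empty result would trigger MissingSchemaFieldException
    }

    public static void main(String[] args) {
        List<String> table = List.of("id", "name", "price", "dt");
        Set<String> writer = Set.of("id", "name");
        // "dt" is a partition column, so only "price" is reported missing
        System.out.println(findMissingFields(table, writer, Set.of("dt")));
    }
}
```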

Usage

Use these methods when:

  • Validating a writer schema before an INSERT, UPSERT, or BULK_INSERT operation.
  • Checking whether an evolved schema can safely read existing table data.
  • Performing pre-commit validation in a streaming Flink pipeline.
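The reader/writer phase that backs these use cases can also be sketched. The table below mirrors Avro's primitive promotion rules (an int written to disk can be read as a long, and so on); it is an illustrative simplification, not Hudi's actual checker, which walks full schema trees with a stack and applies Hudi-specific relaxations to unions and records.

```java
import java.util.Map;
import java.util.Set;

// Simplified sketch of primitive reader/writer compatibility, modeled on
// Avro's promotion rules. The real ReaderWriterCompatibilityChecker
// handles records, unions, arrays, and maps as well.
public class PromotionSketch {
    // reader type -> writer types it can additionally decode
    static final Map<String, Set<String>> PROMOTIONS = Map.of(
        "long",   Set.of("int"),
        "float",  Set.of("int", "long"),
        "double", Set.of("int", "long", "float"),
        "string", Set.of("bytes"),
        "bytes",  Set.of("string"));

    static boolean isCompatible(String readerType, String writerType) {
        return readerType.equals(writerType)
            || PROMOTIONS.getOrDefault(readerType, Set.of()).contains(writerType);
    }

    public static void main(String[] args) {
        System.out.println(isCompatible("long", "int"));  // true: int promotes to long
        System.out.println(isCompatible("int", "long"));  // false: would lose precision
    }
}
```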

Code Reference

Source Location

  • Repository: Apache Hudi
  • File: hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaCompatibility.java
  • Lines: 91-120
  • File: hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaCompatibilityChecker.java
  • Lines: 78-101

Signature

public static void checkSchemaCompatible(
    HoodieSchema tableSchema,
    HoodieSchema writerSchema,
    boolean shouldValidate,
    boolean allowProjection,
    Set<String> partitionColumns)
public static SchemaPairCompatibility checkReaderWriterCompatibility(
    final HoodieSchema reader,
    final HoodieSchema writer,
    boolean checkNamingOverride)

Import

import org.apache.hudi.common.schema.HoodieSchemaCompatibility;
import org.apache.hudi.common.schema.HoodieSchemaCompatibilityChecker;
import org.apache.hudi.common.schema.HoodieSchema;
import org.apache.hudi.common.schema.HoodieSchemaField;
import org.apache.hudi.exception.SchemaBackwardsCompatibilityException;
import org.apache.hudi.exception.MissingSchemaFieldException;

import java.util.Collections;
import java.util.Set;

I/O Contract

Inputs

  • tableSchema (HoodieSchema, required) -- The existing table schema to validate against
  • writerSchema (HoodieSchema, required) -- The writer schema to validate
  • shouldValidate (boolean, required) -- If true, perform the full reader/writer compatibility check; if false, skip it
  • allowProjection (boolean, required) -- If true, the writer may omit fields from the table schema; if false, all non-partition fields must be present
  • partitionColumns (Set<String>, optional) -- Partition column names to exclude from missing-field checks (defaults to empty set)
  • reader (HoodieSchema, required) -- Reader schema for the lower-level compatibility check
  • writer (HoodieSchema, required) -- Writer schema for the lower-level compatibility check
  • checkNamingOverride (boolean, required) -- If true, enforce field name matching during compatibility analysis

Outputs

  • checkSchemaCompatible -- void: returns normally on success; throws SchemaBackwardsCompatibilityException or MissingSchemaFieldException on failure
  • checkReaderWriterCompatibility -- SchemaPairCompatibility: contains the SchemaCompatibilityType (COMPATIBLE or INCOMPATIBLE), the reader schema, the writer schema, and a descriptive message

Usage Examples

// Validate writer schema against table schema (strict mode, no projection)
HoodieSchema tableSchema = HoodieSchema.fromAvroSchema(tableAvroSchema);
HoodieSchema writerSchema = HoodieSchema.fromAvroSchema(writerAvroSchema);

HoodieSchemaCompatibility.checkSchemaCompatible(
    tableSchema,
    writerSchema,
    true,   // shouldValidate
    false,  // allowProjection = false -> all fields required
    Collections.emptySet()
);
// If the above does not throw, the schemas are compatible.

// Lower-level check with detailed result
HoodieSchemaCompatibilityChecker.SchemaPairCompatibility result =
    HoodieSchemaCompatibilityChecker.checkReaderWriterCompatibility(
        writerSchema,  // reader: the evolved schema that must read existing data
        tableSchema,   // writer: the schema the existing data was written with
        true);         // checkNamingOverride: enforce field-name matching

if (result.getType() == HoodieSchemaCompatibilityChecker.SchemaCompatibilityType.COMPATIBLE) {
    System.out.println("Schemas are compatible: " + result.getDescription());
} else {
    System.err.println("Incompatible schemas: " + result.getDescription());
}

// Allow projection (e.g., for partial update workloads)
HoodieSchemaCompatibility.checkSchemaCompatible(
    tableSchema,
    partialWriterSchema,
    true,  // shouldValidate
    true,  // allowProjection = true -> writer may omit columns
    Set.of("dt", "region")  // partition columns to exclude
);
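When wiring the check into a write path, the usual pattern is validate-then-write: call checkSchemaCompatible first and produce records only if it returns normally. The sketch below uses a stand-in exception and a stand-in check so it is self-contained; in real code you would catch MissingSchemaFieldException and SchemaBackwardsCompatibilityException from the imports above.

```java
// Self-contained sketch of the validate-then-write guard. The exception
// class and check method are hypothetical stand-ins for Hudi's
// MissingSchemaFieldException and checkSchemaCompatible.
public class GuardSketch {
    static class MissingSchemaFieldEx extends RuntimeException {
        MissingSchemaFieldEx(String msg) { super(msg); }
    }

    // Stand-in for HoodieSchemaCompatibility.checkSchemaCompatible:
    // returns normally on success, throws on failure.
    static void checkSchemaCompatible(boolean schemasMatch) {
        if (!schemasMatch) {
            throw new MissingSchemaFieldEx("writer schema is missing table fields");
        }
    }

    static String guardedWrite(boolean schemasMatch) {
        try {
            checkSchemaCompatible(schemasMatch);
            return "WROTE";                        // safe to INSERT/UPSERT
        } catch (MissingSchemaFieldEx e) {
            return "REJECTED: " + e.getMessage();  // fail fast, nothing written
        }
    }

    public static void main(String[] args) {
        System.out.println(guardedWrite(true));
        System.out.println(guardedWrite(false));
    }
}
```

The key property of the guard is that validation happens before any record is produced, so an incompatible schema rejects the whole write rather than corrupting the table mid-batch.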

Related Pages

Implements Principle
