Implementation: Apache Hudi HoodieSchemaCompatibility.checkSchemaCompatible
| Knowledge Sources | |
|---|---|
| Domains | Data_Lake, Schema_Management |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
An Apache Hudi utility that verifies a writer schema is backward-compatible with the table schema before any records are written.
Description
HoodieSchemaCompatibility.checkSchemaCompatible is the primary guard that write paths invoke before producing records. It accepts the table schema, the writer schema, and control flags for validation and projection. The method performs two sequential checks:
- Missing field detection -- When projection is not allowed, it calls `HoodieSchemaUtils.findMissingFields` to compute every table schema field that the writer schema lacks (ignoring partition columns). If any are found, a `MissingSchemaFieldException` is thrown with the list of missing field names.
- Reader/writer compatibility analysis -- When validation is enabled and no partition columns are being dropped, it delegates to `HoodieSchemaCompatibilityChecker.checkReaderWriterCompatibility`. This method uses a stack-based tree walker (borrowed from Avro 1.10 with Hudi-specific relaxations) that compares each field pair for type compatibility. The result is a `SchemaPairCompatibility` containing a `SchemaCompatibilityType` (COMPATIBLE or INCOMPATIBLE) and a human-readable message.
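The missing-field step amounts to a set difference over field names. A minimal, self-contained sketch (simplified: real `HoodieSchema` fields are typed and nested, here they are plain names, and the class name is hypothetical):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class MissingFieldSketch {
    // Simplified stand-in for HoodieSchemaUtils.findMissingFields:
    // every table field the writer lacks, ignoring partition columns.
    static List<String> findMissingFields(Set<String> tableFields,
                                          Set<String> writerFields,
                                          Set<String> partitionColumns) {
        List<String> missing = new ArrayList<>();
        for (String field : tableFields) {
            if (!writerFields.contains(field) && !partitionColumns.contains(field)) {
                missing.add(field);
            }
        }
        Collections.sort(missing); // deterministic ordering for error messages
        return missing;
    }

    public static void main(String[] args) {
        // "name" is missing; "dt" is a partition column and is excluded.
        List<String> missing = findMissingFields(
            Set.of("id", "name", "dt"), Set.of("id"), Set.of("dt"));
        if (!missing.isEmpty()) {
            // At this point the real code throws MissingSchemaFieldException.
            System.out.println("Missing fields: " + missing);
        }
    }
}
```

The real implementation also recurses into nested record fields, which this sketch omits.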
HoodieSchemaCompatibilityChecker.checkReaderWriterCompatibility is the lower-level method that performs the actual tree walk. It constructs a ReaderWriterCompatibilityChecker with an optional naming override flag, runs the compatibility analysis, and wraps the result with context about both schemas.
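The core question the tree walker answers at each node is: can the reader's type safely read data written with the writer's type? A self-contained sketch of that primitive-promotion core, using an illustrative Avro-style promotion table (the class and its contents are simplified stand-ins, not Hudi's actual code, which also walks records, unions, arrays, and maps with an explicit stack):

```java
import java.util.Map;
import java.util.Set;

public class PromotionSketch {
    enum SchemaCompatibilityType { COMPATIBLE, INCOMPATIBLE }

    // For each writer type, the reader types that can safely read it
    // (illustrative subset of Avro's promotion rules).
    static final Map<String, Set<String>> PROMOTIONS = Map.of(
        "int",    Set.of("int", "long", "float", "double"),
        "long",   Set.of("long", "float", "double"),
        "float",  Set.of("float", "double"),
        "double", Set.of("double"),
        "string", Set.of("string", "bytes"),
        "bytes",  Set.of("bytes", "string"));

    static SchemaCompatibilityType check(String writerType, String readerType) {
        // Types without a promotion entry are only compatible with themselves.
        Set<String> readable = PROMOTIONS.getOrDefault(writerType, Set.of(writerType));
        return readable.contains(readerType)
            ? SchemaCompatibilityType.COMPATIBLE
            : SchemaCompatibilityType.INCOMPATIBLE;
    }

    public static void main(String[] args) {
        System.out.println("int -> long: " + check("int", "long"));
        System.out.println("long -> int: " + check("long", "int"));
    }
}
```

Note the asymmetry: an `int` writer with a `long` reader is compatible, but the reverse is not, which is why argument order matters in the real method.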
Usage
Use these methods when:
- Validating a writer schema before an INSERT, UPSERT, or BULK_INSERT operation.
- Checking whether an evolved schema can safely read existing table data.
- Performing pre-commit validation in a streaming Flink pipeline.
Code Reference
Source Location
- Repository: Apache Hudi
- File: hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaCompatibility.java (lines 91-120)
- File: hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaCompatibilityChecker.java (lines 78-101)
Signature
public static void checkSchemaCompatible(
HoodieSchema tableSchema,
HoodieSchema writerSchema,
boolean shouldValidate,
boolean allowProjection,
Set<String> partitionColumns)
public static SchemaPairCompatibility checkReaderWriterCompatibility(
final HoodieSchema reader,
final HoodieSchema writer,
boolean checkNamingOverride)
Import
import org.apache.hudi.common.schema.HoodieSchemaCompatibility;
import org.apache.hudi.common.schema.HoodieSchemaCompatibilityChecker;
import org.apache.hudi.common.schema.HoodieSchema;
import org.apache.hudi.common.schema.HoodieSchemaField;
import org.apache.hudi.exception.SchemaBackwardsCompatibilityException;
import org.apache.hudi.exception.MissingSchemaFieldException;
import java.util.Collections;
import java.util.Set;
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| tableSchema | HoodieSchema | Yes | The existing table schema to validate against |
| writerSchema | HoodieSchema | Yes | The writer schema to validate |
| shouldValidate | boolean | Yes | If true, perform the full reader/writer compatibility check; if false, skip it |
| allowProjection | boolean | Yes | If true, the writer may omit fields from the table schema; if false, all non-partition fields must be present |
| partitionColumns | Set<String> | No | Partition column names to exclude from missing-field checks (defaults to empty set) |
| reader | HoodieSchema | Yes | Reader schema for the lower-level compatibility check |
| writer | HoodieSchema | Yes | Writer schema for the lower-level compatibility check |
| checkNamingOverride | boolean | Yes | If true, enforce field name matching during compatibility analysis |
Outputs
| Name | Type | Description |
|---|---|---|
| (void) | void | checkSchemaCompatible returns void on success; throws SchemaBackwardsCompatibilityException or MissingSchemaFieldException on failure |
| SchemaPairCompatibility | SchemaPairCompatibility | Contains SchemaCompatibilityType (COMPATIBLE or INCOMPATIBLE), the reader schema, the writer schema, and a descriptive message |
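The result type described above can be pictured as a small immutable holder plus the caller-side branching pattern. A hypothetical, simplified stand-in (not Hudi's actual class, which also carries both schemas):

```java
// Simplified sketch of a SchemaPairCompatibility-style result holder.
public class SchemaPairSketch {
    enum SchemaCompatibilityType { COMPATIBLE, INCOMPATIBLE }

    private final SchemaCompatibilityType type;
    private final String description;

    SchemaPairSketch(SchemaCompatibilityType type, String description) {
        this.type = type;
        this.description = description;
    }

    SchemaCompatibilityType getType() { return type; }
    String getDescription() { return description; }

    public static void main(String[] args) {
        SchemaPairSketch result = new SchemaPairSketch(
            SchemaCompatibilityType.COMPATIBLE,
            "reader can read all writer fields");
        // Callers branch on the type and surface the message on failure.
        if (result.getType() == SchemaCompatibilityType.COMPATIBLE) {
            System.out.println("Schemas are compatible: " + result.getDescription());
        } else {
            System.err.println("Incompatible schemas: " + result.getDescription());
        }
    }
}
```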
Usage Examples
// Validate writer schema against table schema (strict mode, no projection)
HoodieSchema tableSchema = HoodieSchema.fromAvroSchema(tableAvroSchema);
HoodieSchema writerSchema = HoodieSchema.fromAvroSchema(writerAvroSchema);
HoodieSchemaCompatibility.checkSchemaCompatible(
tableSchema,
writerSchema,
true, // shouldValidate
false, // allowProjection = false -> all fields required
Collections.emptySet()
);
// If the above does not throw, the schemas are compatible.
// Lower-level check with detailed result.
// Argument order is (reader, writer): the new writer schema acts as the
// reader, verifying it can read data written with the existing table schema.
HoodieSchemaCompatibilityChecker.SchemaPairCompatibility result =
    HoodieSchemaCompatibilityChecker.checkReaderWriterCompatibility(
        writerSchema, tableSchema, true);
if (result.getType() == HoodieSchemaCompatibilityChecker.SchemaCompatibilityType.COMPATIBLE) {
System.out.println("Schemas are compatible: " + result.getDescription());
} else {
System.err.println("Incompatible schemas: " + result.getDescription());
}
// Allow projection (e.g., for partial update workloads)
HoodieSchemaCompatibility.checkSchemaCompatible(
tableSchema,
partialWriterSchema,
true, // shouldValidate
true, // allowProjection = true -> writer may omit columns
Set.of("dt", "region") // partition columns to exclude
);