
Environment:Apache Hudi Java Maven Build Environment

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Build_System
Last Updated: 2026-02-08 20:00 GMT

Overview

Java 11/17, Maven 3.6+, and Scala 2.12/2.13 build environment for compiling Apache Hudi from source across multiple Spark and Flink versions.

Description

This environment defines the build-time prerequisites for compiling the Apache Hudi project. The project uses Maven as its build system and targets Java 11 by default, with Java 17 required for Spark 4.0 builds. Scala 2.12 is the default binary version; Scala 2.13 is supported only for Spark bundles (not Flink). The build produces engine-specific bundles (Spark, Flink) that must match the target runtime version.

Usage

Use this environment when building Apache Hudi from source for development, testing, or producing custom bundles. It is the prerequisite for all other Hudi environments (Docker demo, Flink runtime) since they consume the compiled JARs.

System Requirements

| Category | Requirement                     | Notes                                          |
|----------|---------------------------------|------------------------------------------------|
| OS       | Unix-like (Linux, Mac OS X)     | Windows not officially supported               |
| Java     | Java 11 (default) or Java 17    | Java 17 required only for Spark 4.0 builds     |
| Maven    | >= 3.6.0                        | Maven Enforcer Plugin 3.6.2 used for validation |
| Git      | Any recent version              | Required for source checkout                   |
| Disk     | ~10 GB                          | Full build with all modules                    |
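The requirements above can be verified up front with a small shell sketch. The `version_ge` helper and the output parsing are illustrative assumptions (vendor `java -version` formats vary), not part of the Hudi build:

```shell
#!/usr/bin/env sh
# Sketch: check build prerequisites for Hudi (Java 11/17, Maven >= 3.6.0).
# version_ge A B -> success if dotted version A >= B (relies on sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Parsing below assumes typical `java -version` / `mvn -version` output;
# adjust the awk patterns for your JDK vendor if needed.
java_ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
mvn_ver=$(mvn -version 2>/dev/null | awk '/Apache Maven/ {print $3; exit}')

version_ge "$java_ver" 11    && echo "Java $java_ver OK"  || echo "Need Java 11+"
version_ge "$mvn_ver" 3.6.0  && echo "Maven $mvn_ver OK"  || echo "Need Maven 3.6.0+"
```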

Dependencies

System Packages

  • Java Development Kit (JDK) 11 or 17
  • Apache Maven >= 3.6.0
  • Git

Build Profiles (Spark)

| Maven Option             | Spark Version | Scala Version | Notes              |
|--------------------------|---------------|---------------|--------------------|
| (empty)                  | 3.5.5         | 2.12.15       | Default            |
| -Dspark3.3               | 3.3.4         | 2.12.15       | Spark 3.3.2+       |
| -Dspark3.4               | 3.4.3         | 2.12.15       | Spark 3.4.x        |
| -Dspark3.5 -Dscala-2.13  | 3.5.5         | 2.13.8        | Scala 2.13 variant |
| -Dspark4.0               | 4.0.1         | 2.13.8        | Requires Java 17   |
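The table above can be captured as a small helper that maps a target Spark line to the corresponding Maven flags. The function name `spark_build_flags` is a hypothetical convenience, not part of the Hudi build scripts:

```shell
# Sketch: map a target Spark line to the Maven flags from the table above.
spark_build_flags() {
    case "$1" in
        3.3)            echo "-Dspark3.3" ;;
        3.4)            echo "-Dspark3.4" ;;
        3.5)            echo "" ;;                        # default profile
        3.5-scala2.13)  echo "-Dspark3.5 -Dscala-2.13" ;; # Scala 2.13 variant
        4.0)            echo "-Dspark4.0" ;;              # remember: needs Java 17
        *) echo "unsupported Spark line: $1" >&2; return 1 ;;
    esac
}

# Usage: mvn clean package -DskipTests $(spark_build_flags 3.4)
```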

Build Profiles (Flink)

| Maven Option | Flink Version | Notes            |
|--------------|---------------|------------------|
| (empty)      | 1.20.1        | Default          |
| -Dflink1.17  | 1.17.1        | Oldest supported |
| -Dflink1.18  | 1.18.1        |                  |
| -Dflink1.19  | 1.19.2        |                  |
| -Dflink1.20  | 1.20.1        | Same as default  |
| -Dflink2.0   | 2.0.0         |                  |
| -Dflink2.1   | 2.1.1         | Newest supported |
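Because the Flink flags follow a regular `-Dflink<minor>` pattern, the lookup can be sketched even more compactly. The `flink_build_flag` helper below is illustrative; the accepted versions mirror the table above:

```shell
# Sketch: derive the Maven profile flag for a supported Flink minor version.
flink_build_flag() {
    case "$1" in
        1.17|1.18|1.19|1.20|2.0|2.1) echo "-Dflink$1" ;;
        *) echo "unsupported Flink line: $1" >&2; return 1 ;;
    esac
}

# Usage: mvn clean package -DskipTests $(flink_build_flag 1.18)
```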

Key Library Dependencies

  • hadoop = 2.10.2
  • hive = 2.3.10
  • avro = 1.11.4
  • parquet = 1.13.1
  • kafka = 2.0.0
  • rocksdbjni = 7.5.3

Credentials

No credentials required for building from source.

Quick Install

# Prerequisites: Java 11+, Maven 3.6+, Git

# Clone and build with defaults (Spark 3.5, Flink 1.20, Scala 2.12)
git clone https://github.com/apache/hudi.git && cd hudi
mvn clean package -DskipTests

# Build for specific Flink version
mvn clean package -DskipTests -Dflink1.18

# Build for Spark 4.0 (requires Java 17)
mvn clean package -DskipTests -Dspark4.0
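Since the Spark 4.0 profile needs Java 17 while every other profile targets Java 11, a guard like the following can catch a mismatched JAVA_HOME before a long build starts. This wrapper script is a sketch, not part of the Hudi repository; the parsing assumes modern (9+) version strings such as `openjdk version "17.0.9"`:

```shell
#!/usr/bin/env sh
# Sketch: fail fast if the active JDK does not match the chosen profile.
# java_major_of reads `java -version`-style output on stdin and prints
# the major version, e.g. 'openjdk version "17.0.9"' -> 17.
java_major_of() {
    awk -F'"' '/version/ {split($2, v, "."); print v[1]; exit}'
}

major=$(java -version 2>&1 | java_major_of)
if [ "${1:-}" = "-Dspark4.0" ] && [ "$major" != "17" ]; then
    echo "Spark 4.0 profile needs Java 17, found Java $major" >&2
    exit 1
fi
mvn clean package -DskipTests "$@"
```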

Code Evidence

Java version defined in pom.xml:97:

<java.version>11</java.version>

Flink version matrix from pom.xml:146-154:

<flink2.1.version>2.1.1</flink2.1.version>
<flink2.0.version>2.0.0</flink2.0.version>
<flink1.20.version>1.20.1</flink1.20.version>
<flink1.19.version>1.19.2</flink1.19.version>
<flink1.18.version>1.18.1</flink1.18.version>
<flink1.17.version>1.17.1</flink1.17.version>
<flink.version>${flink1.20.version}</flink.version>
<hudi.flink.module>hudi-flink1.20.x</hudi.flink.module>
<flink.bundle.version>1.20</flink.bundle.version>

Scala version restrictions from README.md:141-142:

Hudi Flink bundle cannot be built using scala-2.13 profile.

Common Errors

| Error Message | Cause | Solution |
|---------------|-------|----------|
| Maven build failure with Java version mismatch | Using Java 8 or Java 17 for a non-Spark-4.0 build | Install Java 11 and set JAVA_HOME accordingly |
| Spark 4.0 bundle build fails | Building Spark 4.0 with Java 11 | Switch to Java 17: export JAVA_HOME=/path/to/jdk17 |
| Flink bundle Scala 2.13 build fails | Attempting -Dscala-2.13 with Flink modules | Flink bundles only support Scala 2.12; remove -Dscala-2.13 or exclude Flink modules |
| Maven version too old | Maven < 3.6.0 | Upgrade Maven to 3.6.0+ |

Compatibility Notes

  • Scala 2.13: Only supported for Spark-related bundles (hudi-spark-bundle, hudi-utilities-bundle, hudi-utilities-slim-bundle). Flink bundles cannot be built with Scala 2.13.
  • Spark 4.0: Requires Java 17 and Scala 2.13. Cannot be built with Java 11 or Scala 2.12.
  • Flink is Scala-free: Since Flink 1.15.x, no Scala version specification is needed for Flink builds.
  • Cross-version builds: Each Spark/Flink version produces a separate bundle JAR; bundles are not interchangeable across major versions.
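Because bundles are not interchangeable across versions, it helps to know which artifact name to expect after a build. The naming sketch below follows the common Hudi bundle pattern (Spark bundles carry a Scala suffix, Flink bundles do not); the `expected_bundle` helper and the exact patterns are assumptions to confirm against the real files under `packaging/*/target`:

```shell
# Sketch: expected bundle JAR name for a given engine line (illustrative;
# verify against the actual build output for your Hudi version).
expected_bundle() {
    engine="$1"; line="$2"; scala="$3"; hudi="$4"
    case "$engine" in
        spark) echo "hudi-spark${line}-bundle_${scala}-${hudi}.jar" ;;
        flink) echo "hudi-flink${line}-bundle-${hudi}.jar" ;;
        *) return 1 ;;
    esac
}

# Usage: expected_bundle spark 3.5 2.12 1.0.0
```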
