Environment:Datahub project Datahub Java Build
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Build_System |
| Last Updated | 2026-02-09 17:00 GMT |
Overview
Java 17 and Gradle 8.14.3 build environment required for compiling DataHub backend services, the Java SDK, and the Spark lineage agent.
Description
This environment provides the JDK and Gradle build toolchain for all Java-based DataHub modules including GMS (the metadata service), the GraphQL core, the Java SDK (datahub-client), and the Spark lineage agent (acryl-spark-lineage). The build uses Gradle with dependency locking for reproducible builds. Spring Boot 3.x and Spring Framework 6.x modules have a hard dependency on Java 17; other modules default to Java 17 but can be overridden via the -PjdkVersionDefault property.
Usage
Use this environment for building DataHub from source, running Java unit/integration tests, and developing Java SDK clients or Spark lineage agents. It is the mandatory prerequisite for the Datahub_Client_Maven_Dependency, Spark_Submit_Agent_JAR, and RestEmitter_Create implementations.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, Windows | Cross-platform via Gradle wrapper |
| JDK | Java 17 (required) | OpenJDK 17 recommended; Spring 6 and Spring Boot 3.x require Java 17 or later |
| RAM | 8+ GB | Docker engine needs 8 GB for running integration tests |
| Disk | 10+ GB | Gradle cache, compiled artifacts, and Docker images |
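The JDK requirement can be verified before starting a long build. The following is a minimal sketch, not part of the DataHub repository: the class name `JdkCheck` is hypothetical, and it assumes any JDK at feature version 17 or newer is acceptable (change the comparison to equality if your setup must use exactly 17).

```java
// Hypothetical pre-build check; not part of the DataHub repository.
final class JdkCheck {
    // Minimum JDK feature version, taken from the requirements table.
    static final int REQUIRED_FEATURE_VERSION = 17;

    // Assumption: newer JDKs are treated as acceptable.
    static boolean isSupported(int featureVersion) {
        return featureVersion >= REQUIRED_FEATURE_VERSION;
    }

    public static void main(String[] args) {
        // Runtime.version().feature() returns e.g. 17 for JDK 17.0.x.
        int feature = Runtime.version().feature();
        if (isSupported(feature)) {
            System.out.println("JDK " + feature + " satisfies the Java 17 requirement");
        } else {
            System.err.println("JDK " + feature + " is too old; JDK 17 is required");
            System.exit(1);
        }
    }
}
```

Running it with the same `java` binary that `JAVA_HOME` points to shows which JDK the Gradle wrapper is likely to pick up.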
Dependencies
System Packages
- openjdk-17-jdk (or an equivalent JDK 17 distribution)
- git (for the Gradle git properties plugin)
Build Tool
Gradle 8.14.3 (via the Gradle Wrapper; no manual installation needed)
Key Library Versions (managed by Gradle)
- Spring Framework 6.2.11
- Spring Boot 3.4.5
- Pegasus (Rest.li) 29.74.2
- Jackson 2.18.4
- Kafka Client 8.0.0
- Elasticsearch Java Client 2.19.4 (for ES 7.10/OpenSearch)
- Elasticsearch 8 Java Client 8.17.4 (for ES 8.x)
- Neo4j Driver 5.20.0
- JUnit Jupiter 5.6.1
- TestContainers 1.21.1
Credentials
No credentials required for building. For integration tests:
- DATAHUB_GMS_URL: URL of a running GMS instance (for integration tests only)
- DATAHUB_GMS_TOKEN: Auth token (for integration tests only)
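One way an integration test could pick up these two variables is sketched below. `GmsTestConfig` and `fromEnv` are hypothetical names, not DataHub APIs, and two behaviors are assumptions: the token is treated as optional, and tests are skipped when `DATAHUB_GMS_URL` is missing.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; class and method names are illustrative only.
final class GmsTestConfig {
    final String url;
    final Optional<String> token;

    private GmsTestConfig(String url, Optional<String> token) {
        this.url = url;
        this.token = token;
    }

    // Returns empty when DATAHUB_GMS_URL is unset or blank, so callers can
    // skip integration tests instead of failing (an assumed policy).
    static Optional<GmsTestConfig> fromEnv(Map<String, String> env) {
        String url = env.get("DATAHUB_GMS_URL");
        if (url == null || url.isBlank()) {
            return Optional.empty();
        }
        Optional<String> token =
                Optional.ofNullable(env.get("DATAHUB_GMS_TOKEN")).filter(t -> !t.isBlank());
        return Optional.of(new GmsTestConfig(url.trim(), token));
    }

    public static void main(String[] args) {
        Optional<GmsTestConfig> config = fromEnv(System.getenv());
        System.out.println(config.isPresent()
                ? "Integration tests enabled against " + config.get().url
                : "DATAHUB_GMS_URL not set; skipping integration tests");
    }
}
```

Passing the environment in as a `Map` keeps the lookup testable without touching the real process environment.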
Quick Install
# Install JDK 17 (Ubuntu)
sudo apt install openjdk-17-jdk
# Or use mise (recommended by the project)
# mise.toml defines: java=17, node=22, python=3.11
# Build entire project
./gradlew build
# Run all checks
./gradlew check
# Format Java code
./gradlew spotlessApply
Code Evidence
JDK version configuration from build.gradle:5-6:
ext.jdkVersionDefault = 17
ext.javaClassVersionDefault = 11
Spring 6 hard dependency on Java 17 from build.gradle:11-21:
ext.jdkVersion = { p ->
  // If Spring 6 is present, hard dependency on jdk17
  if (p.configurations.any { it.getDependencies().any {
    (it.getGroup().equals("org.springframework") && it.getVersion().startsWith("6."))
      || (it.getGroup().equals("org.springframework.boot") && it.getVersion().startsWith("3."))
  }}) {
    return 17
  } else {
    return p.hasProperty('jdkVersionDefault') ? Integer.valueOf((String) p.getProperty('jdkVersionDefault')) : ext.jdkVersionDefault
  }
}
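The selection rule in the closure above can be restated in plain Java to make the precedence explicit: a Spring 6 or Spring Boot 3 dependency forces 17, otherwise the `jdkVersionDefault` property wins, otherwise the built-in default applies. This is an illustrative sketch only. `JdkVersionResolver` and `Dep` are hypothetical names, the dependency list stands in for Gradle's configurations, and the null check on the version is an addition the Groovy code does not have.

```java
import java.util.List;
import java.util.Optional;

// Illustrative restatement of the build.gradle closure; not DataHub code.
final class JdkVersionResolver {
    // Mirrors ext.jdkVersionDefault = 17 from build.gradle.
    static final int JDK_VERSION_DEFAULT = 17;

    // Hypothetical stand-in for a Gradle dependency (group + version).
    record Dep(String group, String version) {}

    static int resolve(List<Dep> deps, Optional<String> jdkVersionDefaultProperty) {
        boolean needsJdk17 = deps.stream().anyMatch(d ->
                d.version() != null
                && (("org.springframework".equals(d.group()) && d.version().startsWith("6."))
                 || ("org.springframework.boot".equals(d.group()) && d.version().startsWith("3."))));
        if (needsJdk17) {
            return 17; // the property cannot override Spring 6 / Boot 3 modules
        }
        // Otherwise honor -PjdkVersionDefault=<n> if given, else the default.
        return jdkVersionDefaultProperty.map(Integer::valueOf).orElse(JDK_VERSION_DEFAULT);
    }

    public static void main(String[] args) {
        List<Dep> springModule = List.of(new Dep("org.springframework", "6.2.11"));
        List<Dep> plainModule = List.of(new Dep("com.fasterxml.jackson.core", "2.18.4"));
        System.out.println(resolve(springModule, Optional.of("11")));
        System.out.println(resolve(plainModule, Optional.of("11")));
        System.out.println(resolve(plainModule, Optional.empty()));
    }
}
```

The first call shows why passing `-PjdkVersionDefault=11` has no effect on modules that depend on Spring 6 or Spring Boot 3.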
Gradle version from build.gradle:40:
ext.versionGradle = '8.14.3'
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| Unsupported class file major version 61 | Wrong JDK version (need 17) | Install and configure JDK 17: export JAVA_HOME=/path/to/jdk17 |
| Git properties generation failure with worktrees | generateGitPropertiesGlobal task fails in git worktrees | Add -x generateGitPropertiesGlobal to Gradle commands |
| Dependency lock file mismatch | Someone updated a dependency without regenerating lock files | Run ./gradlew dependencies --write-locks |
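When diagnosing class file version errors, it helps to know that major version 61 corresponds to Java 17 and major version 55 to Java 11 (the release number is the major version minus 44). The sketch below reads the version from a compiled `.class` file's header. `ClassFileVersion` is a hypothetical utility, not part of DataHub.

```java
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical diagnostic utility; not part of the DataHub repository.
final class ClassFileVersion {
    // Class file layout: 4-byte magic 0xCAFEBABE, 2-byte minor, 2-byte major.
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("Too short to be a class file");
        }
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Missing 0xCAFEBABE magic number");
        }
        buf.getShort(); // skip minor version
        return buf.getShort() & 0xFFFF;
    }

    // Maps a class file major version to its Java release (61 -> 17, 55 -> 11).
    static int javaRelease(int majorVersion) {
        return majorVersion - 44;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java ClassFileVersion.java path/to/Some.class
        byte[] bytes = Files.readAllBytes(Path.of(args[0]));
        int major = majorVersion(bytes);
        System.out.println(args[0] + ": major version " + major
                + " (Java " + javaRelease(major) + ")");
    }
}
```

Pointing it at a class from a build output directory shows which bytecode level a module was actually compiled to.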
Compatibility Notes
- Java 11 bytecode: Although JDK 17 is required for compilation, most modules target Java 11 class file format for broader runtime compatibility. Spring 6 modules target Java 17 bytecode.
- Gradle Wrapper: Always use ./gradlew (not a system-installed Gradle) to ensure version consistency.
- Dependency Locking: The project uses Gradle dependency locking (gradle.lockfile per project) for reproducible builds.
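To see what a lock file pins, its entries can be read into a map of module to version. The sketch below assumes the usual Gradle lock file layout: `#` comment lines, entries of the form `group:name:version=configurations`, and an `empty=` line listing configurations with no dependencies. `LockfileVersions` is a hypothetical helper, and the format details should be checked against the Gradle version in use.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for inspecting a gradle.lockfile; not DataHub code.
final class LockfileVersions {
    // Returns "group:name" -> "version" for every locked module.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> versions = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blanks and header comments
            }
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // not a lock entry
            }
            String coordinate = line.substring(0, eq);
            int lastColon = coordinate.lastIndexOf(':');
            if (lastColon < 0) {
                continue; // e.g. the "empty=..." line has no coordinate
            }
            versions.put(coordinate.substring(0, lastColon),
                    coordinate.substring(lastColon + 1));
        }
        return versions;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java LockfileVersions.java path/to/gradle.lockfile
        parse(Files.readAllLines(Path.of(args[0])))
                .forEach((module, version) -> System.out.println(module + " -> " + version));
    }
}
```

Comparing the maps from two revisions of the same lock file is one way to spot which dependency changed without its lock being regenerated.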
Related Pages
- Implementation:Datahub_project_Datahub_Datahub_Client_Maven_Dependency
- Implementation:Datahub_project_Datahub_RestEmitter_Create
- Implementation:Datahub_project_Datahub_Spark_Submit_Agent_JAR
- Implementation:Datahub_project_Datahub_DatahubSparkListener_Lifecycle
- Implementation:Datahub_project_Datahub_DatahubEventEmitter_Emit
- Implementation:Datahub_project_Datahub_SparkConfigParser_Configuration
- Implementation:Datahub_project_Datahub_OpenLineageToDataHub_ConvertRunEvent
- Implementation:Datahub_project_Datahub_EntityClient_Upsert
- Implementation:Datahub_project_Datahub_EntityClient_Get
- Implementation:Datahub_project_Datahub_Entity_Mutable_Patch
- Implementation:Datahub_project_Datahub_DataHubClientV2_Builder
- Implementation:Datahub_project_Datahub_DataHubClientV2_Close
- Implementation:Datahub_project_Datahub_Dataset_Builder
- Implementation:Datahub_project_Datahub_MetadataChangeProposalWrapper_Builder
- Implementation:Datahub_project_Datahub_Emitter_Emit
- Implementation:Datahub_project_Datahub_Emitter_Close
- Implementation:Datahub_project_Datahub_Proto2DataHub_Main
- Implementation:Datahub_project_Datahub_Proto2DataHub_RestEmitter_Emit
- Implementation:Datahub_project_Datahub_ProtobufDataset_Builder
- Implementation:Datahub_project_Datahub_Meta_Proto_Custom_Options
- Implementation:Datahub_project_Datahub_Protoc_Descriptor_Set_Out