
Environment:Apache Spark Release Build Environment

From Leeroopedia


Knowledge Sources

Domains: Infrastructure, Release_Engineering
Last Updated: 2026-02-08 22:00 GMT

Overview

Docker-based isolated release environment with GPG signing, ASF Nexus credentials, PyPI API token, and JDK 17 for building official Apache Spark release candidates.

Description

This environment provides the fully isolated, Docker-containerized context used for creating Apache Spark release candidates. The release process runs inside a purpose-built Docker image (spark-rm) that contains JDK 17, Python 3.10, Ruby (for documentation generation), and all build tools. The Docker container receives credentials via a secure environment file, and GPG keys are exported into the container for artifact signing. The release manager must be an ASF PMC member with appropriate credentials for the finalization step.
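As a rough sketch of the credential-injection pattern described above (file names, the image tag, and the helper function are illustrative, not the actual release script):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper mirroring the pattern described above: create the
# credential file with owner-only permissions *before* any secret is written.
make_env_file() {
  local envfile="$1"
  rm -f "$envfile"
  touch "$envfile"
  chmod 600 "$envfile"
  cat > "$envfile" <<EOF
ASF_USERNAME=${ASF_USERNAME:-}
GPG_KEY=${GPG_KEY:-}
DRY_RUN=${DRY_RUN:-1}
EOF
}

WORKDIR="${WORKDIR:-/tmp/spark-release-demo}"   # illustrative path
mkdir -p "$WORKDIR"
make_env_file "$WORKDIR/env.list"

# The container would then receive the secrets via --env-file rather than
# having them baked into the image (image tag is illustrative):
if command -v docker >/dev/null 2>&1; then
  echo "would run: docker run --rm --env-file $WORKDIR/env.list spark-rm:latest"
fi
```

The secret file is created empty with mode 600 first, so there is no window in which the credentials are world-readable.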

Usage

Use this environment for all Release Process workflows, including tagging releases, building source and binary tarballs, generating documentation, publishing to Maven Central/Nexus, and finalizing releases. It is the mandatory prerequisite for running the Do_Release_Docker, Release_Tag, Release_Build_Package, Build_Api_Docs, Release_Build_Publish, and Release_Build_Finalize implementations.

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| OS | Linux or macOS | Docker host machine |
| Docker | Docker with build support | Required for the spark-rm image |
| Disk | 50 GB+ free space | Source + multiple binary distributions + Maven artifacts |
| Network | Internet access | Maven Central, PyPI, ASF SVN uploads |
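A preflight disk check along these lines can catch the 50 GB requirement before a multi-hour build starts (the function name and threshold handling are illustrative, not part of the release scripts):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical preflight: verify the work directory has at least N GB free.
require_free_gb() {
  local path="$1" need_gb="$2"
  # df -Pk prints available space in 1K blocks (POSIX format, no wrapping).
  local avail_kb avail_gb
  avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
  avail_gb=$(( avail_kb / 1024 / 1024 ))
  if [ "$avail_gb" -lt "$need_gb" ]; then
    echo "Need ${need_gb}GB free at $path, only ${avail_gb}GB available" >&2
    return 1
  fi
  echo "OK: ${avail_gb}GB free at $path"
}

require_free_gb "${WORKDIR:-/tmp}" 1   # use 50 for a real release
```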

Dependencies

Host Machine

  • Docker (with docker build and docker run)
  • GPG (for key export to container)
  • Git (for source checkout)

Inside Docker Container (spark-rm image)

  • JDK 17 (OpenJDK)
  • Python 3.10
  • Maven 3.9.12
  • Ruby + Bundler (for Jekyll documentation)
  • R 4.x + SparkR dependencies (for R package)
  • GPG (for artifact signing)
  • SVN client (for ASF dist uploads)
  • lsof (SPARK-22377 workaround)

Credentials

The following environment variables must be set for the release process:

  • ASF_USERNAME: Apache Software Foundation LDAP username
  • ASF_PASSWORD: ASF LDAP password (for SVN operations)
  • ASF_NEXUS_TOKEN: Nexus staging repository token
  • GPG_KEY: GPG key ID for signing artifacts
  • GPG_PASSPHRASE: Passphrase for the GPG key
  • PYPI_API_TOKEN: PyPI API token (required only for finalization step)
  • GIT_NAME: Git committer name
  • GIT_EMAIL: Git committer email

WARNING: Never store actual credential values in code or documentation. These are injected via a secure environment file (chmod 600) that is cleaned up after use.
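When a credential is not pre-set in the environment, it can be collected interactively without echoing, mirroring the `stty -echo` pattern the release scripts use for the PyPI token (the helper function here is a hypothetical sketch):

```shell
#!/usr/bin/env bash

# Read a secret into the named variable without echoing it to the terminal,
# so it never appears in shell history or on screen.
read_secret() {
  local prompt="$1" var="$2" value
  if [ -t 0 ]; then
    # Interactive terminal: disable echo while the secret is typed.
    stty -echo
    printf '%s: ' "$prompt"
    read -r value
    printf '\n'
    stty echo
  else
    # Non-interactive (pipes, tests): just read a line from stdin.
    read -r value
  fi
  printf -v "$var" '%s' "$value"
}

# Example usage (interactive):
# read_secret "PyPI API token" PYPI_API_TOKEN
```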

Quick Install

# Run the release process (from dev/create-release/)
./do-release-docker.sh -d /path/to/workdir

# Dry run (no uploads)
./do-release-docker.sh -d /path/to/workdir -n

# Single step execution
./do-release-docker.sh -d /path/to/workdir -s tag
./do-release-docker.sh -d /path/to/workdir -s build
./do-release-docker.sh -d /path/to/workdir -s docs
./do-release-docker.sh -d /path/to/workdir -s publish
./do-release-docker.sh -d /path/to/workdir -s finalize

Code Evidence

Secure environment file creation from `dev/create-release/do-release-docker.sh:119-121`:

GPG_KEY_FILE="$WORKDIR/gpg.key"
fcreate_secure "$GPG_KEY_FILE"
$GPG --export-secret-key --armor --pinentry-mode loopback --passphrase "$GPG_PASSPHRASE" "$GPG_KEY" > "$GPG_KEY_FILE"

Credential injection via env file from `do-release-docker.sh:143-167`:

cat > $ENVFILE <<EOF
DRY_RUN=$DRY_RUN
ASF_USERNAME=$ASF_USERNAME
ASF_NEXUS_TOKEN=$ASF_NEXUS_TOKEN
GPG_KEY=$GPG_KEY
ASF_PASSWORD=$ASF_PASSWORD
PYPI_API_TOKEN=$PYPI_API_TOKEN
GPG_PASSPHRASE=$GPG_PASSPHRASE
EOF

Secure file creation utility from `dev/create-release/release-util.sh:75-80`:

function fcreate_secure {
  local FPATH="$1"
  rm -f "$FPATH"
  touch "$FPATH"
  chmod 600 "$FPATH"
}
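As a usage sketch, the utility above guarantees the file exists with mode 0600 before any secret is written into it (the temp path below is illustrative):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Reproduced from release-util.sh above.
function fcreate_secure {
  local FPATH="$1"
  rm -f "$FPATH"
  touch "$FPATH"
  chmod 600 "$FPATH"
}

tmp=$(mktemp -d)
fcreate_secure "$tmp/gpg.key"
ls -l "$tmp/gpg.key"   # -rw------- : owner-only before any secret lands
```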

Java version validation for Spark 4.x from `dev/create-release/release-build.sh:622-626`:

elif [[ $JAVA_VERSION < "17.0." ]] && [[ $SPARK_VERSION > "3.5.99" ]]; then
  echo "Java version $JAVA_VERSION is less than required 17 for 4.0+"
  echo "Please set JAVA_HOME correctly."
  exit 1
fi
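Note that `[[ a < b ]]` is a lexicographic string comparison, not a numeric one; it happens to give the right answer for the version strings JDKs actually report (`1.8.0`, `11.0.x`, `17.0.x`). A small illustration (the `lt` helper is ours, not Spark's):

```shell
#!/usr/bin/env bash

# [[ "$1" < "$2" ]] compares strings lexicographically, not numerically.
lt() { [[ "$1" < "$2" ]] && echo yes || echo no; }

lt "11.0.2" "17.0."   # yes -> JDK 11 is correctly rejected
lt "17.0.1" "17.0."   # no  -> JDK 17 passes
lt "1.8.0"  "17.0."   # yes -> JDK 8 (reported as 1.x) is rejected
```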

PMC member finalization gate from `do-release-docker.sh:84-93`:

if [ ! -z "$RELEASE_STEP" ] && [ "$RELEASE_STEP" = "finalize" ]; then
  echo "THIS STEP IS IRREVERSIBLE! Make sure the vote has passed and you pick the right RC to finalize."
  # ... prompt for PMC confirmation
  if [ -z "$PYPI_API_TOKEN" ]; then
    stty -echo && printf "PyPi API token: " && read PYPI_API_TOKEN && printf '\n' && stty echo
  fi
fi

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `Work directory (-d) must be defined and exist` | Missing `-d` argument | Provide a valid working directory path |
| `Java version X is less than required 17 for 4.0+` | Wrong JDK in container | Use the `-j` flag to mount the correct JDK, or rebuild the spark-rm image |
| GPG signing failures | Incorrect `GPG_KEY` or `GPG_PASSPHRASE` | Verify the GPG key ID and passphrase before starting the release |
| Nexus staging upload failure | Invalid `ASF_NEXUS_TOKEN` | Refresh the token from repository.apache.org |
| `Command FAILED. Check full logs for details.` | A build step failed | Check the specific log file shown in the output |

Compatibility Notes

  • Docker Image: The spark-rm image is built in two layers: a base image with common tools and a branch-specific image with Java/Python versions.
  • Finalization is irreversible: The finalize step promotes artifacts and cannot be undone.
  • Dry Run Mode: Use `-n` flag to perform all builds locally without uploading to ASF infrastructure.
  • GitHub Actions: When running in CI, the `-ti` Docker flag is automatically omitted.
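The CI note above can be sketched as a small flag selector: `GITHUB_ACTIONS=true` is the environment variable GitHub Actions sets in its runners, and the function name is our own illustration:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Emit the interactive Docker flags only when a real terminal is expected;
# GitHub Actions runners have no TTY, so -ti must be dropped there.
docker_flags() {
  if [ "${GITHUB_ACTIONS:-}" = "true" ]; then
    echo ""          # CI: no TTY available
  else
    echo "-ti"       # local run: interactive terminal
  fi
}

echo "docker run $(docker_flags) --rm spark-rm:latest ..."
```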
