Environment: Apache Spark Release Build Environment
| Knowledge Sources | Value |
|---|---|
| Domains | Infrastructure, Release_Engineering |
| Last Updated | 2026-02-08 22:00 GMT |
Overview
Docker-based isolated release environment with GPG signing, ASF Nexus credentials, PyPI API token, and JDK 17 for building official Apache Spark release candidates.
Description
This environment provides the fully isolated, Docker-containerized context used for creating Apache Spark release candidates. The release process runs inside a purpose-built Docker image (spark-rm) that contains JDK 17, Python 3.10, Ruby (for documentation generation), and all build tools. The Docker container receives credentials via a secure environment file, and GPG keys are exported into the container for artifact signing. The release manager must be an ASF PMC member with appropriate credentials for the finalization step.
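The container launch described above can be sketched roughly as follows. This is an illustrative helper, not a verbatim excerpt of `do-release-docker.sh`; the `/opt/spark-rm/output` mount path and exact flags are assumptions.

```shell
# Illustrative sketch of how the release container is launched: credentials
# travel in via --env-file, and the host work directory is mounted so build
# artifacts survive the container. Mount path and flags are assumptions.
build_docker_cmd() {
  local envfile="$1" workdir="$2"
  printf 'docker run -ti --env-file %s -v %s:/opt/spark-rm/output spark-rm\n' \
    "$envfile" "$workdir"
}
```

Per the Compatibility Notes below, a CI run would drop the `-ti` flag.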
Usage
Use this environment for all Release Process workflows, including tagging releases, building source and binary tarballs, generating documentation, publishing to Maven Central/Nexus, and finalizing releases. It is the mandatory prerequisite for running the Do_Release_Docker, Release_Tag, Release_Build_Package, Build_Api_Docs, Release_Build_Publish, and Release_Build_Finalize implementations.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux or macOS | Docker host machine |
| Docker | Docker with build support | Required for spark-rm image |
| Disk | 50GB+ free space | Source + multiple binary distributions + Maven artifacts |
| Network | Internet access | For Maven Central, PyPI, ASF SVN uploads |
Dependencies
Host Machine
- Docker (with docker build and docker run)
- GPG (for key export to container)
- Git (for source checkout)
Inside Docker Container (spark-rm image)
- JDK 17 (OpenJDK)
- Python 3.10
- Maven 3.9.12
- Ruby + Bundler (for Jekyll documentation)
- R 4.x + SparkR dependencies (for R package)
- GPG (for artifact signing)
- SVN client (for ASF dist uploads)
- lsof (SPARK-22377 workaround)
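The host-machine dependencies above can be verified with a small preflight helper. This is a sketch, not part of the official release scripts:

```shell
# Hedged preflight sketch (not from the Spark repo): report any missing
# command-line tools before starting a release run.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: verify the host prerequisites listed above.
# check_tools docker gpg git || exit 1
```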
Credentials
The following environment variables must be set for the release process:
- ASF_USERNAME: Apache Software Foundation LDAP username
- ASF_PASSWORD: ASF LDAP password (for SVN operations)
- ASF_NEXUS_TOKEN: Nexus staging repository token
- GPG_KEY: GPG key ID for signing artifacts
- GPG_PASSPHRASE: Passphrase for the GPG key
- PYPI_API_TOKEN: PyPI API token (required only for finalization step)
- GIT_NAME: Git committer name
- GIT_EMAIL: Git committer email
WARNING: Never store actual credential values in code or documentation. These are injected via a secure environment file (chmod 600) that is cleaned up after use.
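A fail-fast check for these variables might look like the following sketch; `require_env` is an assumed helper name, not part of the official scripts:

```shell
# Hedged sketch: abort early if any required credential variable is unset
# or empty. Uses eval for POSIX-compatible indirect expansion.
require_env() {
  local missing=0 name value
  for name in "$@"; do
    value="$(eval "printf '%s' \"\${$name:-}\"")"
    if [ -z "$value" ]; then
      echo "ERROR: $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: the variables needed before the publish step.
# require_env ASF_USERNAME ASF_PASSWORD ASF_NEXUS_TOKEN GPG_KEY GPG_PASSPHRASE
```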
Quick Install
# Run the release process (from dev/create-release/)
./do-release-docker.sh -d /path/to/workdir
# Dry run (no uploads)
./do-release-docker.sh -d /path/to/workdir -n
# Single step execution
./do-release-docker.sh -d /path/to/workdir -s tag
./do-release-docker.sh -d /path/to/workdir -s build
./do-release-docker.sh -d /path/to/workdir -s docs
./do-release-docker.sh -d /path/to/workdir -s publish
./do-release-docker.sh -d /path/to/workdir -s finalize
Code Evidence
Secure environment file creation from `dev/create-release/do-release-docker.sh:119-121`:
GPG_KEY_FILE="$WORKDIR/gpg.key"
fcreate_secure "$GPG_KEY_FILE"
$GPG --export-secret-key --armor --pinentry-mode loopback --passphrase "$GPG_PASSPHRASE" "$GPG_KEY" > "$GPG_KEY_FILE"
Credential injection via env file from `do-release-docker.sh:143-167`:
cat > $ENVFILE <<EOF
DRY_RUN=$DRY_RUN
ASF_USERNAME=$ASF_USERNAME
ASF_NEXUS_TOKEN=$ASF_NEXUS_TOKEN
GPG_KEY=$GPG_KEY
ASF_PASSWORD=$ASF_PASSWORD
PYPI_API_TOKEN=$PYPI_API_TOKEN
GPG_PASSPHRASE=$GPG_PASSPHRASE
EOF
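The lifecycle around that env file can be sketched as below; the `trap`-based cleanup is an illustrative pattern rather than the script's exact code, and the values are placeholders, never real credentials:

```shell
# Hedged sketch of the env-file lifecycle: restrict permissions before any
# secret is written, and guarantee removal on exit. Values are placeholders.
ENVFILE="$(mktemp)"
chmod 600 "$ENVFILE"
trap 'rm -f "$ENVFILE"' EXIT

cat > "$ENVFILE" <<EOF
DRY_RUN=1
ASF_USERNAME=placeholder
EOF
```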
Secure file creation utility from `dev/create-release/release-util.sh:75-80`:
function fcreate_secure {
  local FPATH="$1"
  rm -f "$FPATH"
  touch "$FPATH"
  chmod 600 "$FPATH"
}
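A quick way to confirm the helper behaves as intended (the function is reproduced from `release-util.sh` above so the snippet runs standalone):

```shell
# fcreate_secure, reproduced from release-util.sh so this snippet is
# self-contained: recreate the file empty with owner-only permissions.
fcreate_secure() {
  local FPATH="$1"
  rm -f "$FPATH"
  touch "$FPATH"
  chmod 600 "$FPATH"
}

SECRET_FILE="$(mktemp -u)"
fcreate_secure "$SECRET_FILE"
# On Linux (GNU stat), %a prints the octal mode; expect 600.
stat -c %a "$SECRET_FILE"
```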
Java version validation for Spark 4.x from `dev/create-release/release-build.sh:622-626`:
elif [[ $JAVA_VERSION < "17.0." ]] && [[ $SPARK_VERSION > "3.5.99" ]]; then
  echo "Java version $JAVA_VERSION is less than required 17 for 4.0+"
  echo "Please set JAVA_HOME correctly."
  exit 1
fi
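That gate relies on bash's lexicographic `[[ < ]]` string comparison. A standalone sketch (the function name `java_gate` is ours, not the script's):

```shell
# Hedged sketch of the version gate: bash [[ < ]] and [[ > ]] compare
# strings lexicographically, so "11.0.2" sorts before "17.0." while
# "17.0.10" does not, and any SPARK_VERSION above "3.5.99" counts as 4.0+.
java_gate() {
  local JAVA_VERSION="$1" SPARK_VERSION="$2"
  if [[ $JAVA_VERSION < "17.0." ]] && [[ $SPARK_VERSION > "3.5.99" ]]; then
    echo "Java version $JAVA_VERSION is less than required 17 for 4.0+" >&2
    return 1
  fi
  return 0
}
```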
PMC member finalization gate from `do-release-docker.sh:84-93`:
if [ ! -z "$RELEASE_STEP" ] && [ "$RELEASE_STEP" = "finalize" ]; then
  echo "THIS STEP IS IRREVERSIBLE! Make sure the vote has passed and you pick the right RC to finalize."
  # ... prompt for PMC confirmation
  if [ -z "$PYPI_API_TOKEN" ]; then
    stty -echo && printf "PyPi API token: " && read PYPI_API_TOKEN && printf '\n' && stty echo
  fi
fi
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `Work directory (-d) must be defined and exist` | Missing -d argument | Provide a valid working directory path |
| `Java version X is less than required 17 for 4.0+` | Wrong JDK in container | Use -j flag to mount correct JDK, or rebuild spark-rm image |
| GPG signing failures | Incorrect GPG_KEY or GPG_PASSPHRASE | Verify GPG key ID and passphrase before starting release |
| Nexus staging upload failure | Invalid ASF_NEXUS_TOKEN | Refresh token from repository.apache.org |
| `Command FAILED. Check full logs for details.` | Build step failed | Check the specific log file shown in output |
Compatibility Notes
- Docker Image: The spark-rm image is built in two layers: a base image with common tools and a branch-specific image with Java/Python versions.
- Finalization is irreversible: The finalize step promotes artifacts and cannot be undone.
- Dry Run Mode: Use `-n` flag to perform all builds locally without uploading to ASF infrastructure.
- GitHub Actions: When running in CI, the `-ti` Docker flag is automatically omitted.