Heuristic:Datahub project Datahub Gradle Formatting Over Direct Tools
| Knowledge Sources | |
|---|---|
| Domains | Development, Build_System, Code_Quality |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Always use Gradle wrapper tasks for code formatting and linting, rather than invoking npm, yarn, ruff, mypy, or Prettier directly, so that local results match CI.
Description
DataHub enforces code quality through Gradle-orchestrated formatting and linting tasks that wrap the underlying tools (Prettier, Spotless, ruff, mypy) with the project's specific configuration. Running these tools directly (e.g., `npx prettier`, `ruff check`, `mypy`) bypasses the Gradle configuration, may pick up different tool versions, and can produce results that do not match CI checks. This is a recurring source of "works locally, fails in CI" issues.
Usage
Use this heuristic every time you format or lint code in the DataHub repository. This applies to all languages: Java (Spotless), Python (ruff, mypy), Markdown (Prettier), GraphQL (Prettier), and GitHub Actions YAML (Prettier).
The Insight (Rule of Thumb)
- Action: Always invoke formatting/linting via `./gradlew` commands, never via direct tool invocation.
- Value:
  - Python: `./gradlew :metadata-ingestion:lintFix` (NOT `ruff` or `mypy` directly)
  - Java: `./gradlew spotlessApply` (NOT manual IDE formatting)
  - Markdown: `./gradlew :datahub-web-react:mdPrettierWrite` (NOT `npx prettier`)
  - GraphQL: `./gradlew :datahub-web-react:graphqlPrettierWrite`
  - GitHub Actions: `./gradlew :datahub-web-react:githubActionsPrettierWrite`
  - All at once: `./gradlew format` or `./gradlew formatChanged`
- Trade-off: Gradle tasks are slower to start (JVM warmup) but guarantee CI-identical results.
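The per-language commands in the list above can be sketched as a small lookup helper. This is only an illustration: the function name `gradle_format_cmd` and the path patterns used to pick a task are assumptions, not part of the DataHub repository; only the Gradle task names come from the list.

```shell
# Hypothetical helper: print the Gradle command to use for a given file path.
# The path-to-task mapping is an assumption made for illustration; the task
# names are the ones listed in the rule of thumb above.
gradle_format_cmd() {
  case "$1" in
    # GitHub Actions YAML lives under .github/ (case patterns let * match "/").
    .github/*.yml|.github/*.yaml)
      echo "./gradlew :datahub-web-react:githubActionsPrettierWrite" ;;
    # Python sources inside the metadata-ingestion module.
    metadata-ingestion/*.py)
      echo "./gradlew :metadata-ingestion:lintFix" ;;
    *.java)
      echo "./gradlew spotlessApply" ;;
    *.md)
      echo "./gradlew :datahub-web-react:mdPrettierWrite" ;;
    *.graphql)
      echo "./gradlew :datahub-web-react:graphqlPrettierWrite" ;;
    # Anything else: fall back to the catch-all formatting task.
    *)
      echo "./gradlew format" ;;
  esac
}
```

For example, `gradle_format_cmd docs/intro.md` prints the Markdown task. The helper only prints the command rather than running it, so it can be reviewed before anything is executed.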
Reasoning
The Gradle tasks ensure:
- Consistent configuration: Gradle tasks use the project's Prettier config, ruff config (pyproject.toml), and Spotless settings.
- Correct tool versions: Gradle resolves exact tool versions from dependency declarations, avoiding version drift.
- Pre-commit hook alignment: Gradle tasks run exactly what CI runs, so checks that pass locally do not fail later in CI.
- Cross-platform reliability: Works identically across macOS, Linux, and CI runners.
Direct invocation risks include using a globally installed tool with a different version, missing project-specific config files, and applying formatting rules that differ from CI expectations.
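One way to catch direct invocations before they run is a small guard around command strings. The sketch below is hypothetical: the function name `is_direct_tool_invocation` and the list of flagged tools are assumptions drawn from the tools named in this page, not an existing DataHub script.

```shell
# Hypothetical guard: exit status 0 when a command line calls a formatter or
# linter directly, non-zero when it does not (e.g. it goes through ./gradlew).
is_direct_tool_invocation() {
  # Keep only the first word of the command line.
  first_word="${1%% *}"
  case "$first_word" in
    npm|npx|yarn|prettier|ruff|mypy) return 0 ;;
    *) return 1 ;;
  esac
}
```

A wrapper script could call this and print the Gradle equivalent instead of executing the flagged command. The check looks only at the first word, so it is a coarse filter rather than a complete parser.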
Code Evidence
From `CLAUDE.md` (project root):
CRITICAL: Always use Gradle tasks for formatting and linting. Never use npm/yarn/npx commands directly.
❌ NEVER do this:
npx prettier --write "docs/**/*.md" # WRONG - bypasses Gradle
yarn prettier --write # WRONG - bypasses Gradle
✅ ALWAYS do this:
./gradlew :datahub-web-react:mdPrettierWrite # CORRECT - uses Gradle
./gradlew format # CORRECT - formats everything
Python linting guidance from `CLAUDE.md`:
IMPORTANT: Verifying Python code changes:
- ALWAYS use ./gradlew :metadata-ingestion:lintFix to verify Python code changes
- NEVER use python3 -m py_compile - it doesn't catch style issues or type errors
- NEVER use ruff or mypy commands directly - use the Gradle task instead