Heuristic:Datahub project Datahub Gradle Formatting Over Direct Tools

From Leeroopedia




Knowledge Sources
Domains Development, Build_System, Code_Quality
Last Updated 2026-02-10 00:00 GMT

Overview

To keep local results consistent with CI, always format and lint through Gradle wrapper tasks rather than invoking npm, yarn, ruff, mypy, or prettier directly.

Description

DataHub enforces code quality through Gradle-orchestrated formatting and linting tasks that wrap the underlying tools (Prettier, Spotless, ruff, mypy) with the project's specific configuration. Running these tools directly (e.g., `npx prettier`, `ruff check`, `mypy`) bypasses the Gradle configuration, may pick up a different tool version, and can produce results that do not match the CI checks. This is a recurring source of "works locally, fails in CI" issues.

Usage

Use this heuristic every time you format or lint code in the DataHub repository. This applies to all languages: Java (Spotless), Python (ruff, mypy), Markdown (Prettier), GraphQL (Prettier), and GitHub Actions YAML (Prettier).

The Insight (Rule of Thumb)

  • Action: Always invoke formatting/linting via `./gradlew` commands, never via direct tool invocation.
  • Value:
    • Python: `./gradlew :metadata-ingestion:lintFix` (NOT `ruff` or `mypy` directly)
    • Java: `./gradlew spotlessApply` (NOT manual IDE formatting)
    • Markdown: `./gradlew :datahub-web-react:mdPrettierWrite` (NOT `npx prettier`)
    • GraphQL: `./gradlew :datahub-web-react:graphqlPrettierWrite`
    • GitHub Actions: `./gradlew :datahub-web-react:githubActionsPrettierWrite`
    • All at once: `./gradlew format` or `./gradlew formatChanged`
  • Trade-off: Gradle tasks are slower to start (JVM warmup) but guarantee CI-identical results.
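
The mapping above can be sketched as a small shell helper. This is a hypothetical convenience function, not something the DataHub repository provides: the name `gradle_task_for` and the path patterns (for example, treating any `.py` file under `metadata-ingestion/` as belonging to that module) are assumptions made for illustration. It only prints a task name from the list above and never runs Gradle itself.

```shell
# Hypothetical helper: print the Gradle task (from the list above) that
# covers a given file path. The path patterns are illustrative assumptions.
gradle_task_for() {
  case "$1" in
    .github/workflows/*.yml|.github/workflows/*.yaml)
      echo ":datahub-web-react:githubActionsPrettierWrite" ;;
    metadata-ingestion/*.py)
      echo ":metadata-ingestion:lintFix" ;;
    *.java)
      echo "spotlessApply" ;;
    *.md)
      echo ":datahub-web-react:mdPrettierWrite" ;;
    *.graphql)
      echo ":datahub-web-react:graphqlPrettierWrite" ;;
    *)
      # Anything unrecognized falls back to the catch-all task.
      echo "format" ;;
  esac
}
```

Under these assumptions, a call such as `./gradlew "$(gradle_task_for docs/quickstart.md)"` would run the Markdown task. When in doubt, `./gradlew format` remains the simplest choice.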

Reasoning

The Gradle tasks ensure:

  1. Consistent configuration: Gradle tasks use the project's Prettier config, ruff config (pyproject.toml), and Spotless settings.
  2. Correct tool versions: Gradle resolves exact tool versions from dependency declarations, avoiding version drift.
  3. Pre-commit hook alignment: Gradle tasks run the same checks that CI runs, so a change that passes locally does not later fail in CI.
  4. Cross-platform reliability: Works identically across macOS, Linux, and CI runners.

The risks of direct invocation include using a globally installed tool with a different version, missing project-specific config files, and applying formatting rules that differ from what CI expects.
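
One way to catch direct invocations before they run is a small guard in a wrapper script or agent hook. The sketch below is hypothetical: the function name `is_direct_tool_call` is an assumption, and it only inspects the first word of a command line, so it would miss tools invoked through another launcher (for example `python -m mypy`).

```shell
# Hypothetical guard: return 0 when the command line starts with one of the
# tools this page says not to call directly, and 1 otherwise.
is_direct_tool_call() {
  # Keep only the first word of the command line (no word splitting or
  # globbing is performed on the argument).
  first_word=${1%% *}
  case "$first_word" in
    npm|npx|yarn|prettier|ruff|mypy) return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller could use it as `if is_direct_tool_call "$cmd"; then echo "use ./gradlew instead" >&2; fi`.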

Code Evidence

From `CLAUDE.md` (project root):

CRITICAL: Always use Gradle tasks for formatting and linting. Never use npm/yarn/npx commands directly.

❌ NEVER do this:
npx prettier --write "docs/**/*.md"    # WRONG - bypasses Gradle
yarn prettier --write                   # WRONG - bypasses Gradle

✅ ALWAYS do this:
./gradlew :datahub-web-react:mdPrettierWrite      # CORRECT - uses Gradle
./gradlew format                                   # CORRECT - formats everything

Python linting guidance from `CLAUDE.md`:

IMPORTANT: Verifying Python code changes:
- ALWAYS use ./gradlew :metadata-ingestion:lintFix to verify Python code changes
- NEVER use python3 -m py_compile - it doesn't catch style issues or type errors
- NEVER use ruff or mypy commands directly - use the Gradle task instead
