
Heuristic:Datahub project Datahub Validation Across All APIs

From Leeroopedia



Knowledge Sources
Domains: Architecture, Validation, Backend
Last Updated: 2026-02-10 00:00 GMT

Overview

Architecture rule: all metadata validation must be implemented as `AspectPayloadValidator` classes in the `metadata-io` layer, never in API-specific layers such as GraphQL resolvers or REST controllers.

Description

DataHub exposes metadata modification through multiple API surfaces: GraphQL, OpenAPI (REST), and RestLI. A common mistake is to add validation logic to a single API layer (e.g., a GraphQL resolver), which leaves the other APIs unprotected. The correct approach is to implement validation as `AspectPayloadValidator` classes in the `metadata-io` module, the shared layer that all APIs funnel through, so that validation runs no matter which API surface a request arrives on.
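The shape of this arrangement can be sketched in plain Java. This is a minimal illustration under stated assumptions, not DataHub's code: `AspectWrite`, `PayloadValidatorSketch`, `SharedAspectStore`, and both adapter classes are hypothetical stand-ins, and the real `AspectPayloadValidator` contract in `metadata-io` is richer than the single method shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for an aspect write; not a DataHub type.
final class AspectWrite {
    final String entityUrn;
    final String aspectName;
    final String value;

    AspectWrite(String entityUrn, String aspectName, String value) {
        this.entityUrn = entityUrn;
        this.aspectName = aspectName;
        this.value = value;
    }
}

// Hypothetical, simplified contract; the real AspectPayloadValidator differs.
interface PayloadValidatorSketch {
    // Returns an error message if the write is invalid, otherwise empty.
    Optional<String> validate(AspectWrite write);
}

// Example rule: a "description" aspect must not be blank.
final class NonBlankDescriptionValidator implements PayloadValidatorSketch {
    @Override
    public Optional<String> validate(AspectWrite write) {
        boolean isDescription = "description".equals(write.aspectName);
        if (isDescription && write.value.trim().isEmpty()) {
            return Optional.of("description must not be blank");
        }
        return Optional.empty();
    }
}

// Shared write path: every API adapter calls ingest(), so the rule runs here once.
final class SharedAspectStore {
    private final List<PayloadValidatorSketch> validators;
    private final List<AspectWrite> stored = new ArrayList<>();

    SharedAspectStore(List<PayloadValidatorSketch> validators) {
        this.validators = validators;
    }

    void ingest(AspectWrite write) {
        for (PayloadValidatorSketch validator : validators) {
            Optional<String> error = validator.validate(write);
            if (error.isPresent()) {
                throw new IllegalArgumentException(error.get());
            }
        }
        stored.add(write);
    }

    int size() {
        return stored.size();
    }
}

// Thin API adapters: they translate the request and delegate, nothing more.
final class GraphQlResolverSketch {
    private final SharedAspectStore store;

    GraphQlResolverSketch(SharedAspectStore store) {
        this.store = store;
    }

    void updateDescription(String urn, String text) {
        store.ingest(new AspectWrite(urn, "description", text));
    }
}

final class RestControllerSketch {
    private final SharedAspectStore store;

    RestControllerSketch(SharedAspectStore store) {
        this.store = store;
    }

    void putAspect(String urn, String aspectName, String body) {
        store.ingest(new AspectWrite(urn, aspectName, body));
    }
}

final class SharedValidationDemo {
    public static void main(String[] args) {
        SharedAspectStore store = new SharedAspectStore(
            Collections.<PayloadValidatorSketch>singletonList(new NonBlankDescriptionValidator()));
        new GraphQlResolverSketch(store).updateDescription("urn:li:dataset:example", "Daily orders");
        try {
            new RestControllerSketch(store).putAspect("urn:li:dataset:example", "description", "  ");
        } catch (IllegalArgumentException e) {
            System.out.println("REST write rejected: " + e.getMessage());
        }
        System.out.println("stored writes: " + store.size());
    }
}
```

Neither adapter contains a check of its own, yet a blank description is rejected on both paths because the rule sits in the layer they share.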

Usage

Use this heuristic whenever adding new validation logic for metadata aspects, entity properties, or business rules. This includes field-level validation, cross-aspect consistency checks, and policy enforcement. If you find yourself writing validation in a GraphQL resolver or REST controller, stop and move it into an `AspectPayloadValidator`.

The Insight (Rule of Thumb)

  • Action: Implement all validation as `AspectPayloadValidator` classes in `metadata-io/src/main/java/com/linkedin/metadata/aspect/validation/`.
  • Value: Register each validator as a Spring bean in `SpringStandardPluginConfiguration.java`.
  • Trade-off: Initial setup is slightly more complex (Spring bean registration), but validation is guaranteed to be consistent across all API surfaces.
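The registration step matters because a validator that exists but is never wired in is never invoked. The dependency-free sketch below illustrates this with a plain Java list standing in for Spring bean registration; `RuleValidator`, `PluginConfigurationSketch`, and `ValidationPipelineSketch` are hypothetical names, and nothing here reflects the actual contents of `SpringStandardPluginConfiguration.java`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical minimal validator contract (not the real AspectPayloadValidator API).
interface RuleValidator {
    String name();
    boolean isValid(String payload);
}

final class MaxLengthValidator implements RuleValidator {
    @Override
    public String name() { return "maxLength"; }

    @Override
    public boolean isValid(String payload) { return payload.length() <= 10; }
}

final class NoTabsValidator implements RuleValidator {
    @Override
    public String name() { return "noTabs"; }

    @Override
    public boolean isValid(String payload) { return payload.indexOf('\t') < 0; }
}

// Stand-in for the central configuration: the one place validators are wired in.
// In DataHub this role is filled by Spring bean registration.
final class PluginConfigurationSketch {
    static List<RuleValidator> registeredValidators() {
        List<RuleValidator> validators = new ArrayList<>();
        validators.add(new MaxLengthValidator());
        // NoTabsValidator is deliberately left out to show the effect of a missing registration.
        return Collections.unmodifiableList(validators);
    }
}

// The shared layer only runs the validators that the configuration hands to it.
final class ValidationPipelineSketch {
    private final List<RuleValidator> validators;

    ValidationPipelineSketch(List<RuleValidator> validators) {
        this.validators = validators;
    }

    // Returns the names of the validators that rejected the payload.
    List<String> failures(String payload) {
        List<String> failed = new ArrayList<>();
        for (RuleValidator validator : validators) {
            if (!validator.isValid(payload)) {
                failed.add(validator.name());
            }
        }
        return failed;
    }
}

final class RegistrationDemo {
    public static void main(String[] args) {
        ValidationPipelineSketch pipeline =
            new ValidationPipelineSketch(PluginConfigurationSketch.registeredValidators());
        System.out.println(pipeline.failures("this is too long"));
        System.out.println(pipeline.failures("a\tb"));
    }
}
```

In this sketch the payload containing a tab passes unchallenged, because `NoTabsValidator` was written but never registered.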

Reasoning

DataHub's architecture routes all metadata changes through a single aspect storage layer regardless of the originating API. By placing validation at this layer:

  1. GraphQL mutations are validated.
  2. OpenAPI/REST endpoints are validated.
  3. RestLI endpoints are validated.
  4. Programmatic SDK calls are validated.

Placing validation in any single API layer creates a security and consistency gap: invalid data can still enter through another API. This has been a source of production bugs and is explicitly called out in the project's `CLAUDE.md` guidance, quoted under Code Evidence.
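The gap is easy to reproduce in miniature. The following sketch shows the anti-pattern only, with hypothetical class names and a deliberately simplistic email rule that is not taken from DataHub: the check lives in the GraphQL-facing adapter, so the REST-facing adapter writes invalid data straight into the shared store.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shared store with no validation of its own.
final class UnvalidatedStore {
    final List<String> ownerEmails = new ArrayList<>();

    void save(String email) {
        ownerEmails.add(email);
    }
}

// Anti-pattern: the rule lives only in the GraphQL-facing adapter.
final class ValidatingGraphQlResolver {
    private final UnvalidatedStore store;

    ValidatingGraphQlResolver(UnvalidatedStore store) {
        this.store = store;
    }

    void setOwnerEmail(String email) {
        if (!email.contains("@")) {
            throw new IllegalArgumentException("invalid email: " + email);
        }
        store.save(email);
    }
}

// The REST-facing adapter was written separately and never received the check.
final class PlainRestController {
    private final UnvalidatedStore store;

    PlainRestController(UnvalidatedStore store) {
        this.store = store;
    }

    void putOwnerEmail(String email) {
        store.save(email);
    }
}

final class ValidationGapDemo {
    public static void main(String[] args) {
        UnvalidatedStore store = new UnvalidatedStore();
        try {
            new ValidatingGraphQlResolver(store).setOwnerEmail("not-an-email");
        } catch (IllegalArgumentException e) {
            System.out.println("GraphQL path rejected the write");
        }
        // The same invalid value is accepted through the other API surface.
        new PlainRestController(store).putOwnerEmail("not-an-email");
        System.out.println("stored: " + store.ownerEmails);
    }
}
```

Moving the check out of the resolver and into the layer both adapters call closes the gap without touching either adapter.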

Code Evidence

From `CLAUDE.md` (Validation Architecture section):

IMPORTANT: Validation must work across all APIs (GraphQL, OpenAPI, RestLI).

- Never add validation in API-specific layers (GraphQL resolvers, REST controllers) - this only protects one API
- Always implement AspectPayloadValidators in metadata-io/src/main/java/com/linkedin/metadata/aspect/validation/
- Register as Spring beans in SpringStandardPluginConfiguration.java
- Follow existing patterns: See SystemPolicyValidator.java and PolicyFieldTypeValidator.java as examples
