Heuristic:Datahub project Datahub Validation Cross API

Knowledge Sources	DataHub CLAUDE.md Internal
Domains	Architecture, Validation
Last Updated	2026-02-09 17:00 GMT

Overview

Never add validation in API-specific layers (GraphQL resolvers, REST controllers). Always implement AspectPayloadValidators that work across all APIs.

Description

DataHub exposes metadata through multiple API layers: GraphQL, OpenAPI/REST, and RestLI. A common mistake is adding validation logic in one specific API layer (e.g., a GraphQL resolver), which leaves the other APIs unprotected. The correct pattern is to implement AspectPayloadValidator classes that are invoked at the metadata storage layer, ensuring all APIs benefit from the same validation rules.

Usage

Apply this heuristic whenever implementing new validation logic for metadata aspects. Before writing validation code, ask: "Will this check run regardless of which API the user calls?" If the answer is no, refactor to use the AspectPayloadValidator pattern.

The Insight (Rule of Thumb)

Action: Implement validators in metadata-io/src/main/java/com/linkedin/metadata/aspect/validation/
Pattern: Create a class implementing AspectPayloadValidator and register it as a Spring bean in SpringStandardPluginConfiguration.java
Reference implementations: SystemPolicyValidator.java, PolicyFieldTypeValidator.java
Trade-off: Slightly more complex to implement than an inline check in a resolver, but guarantees consistency across all API surfaces.
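
The shape of this pattern can be sketched in plain Java. This is a simplified stand-in under stated assumptions, not DataHub's actual API: AspectWrite, PayloadValidator, DescriptionLengthValidator, and ValidatorRegistry are hypothetical names invented for illustration, the URN strings are placeholders, and the hand-built registry only stands in for the Spring bean registration described above. The real AspectPayloadValidator contract should be taken from the reference implementations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class ValidatorSketch {

    /** Hypothetical stand-in for a proposed aspect write (not a DataHub type). */
    static final class AspectWrite {
        final String urn;
        final String aspectName;
        final String value;

        AspectWrite(String urn, String aspectName, String value) {
            this.urn = urn;
            this.aspectName = aspectName;
            this.value = value;
        }
    }

    /** Hypothetical validator contract: returns an error message when the write is invalid. */
    interface PayloadValidator {
        Optional<String> validate(AspectWrite write);
    }

    /** Example rule (assumed for illustration): descriptions may not exceed MAX_LENGTH characters. */
    static final class DescriptionLengthValidator implements PayloadValidator {
        static final int MAX_LENGTH = 20;

        @Override
        public Optional<String> validate(AspectWrite write) {
            // Only applies to the aspect this rule is about; other aspects pass through.
            if (!"description".equals(write.aspectName)) {
                return Optional.empty();
            }
            if (write.value.length() > MAX_LENGTH) {
                return Optional.of(write.urn + ": description exceeds " + MAX_LENGTH + " characters");
            }
            return Optional.empty();
        }
    }

    /** Stand-in for registration; in DataHub this role is played by Spring bean configuration. */
    static final class ValidatorRegistry {
        private final List<PayloadValidator> validators = new ArrayList<>();

        void register(PayloadValidator validator) {
            validators.add(validator);
        }

        /** Runs every registered validator and collects the error messages. */
        List<String> validateAll(AspectWrite write) {
            List<String> errors = new ArrayList<>();
            for (PayloadValidator validator : validators) {
                validator.validate(write).ifPresent(errors::add);
            }
            return errors;
        }
    }

    public static void main(String[] args) {
        ValidatorRegistry registry = new ValidatorRegistry();
        registry.register(new DescriptionLengthValidator());

        AspectWrite ok = new AspectWrite("urn:example:dataset-1", "description", "short text");
        AspectWrite tooLong = new AspectWrite(
                "urn:example:dataset-1", "description", "a description that is far too long");

        System.out.println(registry.validateAll(ok));      // no errors collected
        System.out.println(registry.validateAll(tooLong)); // one error collected
    }
}
```

The rule lives in its own class and knows nothing about GraphQL, REST, or RestLI, which is what allows a single registration to cover every API surface.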

Reasoning

DataHub users access metadata through different APIs depending on their use case: the UI uses GraphQL, automated pipelines often use REST/OpenAPI, and legacy systems may use RestLI. If validation exists only in GraphQL, a REST client can bypass it entirely, which is both a security risk and a data integrity risk. The AspectPayloadValidator pattern hooks into the common metadata storage path, so the check runs regardless of which entry point a client uses.
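
The bypass described above can be illustrated with a small, self-contained model. Everything in it is a hypothetical stand-in rather than DataHub code: MetadataStore represents the shared storage path, GraphQlResolver and RestController represent two API entry points, and the "value must not be empty" rule is an arbitrary example check.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class CrossApiValidationSketch {

    /** Hypothetical stand-in for the shared storage path that every API ends up calling. */
    static final class MetadataStore {
        private final Map<String, String> aspects = new HashMap<>();
        private final List<Predicate<String>> validators = new ArrayList<>();

        void addValidator(Predicate<String> isValid) {
            validators.add(isValid);
        }

        void write(String urn, String value) {
            // Storage-layer validation: runs no matter which entry point called us.
            for (Predicate<String> validator : validators) {
                if (!validator.test(value)) {
                    throw new IllegalArgumentException("rejected at storage layer: " + urn);
                }
            }
            aspects.put(urn, value);
        }

        boolean contains(String urn) {
            return aspects.containsKey(urn);
        }
    }

    /** Anti-pattern: the check lives in one API layer only. */
    static final class GraphQlResolver {
        private final MetadataStore store;

        GraphQlResolver(MetadataStore store) {
            this.store = store;
        }

        void update(String urn, String value) {
            if (value.isEmpty()) {
                throw new IllegalArgumentException("rejected in resolver: " + urn);
            }
            store.write(urn, value);
        }
    }

    /** A second entry point with no check of its own. */
    static final class RestController {
        private final MetadataStore store;

        RestController(MetadataStore store) {
            this.store = store;
        }

        void update(String urn, String value) {
            store.write(urn, value);
        }
    }

    /** Returns true if the write went through, false if it was rejected. */
    static boolean accepts(Runnable write) {
        try {
            write.run();
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Case 1: the only check is inside the GraphQL resolver.
        MetadataStore open = new MetadataStore();
        System.out.println("GraphQL accepts empty value: "
                + accepts(() -> new GraphQlResolver(open).update("urn:example:a", "")));
        System.out.println("REST accepts empty value:    "
                + accepts(() -> new RestController(open).update("urn:example:b", "")));

        // Case 2: the same rule is registered on the shared storage path.
        MetadataStore guarded = new MetadataStore();
        guarded.addValidator(value -> !value.isEmpty());
        System.out.println("REST accepts empty value (guarded store): "
                + accepts(() -> new RestController(guarded).update("urn:example:c", "")));
    }
}
```

In the first case the resolver rejects the empty value but the REST path stores it. In the second case the REST write is rejected as well, because the rule sits behind both entry points.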
