Heuristic:Datahub project Datahub Validation Across All APIs
| Knowledge Sources | |
|---|---|
| Domains | Architecture, Validation, Backend |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Architecture rule requiring all metadata validation to be implemented as AspectPayloadValidators in the metadata-io layer, never in API-specific layers like GraphQL resolvers or REST controllers.
Description
DataHub exposes metadata modification through multiple API surfaces: GraphQL, OpenAPI (REST), and RestLI. A common mistake is adding validation logic in a single API layer (e.g., a GraphQL resolver), which leaves the other APIs unprotected. The correct approach is to implement validation as `AspectPayloadValidator` classes in the `metadata-io` module, which is the shared layer that all APIs funnel through. This ensures validation runs regardless of which API surface the request arrives through.
Usage
Use this heuristic whenever adding new validation logic for metadata aspects, entity properties, or business rules. This includes field-level validation, cross-aspect consistency checks, and policy enforcement. If you find yourself writing validation in a GraphQL resolver or REST controller, stop and refactor to an AspectPayloadValidator.
The Insight (Rule of Thumb)
- Action: Implement all validation as `AspectPayloadValidator` classes in `metadata-io/src/main/java/com/linkedin/metadata/aspect/validation/`.
- Value: Register each validator as a Spring bean in `SpringStandardPluginConfiguration.java`.
- Trade-off: Slightly more complex initial setup (Spring bean registration) but guarantees consistent validation across all API surfaces.
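The rule above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not DataHub's actual API: `SketchAspectValidator`, `OwnerRequiredValidator`, `SharedWritePath`, and `registeredValidators()` are hypothetical stand-ins for the real `AspectPayloadValidator` base class, a concrete validator, the shared `metadata-io` write path, and the Spring bean registration. The point it shows is that the API entry points hold no validation logic and all pass through one validated path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative stand-ins only; none of these types are DataHub's real classes. */
public class SharedValidationSketch {

    /** Hypothetical, simplified stand-in for an aspect validator contract. */
    interface SketchAspectValidator {
        /** Returns an error message if the proposed aspect is invalid. */
        Optional<String> validate(String aspectName, Map<String, String> payload);
    }

    /** Hypothetical rule: an "ownership" aspect must name a non-blank owner. */
    static class OwnerRequiredValidator implements SketchAspectValidator {
        @Override
        public Optional<String> validate(String aspectName, Map<String, String> payload) {
            if (!"ownership".equals(aspectName)) {
                return Optional.empty(); // rule does not apply to other aspects
            }
            String owner = payload.get("owner");
            if (owner == null || owner.trim().isEmpty()) {
                return Optional.of("ownership aspect requires a non-blank owner");
            }
            return Optional.empty();
        }
    }

    /** Stand-in for the shared write path that every API surface calls. */
    static class SharedWritePath {
        private final List<SketchAspectValidator> validators;
        private final List<String> stored = new ArrayList<>();

        SharedWritePath(List<SketchAspectValidator> validators) {
            this.validators = validators;
        }

        /** Runs every registered validator before anything is stored. */
        void ingest(String aspectName, Map<String, String> payload) {
            for (SketchAspectValidator validator : validators) {
                Optional<String> error = validator.validate(aspectName, payload);
                if (error.isPresent()) {
                    throw new IllegalArgumentException(error.get());
                }
            }
            stored.add(aspectName);
        }

        int storedCount() {
            return stored.size();
        }
    }

    /** Stand-in for bean registration: one central place lists the validators. */
    static List<SketchAspectValidator> registeredValidators() {
        return List.of(new OwnerRequiredValidator());
    }

    // Thin API facades: they delegate and contain no validation of their own.
    static void graphqlMutation(SharedWritePath path, String aspect, Map<String, String> payload) {
        path.ingest(aspect, payload);
    }

    static void openApiEndpoint(SharedWritePath path, String aspect, Map<String, String> payload) {
        path.ingest(aspect, payload);
    }

    static void restLiEndpoint(SharedWritePath path, String aspect, Map<String, String> payload) {
        path.ingest(aspect, payload);
    }

    /** Helper: reports whether a call was rejected by validation. */
    static boolean rejected(Runnable call) {
        try {
            call.run();
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SharedWritePath path = new SharedWritePath(registeredValidators());
        Map<String, String> blankOwner = Map.of("owner", " ");

        // The same invalid payload is sent through each API facade in turn.
        System.out.println("graphql rejected: " + rejected(() -> graphqlMutation(path, "ownership", blankOwner)));
        System.out.println("openapi rejected: " + rejected(() -> openApiEndpoint(path, "ownership", blankOwner)));
        System.out.println("restli rejected: " + rejected(() -> restLiEndpoint(path, "ownership", blankOwner)));

        // A valid payload is accepted and stored.
        restLiEndpoint(path, "ownership", Map.of("owner", "urn:li:corpuser:jdoe"));
        System.out.println("stored aspects: " + path.storedCount());
    }
}
```

In the real codebase the validator contract and registration mechanics differ from this sketch; `SystemPolicyValidator.java` and `PolicyFieldTypeValidator.java` are the patterns to follow.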
Reasoning
DataHub's architecture routes all metadata changes through a single aspect storage layer regardless of the originating API. By placing validation at this layer:
- GraphQL mutations are validated.
- OpenAPI/REST endpoints are validated.
- RestLI endpoints are validated.
- Programmatic SDK calls are validated.
Placing validation in any single API layer creates a security/consistency gap where invalid data can enter through another API. This has been a source of production bugs and is explicitly called out in the project's contributor guidelines.
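The gap described here can be made concrete with a small, self-contained sketch. All names (`Store`, `graphqlResolver`, `restController`) are hypothetical and do not correspond to DataHub classes; the example only assumes two API entry points that write to the same unvalidated store, with the check placed in just one of them.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative anti-pattern only: hypothetical names, not DataHub code. */
public class ApiLayerGapSketch {

    /** Shared storage that performs no validation of its own. */
    static final class Store {
        final List<String> names = new ArrayList<>();

        void write(String name) {
            names.add(name);
        }
    }

    /** Validation lives in this one API layer only (the anti-pattern). */
    static void graphqlResolver(Store store, String name) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        store.write(name);
    }

    /** A second API surface that never received the check. */
    static void restController(Store store, String name) {
        store.write(name);
    }

    public static void main(String[] args) {
        Store store = new Store();

        try {
            graphqlResolver(store, "");
        } catch (IllegalArgumentException e) {
            System.out.println("GraphQL path rejected the blank name");
        }

        // The same invalid value is accepted through the unprotected surface.
        restController(store, "");
        System.out.println("entries stored: " + store.names.size());
    }
}
```

Moving the check out of `graphqlResolver` and into the layer behind `Store.write` would close the gap for both entry points at once, which is the motivation for validating in the shared layer.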
Code Evidence
From `CLAUDE.md` (Validation Architecture section):
IMPORTANT: Validation must work across all APIs (GraphQL, OpenAPI, RestLI).
- Never add validation in API-specific layers (GraphQL resolvers, REST controllers) - this only protects one API
- Always implement AspectPayloadValidators in metadata-io/src/main/java/com/linkedin/metadata/aspect/validation/
- Register as Spring beans in SpringStandardPluginConfiguration.java
- Follow existing patterns: See SystemPolicyValidator.java and PolicyFieldTypeValidator.java as examples