Principle:Datahub project Datahub MCP Construction
| Property | Value |
|---|---|
| Principle Name | MCP_Construction |
| Category | Java_SDK_Metadata_Emission |
| Workflow | Java_SDK_Metadata_Emission |
| Repository | https://github.com/datahub-project/datahub |
| Last Updated | 2026-02-09 17:00 GMT |
Overview
Description
MCP Construction is the principle of assembling metadata change proposals (MCPs) as self-contained, atomic units of metadata mutation. A Metadata Change Proposal encapsulates a single intent to modify one aspect of one entity in the DataHub metadata graph. The construction process binds together the entity type (e.g., "dataset", "dashboard"), the entity URN (a globally unique identifier), the change type (e.g., UPSERT), and the aspect (a typed metadata payload such as DatasetProperties or SchemaMetadata) into a single immutable object that can be serialized and transmitted to the metadata platform.
The DataHub Java SDK provides MetadataChangeProposalWrapper as the primary abstraction for constructing MCPs. This wrapper uses a step builder pattern that guides the developer through a mandatory sequence of construction steps, ensuring that all required fields are populated and validated before the proposal is finalized.
Usage
MCP construction occurs after emitter instantiation and before metadata emission. Every metadata update -- whether creating a new dataset, updating a description, adding tags, setting ownership, or recording lineage -- is expressed as one or more MCPs. Common usage scenarios include:
- Setting dataset properties such as descriptions, custom properties, and external URLs.
- Recording schema metadata with field names, types, and descriptions.
- Establishing ownership by associating users or groups with data assets.
- Creating lineage by emitting upstream/downstream dataset relationships.
- Applying tags and glossary terms by attaching classification metadata to entities.
Theoretical Basis
MCP Construction draws on several foundational design patterns:
Command Pattern -- Each MetadataChangeProposalWrapper represents a command object that encapsulates a metadata mutation request. The command contains all the information needed to execute the mutation (entity type, URN, change type, aspect payload) and can be serialized, queued, transmitted, and replayed independently. This decouples the intent to mutate metadata from the actual execution of that mutation.
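The decoupling described above can be illustrated with a small plain-Java model. This is a sketch under stated assumptions, not SDK code: `MutationCommand` and `InMemoryGraph` are hypothetical names, the aspect payload is carried as an already-serialized JSON string, and the "graph" is just a map keyed by URN and aspect name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical command object: carries everything needed to apply one mutation.
final class MutationCommand {
    final String entityType;
    final String entityUrn;
    final String changeType;
    final String aspectName;
    final String aspectJson; // serialized payload, so the command can be stored or sent

    MutationCommand(String entityType, String entityUrn, String changeType,
                    String aspectName, String aspectJson) {
        this.entityType = entityType;
        this.entityUrn = entityUrn;
        this.changeType = changeType;
        this.aspectName = aspectName;
        this.aspectJson = aspectJson;
    }
}

// Hypothetical receiver: it executes commands without knowing who created them.
final class InMemoryGraph {
    private final Map<String, String> aspects = new HashMap<>();

    // Upsert semantics: insert or overwrite the aspect keyed by (URN, aspect name).
    void apply(MutationCommand command) {
        aspects.put(command.entityUrn + "#" + command.aspectName, command.aspectJson);
    }

    String aspectOf(String entityUrn, String aspectName) {
        return aspects.get(entityUrn + "#" + aspectName);
    }

    // Applies a stored sequence of commands, in order, to a fresh graph.
    static InMemoryGraph replay(Iterable<MutationCommand> commands) {
        InMemoryGraph graph = new InMemoryGraph();
        for (MutationCommand command : commands) {
            graph.apply(command);
        }
        return graph;
    }
}
```

Because each command is self-describing, the same stored sequence can be replayed against a fresh receiver and will produce the same end state.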
Step Builder Pattern -- The MetadataChangeProposalWrapper.builder() method returns an EntityTypeStepBuilder interface, which enforces a specific construction sequence: entityType() returns EntityUrnStepBuilder, which offers entityUrn() returning ChangeStepBuilder, which offers upsert() returning AspectStepBuilder, which offers aspect() returning Build. Each step exposes only the methods that are valid at that point in the sequence, so code that skips a required field or sets fields out of order does not compile. This compile-time guarantee is stronger than what a traditional builder provides, since a traditional builder can only detect a missing field through runtime validation.
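The chain described above can be modeled in plain Java. The sketch below is a simplified stand-in, not the SDK's source: it reuses the step interface names from the text, but `ProposalSketch`, the private `Steps` class, and the string-valued aspect are hypothetical simplifications introduced for illustration.

```java
// Simplified stand-in for the step builder chain; not the DataHub SDK source.
// Each interface exposes only the method that is valid at that step.
interface EntityTypeStepBuilder { EntityUrnStepBuilder entityType(String entityType); }
interface EntityUrnStepBuilder { ChangeStepBuilder entityUrn(String entityUrn); }
interface ChangeStepBuilder { AspectStepBuilder upsert(); }
// Hypothetical simplification: the aspect is a plain string here, not a typed object.
interface AspectStepBuilder { Build aspect(String aspect); }
interface Build { ProposalSketch build(); }

// Hypothetical result type holding the four values bound by the chain.
final class ProposalSketch {
    final String entityType;
    final String entityUrn;
    final String changeType;
    final String aspect;

    private ProposalSketch(String entityType, String entityUrn, String changeType, String aspect) {
        this.entityType = entityType;
        this.entityUrn = entityUrn;
        this.changeType = changeType;
        this.aspect = aspect;
    }

    // The only entry point returns the first step, so every chain starts at entityType().
    static EntityTypeStepBuilder builder() {
        return new Steps();
    }

    // One private class implements every step; callers only ever see the
    // narrow interface returned by the previous call.
    private static final class Steps
            implements EntityTypeStepBuilder, EntityUrnStepBuilder, ChangeStepBuilder, AspectStepBuilder, Build {
        private String entityType;
        private String entityUrn;
        private String changeType;
        private String aspect;

        @Override public EntityUrnStepBuilder entityType(String entityType) { this.entityType = entityType; return this; }
        @Override public ChangeStepBuilder entityUrn(String entityUrn) { this.entityUrn = entityUrn; return this; }
        @Override public AspectStepBuilder upsert() { this.changeType = "UPSERT"; return this; }
        @Override public Build aspect(String aspect) { this.aspect = aspect; return this; }
        @Override public ProposalSketch build() { return new ProposalSketch(entityType, entityUrn, changeType, aspect); }
    }
}
```

With this shape, `ProposalSketch.builder().entityType("dataset").entityUrn("urn:li:dataset:example").upsert().aspect("datasetProperties").build()` compiles, whereas `ProposalSketch.builder().aspect("datasetProperties")` is rejected by the compiler because `EntityTypeStepBuilder` has no `aspect` method.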
Aspect Name Inference -- When the caller provides a typed aspect object (e.g., new DatasetProperties()), the builder inspects the aspect's PDL schema properties to automatically infer the aspect name (e.g., "datasetProperties"). This convention-over-configuration approach reduces boilerplate while still allowing explicit aspectName() overrides when needed.
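The inference-with-override behavior can be sketched without the SDK. Everything in the block below is an assumption made for illustration: `SchemaBackedAspect`, `DatasetPropertiesSketch`, and `AspectNames` are hypothetical names, and the schema properties are modeled as a nested map with an "Aspect" entry holding a "name", which may not match the exact layout the real builder reads.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a typed aspect that exposes its schema properties.
interface SchemaBackedAspect {
    Map<String, Object> schemaProperties();
}

// Hypothetical aspect whose schema declares the name "datasetProperties".
final class DatasetPropertiesSketch implements SchemaBackedAspect {
    @Override
    public Map<String, Object> schemaProperties() {
        Map<String, Object> aspectInfo = new HashMap<>();
        aspectInfo.put("name", "datasetProperties");
        Map<String, Object> properties = new HashMap<>();
        properties.put("Aspect", aspectInfo); // assumed property layout
        return properties;
    }
}

final class AspectNames {
    // An explicit name wins; otherwise the name is read from the schema properties.
    static String resolve(SchemaBackedAspect aspect, String explicitName) {
        if (explicitName != null) {
            return explicitName;
        }
        Object info = aspect.schemaProperties().get("Aspect");
        if (info instanceof Map) {
            Object name = ((Map<?, ?>) info).get("name");
            if (name instanceof String) {
                return (String) name;
            }
        }
        // Behavior chosen for this sketch only; the SDK may handle this case differently.
        throw new IllegalStateException("aspect name could not be inferred; set it explicitly");
    }
}
```

The design choice on display is that the common case needs no extra argument, while the explicit name remains available for aspects whose schema carries no usable name.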
URN Validation -- The builder validates the entity URN during construction by attempting to parse it via Urn.createFromString(). A malformed URN triggers an EventValidationException while the proposal is being constructed, not when it is emitted, following the fail-fast principle.
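A minimal fail-fast check, assuming a much simpler rule than the real parser, might look like the following. `UrnCheck` and `InvalidUrnException` are hypothetical stand-ins (for the builder's validation step and for EventValidationException respectively), and the four-part `urn:li:<type>:<id>` test is an illustrative approximation, not the logic of Urn.createFromString().

```java
// Hypothetical stand-in for the SDK's validation exception.
final class InvalidUrnException extends RuntimeException {
    InvalidUrnException(String message) {
        super(message);
    }
}

final class UrnCheck {
    // Rejects the value immediately, before it can be stored in a proposal.
    // Assumed rule for this sketch: "urn:li:<type>:<id>" with non-empty type and id.
    static String requireValid(String urn) {
        if (urn == null) {
            throw new InvalidUrnException("entityUrn cannot be null");
        }
        // Limit of 4 keeps any further colons inside the id part, e.g. nested URNs.
        String[] parts = urn.split(":", 4);
        boolean wellFormed = parts.length == 4
                && parts[0].equals("urn")
                && parts[1].equals("li")
                && !parts[2].isEmpty()
                && !parts[3].isEmpty();
        if (!wellFormed) {
            throw new InvalidUrnException("entityUrn: " + urn + " is not a valid urn");
        }
        return urn;
    }
}
```

Placing the check where the URN enters the builder means the stack trace points at the code that supplied the bad value rather than at a later network call.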
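Immutability -- The constructed MetadataChangeProposalWrapper is annotated with Lombok @Value, which makes all fields final so that none of them can be reassigned after construction. This lets a finished proposal be shared between threads and prevents accidental reassignment between construction and emission. The guarantee covers the wrapper's own fields; the aspect object it references should still be treated as read-only once the proposal has been built.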
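For readers unfamiliar with Lombok, the block below is a hand-written approximation of the kind of class @Value produces: a final class with private final fields, an all-arguments constructor, getters, and value-based equals and hashCode. `ImmutableProposal` and its two fields are hypothetical and much smaller than the real wrapper.

```java
import java.util.Objects;

// Hand-written approximation of a Lombok @Value class; hypothetical and simplified.
final class ImmutableProposal {
    private final String entityUrn;
    private final String aspectName;

    ImmutableProposal(String entityUrn, String aspectName) {
        this.entityUrn = entityUrn;
        this.aspectName = aspectName;
    }

    // Getters only: there is no setter, and the fields are final.
    String getEntityUrn() {
        return entityUrn;
    }

    String getAspectName() {
        return aspectName;
    }

    // Equality is based on field values, not object identity.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ImmutableProposal)) {
            return false;
        }
        ImmutableProposal that = (ImmutableProposal) other;
        return Objects.equals(entityUrn, that.entityUrn)
                && Objects.equals(aspectName, that.aspectName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(entityUrn, aspectName);
    }
}
```

Changing a value in this style means constructing a new object, which is why a proposal that has been handed to an emitter cannot be altered through the original reference.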