Principle:Datahub project Datahub Metadata Change Proposal
Metadata
| Field | Value |
|---|---|
| principle_name | Metadata Change Proposal |
| description | A standardized envelope for packaging metadata changes into atomic, type-safe proposals for emission to DataHub. |
| type | principle |
| status | active |
| last_updated | 2026-02-10 |
| version | 1.0 |
Overview
The Metadata Change Proposal (MCP) is DataHub's fundamental unit of metadata mutation. It wraps an entity URN and an aspect instance into a standardized change proposal that can be validated, serialized, and emitted through any transport backend.
Description
The Metadata Change Proposal is the envelope that packages constructed metadata objects (URNs and aspects) into a form that the DataHub backend can process. The Python SDK provides MetadataChangeProposalWrapper as a high-level, type-safe wrapper around the lower-level MetadataChangeProposalClass.
An MCP contains the following key components:
- entityUrn -- The URN identifying the target entity (e.g., a dataset, user, or tag)
- entityType -- The type of the entity (automatically inferred from the URN if not provided)
- aspect -- The typed aspect instance carrying the metadata to be applied
- aspectName -- The name of the aspect (automatically derived from the aspect class if not provided)
- changeType -- The type of change, typically UPSERT (create or update)
- systemMetadata -- Optional system-level metadata (e.g., source pipeline information)
- auditHeader -- Optional Kafka audit header for traceability
The wrapper provides automatic inference and validation:
- entityType is inferred from the URN via guess_entity_type()
- aspectName is derived from the aspect class via get_aspect_name()
- Cross-validation ensures the manually provided entityType matches the URN-inferred type
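The inference and cross-validation rules can be illustrated with a small standard-library model. This is a simplified sketch, not the SDK's implementation: ProposalSketch and OwnershipAspect are hypothetical stand-ins, the guess_entity_type below is a local re-implementation that assumes URNs of the form urn:li:<entityType>:<id>, and raising ValueError on a mismatch is this sketch's own choice.

```python
from dataclasses import dataclass, field
from typing import ClassVar, List, Optional


def guess_entity_type(urn: str) -> str:
    """Return the entity-type segment of a 'urn:li:<type>:<id>' URN."""
    if not urn.startswith("urn:li:"):
        raise ValueError(f"not a 'urn:li:' URN: {urn!r}")
    return urn.split(":")[2]


@dataclass
class OwnershipAspect:
    """Hypothetical stand-in for a typed aspect class."""
    ASPECT_NAME: ClassVar[str] = "ownership"
    owners: List[str] = field(default_factory=list)


@dataclass
class ProposalSketch:
    """Simplified model of the wrapper's inference rules (not the SDK class)."""
    entityUrn: str
    aspect: OwnershipAspect
    entityType: Optional[str] = None
    aspectName: Optional[str] = None
    changeType: str = "UPSERT"

    def __post_init__(self) -> None:
        # Infer entityType from the URN, or cross-check a caller-supplied value.
        inferred_type = guess_entity_type(self.entityUrn)
        if self.entityType is None:
            self.entityType = inferred_type
        elif self.entityType != inferred_type:
            raise ValueError(
                f"entityType {self.entityType!r} does not match "
                f"{inferred_type!r} inferred from {self.entityUrn!r}"
            )
        # Derive aspectName from the aspect class, or cross-check it.
        inferred_name = self.aspect.ASPECT_NAME
        if self.aspectName is None:
            self.aspectName = inferred_name
        elif self.aspectName != inferred_name:
            raise ValueError(
                f"aspectName {self.aspectName!r} does not match {inferred_name!r}"
            )


proposal = ProposalSketch(
    entityUrn="urn:li:dataset:(urn:li:dataPlatform:hive,db.table,PROD)",
    aspect=OwnershipAspect(owners=["urn:li:corpuser:jdoe"]),
)
print(proposal.entityType, proposal.aspectName)  # dataset ownership
```

Passing entityType="corpuser" with the dataset URN above would raise in this model, which mirrors the intent of the cross-validation rule: a caller-supplied value may confirm the inferred one but never override it.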
The wrapper also provides conversion methods:
- make_mcp() -- Serializes the wrapper into a MetadataChangeProposalClass with generic (JSON-serialized) aspects
- validate() -- Checks that the MCP is well-formed and all fields are consistent
- as_workunit() -- Wraps the MCP into a MetadataWorkUnit for use in ingestion pipelines
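To make "generic (JSON-serialized) aspects" concrete, the sketch below shows one way a typed aspect payload could be flattened into a generic form and read back. It assumes the generic form is a UTF-8 JSON byte payload tagged with a content type. The helpers to_generic_aspect and from_generic_aspect are hypothetical names for illustration, not SDK functions.

```python
import json
from typing import Any, Dict

JSON_CONTENT_TYPE = "application/json"


def to_generic_aspect(aspect_fields: Dict[str, Any]) -> Dict[str, Any]:
    """Serialize an aspect's fields into a (bytes, content type) pair."""
    return {
        "value": json.dumps(aspect_fields, sort_keys=True).encode("utf-8"),
        "contentType": JSON_CONTENT_TYPE,
    }


def from_generic_aspect(generic: Dict[str, Any]) -> Dict[str, Any]:
    """Decode a generic aspect back into plain fields."""
    if generic["contentType"] != JSON_CONTENT_TYPE:
        raise ValueError(f"unsupported content type: {generic['contentType']!r}")
    return json.loads(generic["value"].decode("utf-8"))


# Assumed example payload: a tag list for a dataset.
tags = {"tags": [{"tag": "urn:li:tag:pii"}]}
generic = to_generic_aspect(tags)
restored = from_generic_aspect(generic)
```

Because the payload travels as opaque bytes plus a content type, a transport can carry any aspect without knowing its schema; only the two ends need the typed class.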
Usage
Use Metadata Change Proposal wrapping when packaging constructed metadata objects for emission to DataHub via any transport. This is the step between constructing metadata objects (URNs and aspects) and emitting them to the backend.
Typical workflow:
- Construct URNs using builder functions (e.g., make_dataset_urn)
- Create aspect instances (e.g., OwnershipClass, GlobalTagsClass)
- Wrap into an MCP using MetadataChangeProposalWrapper
- Emit via an emitter (DataHubRestEmitter or DatahubKafkaEmitter)
The construct_many class method enables efficient batch construction of multiple MCPs for the same entity.
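The four workflow steps and the batch case can be walked through without a running DataHub instance. Everything in this sketch is a standard-library stand-in: dataset_urn, the local construct_many, and RecordingEmitter are hypothetical names that imitate the shape of the workflow, not SDK calls. Skipping None entries in the aspect list is an assumption of this sketch.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple

Aspect = Tuple[str, Dict[str, Any]]  # (aspect name, aspect payload)
Proposal = Dict[str, Any]


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Step 1: build a dataset URN from its parts."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def construct_many(
    entity_urn: str, aspects: Sequence[Optional[Aspect]]
) -> List[Proposal]:
    """Step 3 (batch): one proposal per aspect, all for the same entity."""
    return [
        {
            "entityUrn": entity_urn,
            "entityType": entity_urn.split(":")[2],
            "aspectName": name,
            "aspect": payload,
            "changeType": "UPSERT",
        }
        for name, payload in (a for a in aspects if a is not None)
    ]


class RecordingEmitter:
    """Step 4 stand-in: collects proposals instead of sending them."""

    def __init__(self) -> None:
        self.sent: List[Proposal] = []

    def emit(self, proposal: Proposal) -> None:
        self.sent.append(proposal)


urn = dataset_urn("hive", "db.table")
# Step 2: aspect instances, modeled here as (name, payload) pairs.
ownership: Aspect = ("ownership", {"owners": ["urn:li:corpuser:jdoe"]})
tags: Aspect = ("globalTags", {"tags": ["urn:li:tag:pii"]})

emitter = RecordingEmitter()
for proposal in construct_many(urn, [ownership, None, tags]):
    emitter.emit(proposal)
```

Each aspect still becomes its own proposal, so batching changes how the proposals are built, not the one-aspect-per-proposal granularity.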
Theoretical Basis
This principle follows the event sourcing pattern. Metadata changes are captured as discrete proposals (events) rather than direct mutations. Each MCP is an atomic, self-describing change that can be validated independently before emission.
Key design principles:
- Atomicity -- Each MCP represents a single aspect change on a single entity
- Self-description -- The MCP contains all information needed to apply the change (entity type, URN, aspect name, aspect value)
- Idempotency -- UPSERT semantics mean the same MCP can be applied multiple times with the same result
- Validation -- The wrapper ensures type consistency between URN, entity type, and aspect name before emission
- Serialization transparency -- The wrapper handles JSON serialization of aspects into the generic wire format
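The atomicity and idempotency properties can be checked against a toy in-memory store. This is a minimal sketch under a strong assumption: an UPSERT simply replaces the stored aspect keyed by (entity URN, aspect name). The apply_upsert helper is hypothetical, and the model ignores anything a real backend may add, such as versioning or timestamps.

```python
from typing import Any, Dict, Tuple

Store = Dict[Tuple[str, str], Dict[str, Any]]


def apply_upsert(store: Store, proposal: Dict[str, Any]) -> None:
    """Create or replace exactly one aspect of exactly one entity."""
    key = (proposal["entityUrn"], proposal["aspectName"])
    store[key] = proposal["aspect"]


urn = "urn:li:dataset:(urn:li:dataPlatform:hive,db.table,PROD)"
store: Store = {(urn, "status"): {"removed": False}}

proposal = {
    "entityUrn": urn,
    "aspectName": "globalTags",
    "aspect": {"tags": ["urn:li:tag:pii"]},
    "changeType": "UPSERT",
}

apply_upsert(store, proposal)
after_first = dict(store)
apply_upsert(store, proposal)  # replaying the same proposal
after_second = dict(store)
```

In this model the second application leaves the store unchanged, and the unrelated status aspect is never touched, which is what makes retries of the same proposal safe.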
Related
- Implemented by: Datahub_project_Datahub_MetadataChangeProposalWrapper_Init
- Depends on: Datahub_project_Datahub_Metadata_Object_Construction
- Used by: Datahub_project_Datahub_Metadata_Emission