Implementation:Datahub project Datahub ProtobufExtensionFieldVisitor
| Knowledge Sources | |
|---|---|
| Domains | Protobuf_Integration, Schema_Field_Metadata |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Description
ProtobufExtensionFieldVisitor extends SchemaFieldVisitor to produce DataHub SchemaField objects enriched with metadata extracted from protobuf extension options. For each protobuf field, it:
- Detects primary keys -- Checks field options for any descriptor name matching the pattern primary_key (case-insensitive).
- Extracts tags -- Collects TagAssociation objects from both field-level options and promoted message-level options (from nested message types referenced by the field).
- Extracts glossary terms -- Collects GlossaryTermAssociation objects similarly from field and promoted message options.
- Builds SchemaField -- Constructs the SchemaField with field path, nullability (inverse of primary key), description (including enum value documentation), native data type, global tags, and glossary terms.
- Handles enum descriptions -- For enum-typed fields, appends a formatted listing of enum values and their comments to the field description.
The visitor processes all graph paths to each field, producing a pair of (SchemaField, sortOrder) for each path.
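The primary-key check and the nullability it implies can be illustrated with a minimal, self-contained sketch. It is not the visitor's actual code: the class PrimaryKeySketch, its method names, and the exact regular expression are assumptions made for illustration, and option descriptor names are represented as plain strings rather than protobuf descriptors.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of datahub-protobuf.
final class PrimaryKeySketch {
    // Assumed pattern: the page only states that option names are matched
    // against "primary_key" case-insensitively; the real regex may differ.
    private static final Pattern PRIMARY_KEY =
            Pattern.compile("primary_key", Pattern.CASE_INSENSITIVE);

    // True if any option descriptor name on the field contains the pattern.
    static boolean isPrimaryKey(List<String> optionNames) {
        return optionNames.stream()
                .anyMatch(name -> PRIMARY_KEY.matcher(name).find());
    }

    // Nullability is the inverse of the primary-key flag.
    static boolean isNullable(List<String> optionNames) {
        return !isPrimaryKey(optionNames);
    }
}
```

Under these assumptions, a field carrying an option named Primary_Key would be reported as part of the key and non-nullable, while a field with no matching option stays nullable.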
Usage
Used as the primary field visitor during protobuf schema conversion to generate the SchemaField array for the schemaMetadata aspect of a DataHub dataset.
Code Reference
Source Location
metadata-integration/java/datahub-protobuf/src/main/java/datahub/protobuf/visitors/field/ProtobufExtensionFieldVisitor.java
Signature
public class ProtobufExtensionFieldVisitor extends SchemaFieldVisitor {
@Override
public Stream<Pair<SchemaField, Double>> visitField(
ProtobufField field, VisitContext context)
}
Import
import datahub.protobuf.visitors.field.ProtobufExtensionFieldVisitor;
I/O Contract
Inputs
| Parameter | Type | Description |
|---|---|---|
| field | ProtobufField | The protobuf field being visited |
| context | VisitContext | Visit context providing the graph, registry, audit stamp, and path computation |
Outputs
| Return Type | Description |
|---|---|
| Stream<Pair<SchemaField, Double>> | Pairs of constructed SchemaField objects and their computed sort order values (one pair per graph path to the field) |
Each SchemaField contains:
- fieldPath -- computed from the graph path
- nullable -- true unless the field is a primary key
- isPartOfKey -- true if primary key option is present
- description -- field comment with appended enum values if applicable
- nativeDataType -- the protobuf native type string
- type -- the DataHub schema field data type
- globalTags -- tags from field options and promoted message options
- glossaryTerms -- terms from field options and promoted message options
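How a description might combine the field comment with enum value documentation can be sketched as follows. The output layout (a blank line followed by one "- NAME: comment" entry per value) is an assumption chosen for illustration; the page does not specify the visitor's actual format. EnumDescriptionSketch and describe are hypothetical names.

```java
import java.util.Map;

// Hypothetical helper, not part of datahub-protobuf.
final class EnumDescriptionSketch {
    // Appends one line per enum value (name plus optional comment) to the
    // field comment. The layout is an assumed format, not DataHub's.
    static String describe(String fieldComment, Map<String, String> enumValues) {
        StringBuilder sb = new StringBuilder(fieldComment);
        if (!enumValues.isEmpty()) {
            if (sb.length() > 0) {
                sb.append("\n\n"); // separate the comment from the value listing
            }
            for (Map.Entry<String, String> value : enumValues.entrySet()) {
                sb.append("- ").append(value.getKey());
                if (!value.getValue().isEmpty()) {
                    sb.append(": ").append(value.getValue());
                }
                sb.append('\n');
            }
        }
        return sb.toString().trim();
    }
}
```

Passing an insertion-ordered map such as a LinkedHashMap keeps the enum values in declaration order; non-enum fields would pass an empty map and get the comment back unchanged.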
Usage Examples
ProtobufExtensionFieldVisitor fieldVisitor = new ProtobufExtensionFieldVisitor();
// Used within ProtobufGraph traversal
Stream<Pair<SchemaField, Double>> fieldResults =
fieldVisitor.visitField(protobufField, visitContext);
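One plausible way to consume the returned pairs is to order them by their sort value before building the field array. The sketch below shows only that ordering step, using standard-library types: Map.Entry<String, Double> stands in for Pair<SchemaField, Double>, with the field path as the key. SortOrderSketch is a hypothetical name, and whether DataHub orders fields exactly this way is an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of datahub-protobuf.
final class SortOrderSketch {
    // Orders (fieldPath, sortOrder) pairs by ascending sort order and
    // returns the field paths in that order.
    static List<String> orderedFieldPaths(Stream<Map.Entry<String, Double>> pairs) {
        return pairs
                .sorted(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

Because the visitor emits one pair per graph path, a field reachable through several paths would appear once per path in the ordered result.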
Related Pages
- Datahub_project_Datahub_ProtobufExtensionUtil -- Provides tag and term extraction from options
- Datahub_project_Datahub_ProtobufDescriptorUtils -- Provides getFieldOptions and getMessageOptions
- Datahub_project_Datahub_ProtobufDatasetVisitor -- Top-level visitor that composes field visitors
- Datahub_project_Datahub_TagVisitor -- Related visitor for dataset-level tag creation