Implementation:Datahub project Datahub ProtobufField
| Knowledge Sources | |
|---|---|
| Domains | Protobuf_Integration |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Represents a single field within a protobuf message, providing type mapping to DataHub SchemaFieldDataType, field path generation, and source comment extraction from protobuf source code locations.
Description
ProtobufField implements the ProtobufElement interface and models a field within a protobuf message definition. It is built using Lombok's @Builder pattern and wraps a FieldDescriptorProto along with its parent ProtobufMessage.
Key responsibilities:
- Type Mapping: The schemaFieldDataType() method maps protobuf types (TYPE_DOUBLE, TYPE_STRING, TYPE_MESSAGE, TYPE_ENUM, etc.) to DataHub schema types (NumberType, StringType, RecordType, EnumType, etc.). Repeated fields are mapped to ArrayType.
- Field Path Generation: The fieldPathType() method produces DataHub field path type tokens (e.g., [type=string], [type=array].[type=int]) used in the schema field path specification.
- Native Type Resolution: The nativeType() method extracts the protobuf type name, returning the type keyword for primitives (e.g., int32) or the fully qualified message name for message types.
- Comment Extraction: The comment() method traverses protobuf SourceCodeInfo.Location entries to find documentation comments associated with this field, supporting both top-level and nested type fields.
- Enum Support: Methods isEnum(), getEnumDescriptor(), getEnumValues(), and getEnumValuesWithComments() provide access to enum type information and associated comments.
- OneOf Support: The oneOfProto() method returns the OneofDescriptorProto if the field participates in a oneof group.
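The type mapping and field path rules listed above can be illustrated with a small self-contained sketch. It is not the actual ProtobufField code: FieldTypeSketch, ProtoType, schemaTypeName, and the string-based type names are hypothetical stand-ins, and only the four type pairs named in the description are covered.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative stand-in only; this is not the actual ProtobufField code.
final class FieldTypeSketch {

    // Subset of protobuf field types named in the description above.
    enum ProtoType { TYPE_DOUBLE, TYPE_STRING, TYPE_MESSAGE, TYPE_ENUM }

    // Protobuf type -> DataHub schema type name, as paired in the description.
    private static final Map<ProtoType, String> SCHEMA_TYPES = new EnumMap<>(ProtoType.class);

    static {
        SCHEMA_TYPES.put(ProtoType.TYPE_DOUBLE, "NumberType");
        SCHEMA_TYPES.put(ProtoType.TYPE_STRING, "StringType");
        SCHEMA_TYPES.put(ProtoType.TYPE_MESSAGE, "RecordType");
        SCHEMA_TYPES.put(ProtoType.TYPE_ENUM, "EnumType");
    }

    // Repeated fields are reported as ArrayType regardless of the element type.
    static String schemaTypeName(ProtoType type, boolean repeated) {
        return repeated ? "ArrayType" : SCHEMA_TYPES.get(type);
    }

    // Builds a path token such as [type=string]; repeated fields gain an array prefix.
    static String fieldPathType(String elementTypeName, boolean repeated) {
        String token = "[type=" + elementTypeName + "]";
        return repeated ? "[type=array]." + token : token;
    }
}
```

Under these assumptions, a repeated field with element type name int yields the token [type=array].[type=int], matching the example given in the list.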
Equality is based on the fully qualified field name (parentMessage.fullName + "." + fieldName).
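A minimal sketch of name-based equality, assuming identity depends only on the fully qualified field name. FieldIdentitySketch and its two string fields are hypothetical; the real class derives these values from its parent message and field descriptor.

```java
import java.util.Objects;

// Hypothetical simplification: identity is the fully qualified field name only.
final class FieldIdentitySketch {
    private final String parentFullName; // e.g. "my.pkg.Person" (illustrative)
    private final String fieldName;      // e.g. "email" (illustrative)

    FieldIdentitySketch(String parentFullName, String fieldName) {
        this.parentFullName = Objects.requireNonNull(parentFullName);
        this.fieldName = Objects.requireNonNull(fieldName);
    }

    // Parent message full name, a dot, then the field name.
    String fullName() {
        return parentFullName + "." + fieldName;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof FieldIdentitySketch)) return false;
        return fullName().equals(((FieldIdentitySketch) other).fullName());
    }

    // Must agree with equals, so it hashes the same string.
    @Override
    public int hashCode() {
        return fullName().hashCode();
    }
}
```

With this design, two instances built for the same field collapse into a single entry in hash-based collections, which is useful when fields act as graph vertices.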
Usage
Use ProtobufField as a vertex in the ProtobufGraph to represent individual fields. It is created during graph construction and visited by SchemaFieldVisitor and ProtobufExtensionFieldVisitor to produce DataHub SchemaField entries.
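The visiting step can be sketched with simplified types. FieldVisitorSketch, FieldVertexSketch, and VisitorDemoSketch are hypothetical names used for illustration; the real visitor interface also receives a VisitContext and produces SchemaField entries rather than strings.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical, simplified visitor; the real interface also takes a VisitContext.
interface FieldVisitorSketch<T> {
    Stream<T> visitField(FieldVertexSketch field);
}

// Hypothetical field vertex carrying only the data this sketch needs.
final class FieldVertexSketch {
    final String fullName;
    final String fieldPathType;

    FieldVertexSketch(String fullName, String fieldPathType) {
        this.fullName = fullName;
        this.fieldPathType = fieldPathType;
    }

    // Double dispatch: the vertex hands itself to the visitor and returns its results.
    <T> Stream<T> accept(FieldVisitorSketch<T> visitor) {
        return visitor.visitField(this);
    }
}

final class VisitorDemoSketch {
    // Visits every field vertex and flattens the per-field streams into one list.
    static List<String> describe(List<FieldVertexSketch> fields) {
        FieldVisitorSketch<String> visitor =
            field -> Stream.of(field.fullName + " " + field.fieldPathType);
        return fields.stream()
            .flatMap(field -> field.accept(visitor))
            .collect(Collectors.toList());
    }
}
```

Returning a Stream from each visit lets a visitor emit zero, one, or many results per field, and the caller can flatten them without special cases.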
Code Reference
Source Location
- Repository: Datahub_project_Datahub
- File: metadata-integration/java/datahub-protobuf/src/main/java/datahub/protobuf/model/ProtobufField.java
Signature
@Builder(toBuilder = true)
@Getter
@AllArgsConstructor
public class ProtobufField implements ProtobufElement {
    // Core accessors
    public String name();
    public String fullName();
    public String nativeType();
    public String fieldPathType();
    public boolean isMessage();
    public int sortWeight();
    public int getNumber();

    // Type information
    public SchemaFieldDataType schemaFieldDataType();
    public OneofDescriptorProto oneOfProto();
    public boolean isEnum();
    public Optional<EnumDescriptorProto> getEnumDescriptor();
    public List<EnumValueDescriptorProto> getEnumValues();
    public Map<String, String> getEnumValuesWithComments();

    // Source code metadata
    public String comment();
    public Stream<SourceCodeInfo.Location> messageLocations();

    // Visitor pattern
    public <T> Stream<T> accept(ProtobufModelVisitor<T> visitor, VisitContext context);
}
Import
import datahub.protobuf.model.ProtobufField;
I/O Contract
| Input | Type | Description |
|---|---|---|
| protobufMessage | ProtobufMessage | Parent message containing this field |
| fieldProto | FieldDescriptorProto | The raw protobuf field descriptor |
| isNestedType | Boolean | Whether this field belongs to a nested message type |

| Output | Type | Description |
|---|---|---|
| schemaFieldDataType | SchemaFieldDataType | DataHub schema field data type mapping |
| fieldPathType | String | Field path type token (e.g., [type=string]) |
| comment | String | Extracted documentation comment from proto source |
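The comment output depends on matching SourceCodeInfo.Location paths, which can be sketched without the protobuf library. In descriptor.proto, message_type is field number 4 of FileDescriptorProto and field is field number 2 of DescriptorProto, so a field of a top-level message is addressed by the path [4, messageIndex, 2, fieldIndex]. LocationSketch and CommentLookupSketch are hypothetical stand-ins; the actual comment() method may filter locations differently, and nested types use longer paths that this sketch does not handle.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Simplified model of SourceCodeInfo.Location: a path plus its leading comment.
final class LocationSketch {
    final List<Integer> path;
    final String leadingComments;

    LocationSketch(List<Integer> path, String leadingComments) {
        this.path = path;
        this.leadingComments = leadingComments;
    }
}

final class CommentLookupSketch {
    static final int FILE_MESSAGE_TYPE = 4; // FileDescriptorProto.message_type
    static final int MESSAGE_FIELD = 2;     // DescriptorProto.field

    // Finds the leading comment for a field of a top-level message.
    // Both indexes are zero-based declaration-order positions, not field numbers.
    static Optional<String> fieldComment(
            List<LocationSketch> locations, int messageIndex, int fieldIndex) {
        List<Integer> wanted =
            Arrays.asList(FILE_MESSAGE_TYPE, messageIndex, MESSAGE_FIELD, fieldIndex);
        return locations.stream()
            .filter(loc -> loc.path.equals(wanted))
            .map(loc -> loc.leadingComments.trim())
            .filter(comment -> !comment.isEmpty())
            .findFirst();
    }
}
```

A field with no matching location, or with only a blank comment, yields an empty Optional under this sketch.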
Usage Examples
// parentMessage (ProtobufMessage) and fieldDescriptorProto (FieldDescriptorProto)
// are assumed to be in scope already.
ProtobufField field = ProtobufField.builder()
    .protobufMessage(parentMessage)
    .fieldProto(fieldDescriptorProto)
    .isNestedType(false)
    .build();

// Get DataHub schema field type
SchemaFieldDataType dataType = field.schemaFieldDataType();

// Get field path type for schema field path specification
String pathType = field.fieldPathType(); // e.g., "[type=string]"

// Check if field is an enum and extract values
if (field.isEnum()) {
    Map<String, String> enumValues = field.getEnumValuesWithComments();
}