Implementation:Datahub project Datahub ProtobufExtensionFieldVisitor

Knowledge Sources
Domains: Protobuf_Integration, Schema_Field_Metadata
Last Updated: 2026-02-10 00:00 GMT

Overview

Description

ProtobufExtensionFieldVisitor extends SchemaFieldVisitor to produce DataHub SchemaField objects enriched with metadata extracted from protobuf extension options. For each protobuf field, it:

  1. Detects primary keys -- Checks field options for any descriptor name matching the pattern primary_key (case-insensitive).
  2. Extracts tags -- Collects TagAssociation objects from both field-level options and promoted message-level options (from nested message types referenced by the field).
  3. Extracts glossary terms -- Collects GlossaryTermAssociation objects similarly from field and promoted message options.
  4. Builds SchemaField -- Constructs the SchemaField with field path, nullability (inverse of primary key), description (including enum value documentation), native data type, global tags, and glossary terms.
  5. Handles enum descriptions -- For enum-typed fields, appends a formatted listing of enum values and their comments to the field description.
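Steps 1 and 4 can be illustrated with a minimal, dependency-free sketch. It is not the DataHub implementation: the class name PrimaryKeySketch, the use of plain strings in place of protobuf option descriptors, and the exact regular expression are all assumptions made for illustration. It shows only the idea of a case-insensitive name match driving both the key flag and the nullability.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in: option descriptor names are modelled as plain strings.
public class PrimaryKeySketch {

    // Assumed pattern: any name ending in "primary_key" or "primarykey", ignoring case.
    private static final Pattern PRIMARY_KEY =
        Pattern.compile(".*primary_?key", Pattern.CASE_INSENSITIVE);

    // True when at least one option name fully matches the pattern.
    public static boolean isPrimaryKey(List<String> optionNames) {
        return optionNames.stream()
            .anyMatch(name -> PRIMARY_KEY.matcher(name).matches());
    }

    // Nullability is the inverse of the primary-key flag.
    public static boolean isNullable(List<String> optionNames) {
        return !isPrimaryKey(optionNames);
    }

    public static void main(String[] args) {
        List<String> keyed = List.of("deprecated", "PRIMARY_KEY");
        List<String> plain = List.of("deprecated", "owner");
        System.out.println(isPrimaryKey(keyed) + " " + isNullable(keyed)); // true false
        System.out.println(isPrimaryKey(plain) + " " + isNullable(plain)); // false true
    }
}
```

Deriving nullable from the same check means the two flags cannot disagree for a given field.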

The visitor processes every graph path that reaches a field and emits one (SchemaField, sortOrder) pair per path, so a field reachable by several paths yields several pairs.

Usage

This class serves as the primary field visitor during protobuf schema conversion. It generates the SchemaField array for the schemaMetadata aspect of a DataHub dataset.

Code Reference

Source Location

metadata-integration/java/datahub-protobuf/src/main/java/datahub/protobuf/visitors/field/ProtobufExtensionFieldVisitor.java

Signature

public class ProtobufExtensionFieldVisitor extends SchemaFieldVisitor {

    @Override
    public Stream<Pair<SchemaField, Double>> visitField(
        ProtobufField field, VisitContext context)
}

Import

import datahub.protobuf.visitors.field.ProtobufExtensionFieldVisitor;

I/O Contract

Inputs

Parameter   Type            Description
field       ProtobufField   The protobuf field being visited
context     VisitContext    Visit context providing the graph, registry, audit stamp, and path computation

Outputs

Return Type                          Description
Stream<Pair<SchemaField, Double>>    Pairs of constructed SchemaField objects and their computed sort order values (one pair per graph path to the field)

Each SchemaField contains:

  • fieldPath -- computed from the graph path
  • nullable -- true unless the field is a primary key
  • isPartOfKey -- true if primary key option is present
  • description -- field comment with appended enum values if applicable
  • nativeDataType -- the protobuf native type string
  • type -- the DataHub schema field data type
  • globalTags -- tags from field options and promoted message options
  • glossaryTerms -- terms from field options and promoted message options
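The description entry in the list above combines the field comment with a listing of enum values. The sketch below shows one way such a description could be assembled. The class name EnumDescriptionSketch, the map of value names to comments, and the "NAME: comment" layout are assumptions for illustration; the actual format produced by DataHub may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: enum values are modelled as name -> comment entries.
public class EnumDescriptionSketch {

    // Appends one line per enum value to the field comment.
    // Values without a comment are listed by name only.
    public static String describe(String fieldComment, Map<String, String> enumValues) {
        if (enumValues.isEmpty()) {
            return fieldComment; // non-enum field: description is unchanged
        }
        String listing = enumValues.entrySet().stream()
            .map(e -> e.getValue().isEmpty()
                ? e.getKey()
                : e.getKey() + ": " + e.getValue())
            .collect(Collectors.joining("\n"));
        return fieldComment + "\n\n" + listing;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the declaration order of the enum values.
        Map<String, String> values = new LinkedHashMap<>();
        values.put("ACTIVE", "In use");
        values.put("RETIRED", "");
        System.out.println(describe("Order status.", values));
    }
}
```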

Usage Examples

ProtobufExtensionFieldVisitor fieldVisitor = new ProtobufExtensionFieldVisitor();

// Used within ProtobufGraph traversal
Stream<Pair<SchemaField, Double>> fieldResults =
    fieldVisitor.visitField(protobufField, visitContext);
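A caller typically needs an ordered list of fields rather than a stream of pairs. The following sketch shows one plausible way to consume such a stream: sort by the Double sort order, break ties by field path, and keep only the field. It uses Map.Entry<String, Double> as a stand-in for Pair<SchemaField, Double>, with the field path string representing the SchemaField. The class name SortOrderSketch and the tie-breaking rule are assumptions, not the behavior of the DataHub converter.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical consumer: entries stand in for (SchemaField, sortOrder) pairs.
public class SortOrderSketch {

    // Orders pairs by sort value, then by field path, and returns the paths.
    public static List<String> orderedFieldPaths(Stream<Map.Entry<String, Double>> pairs) {
        return pairs
            .sorted(Map.Entry.<String, Double>comparingByValue()
                .thenComparing(Map.Entry.<String, Double>comparingByKey()))
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<Map.Entry<String, Double>> pairs = Stream.of(
            Map.entry("order.status", 2.0),
            Map.entry("order.id", 1.0),
            Map.entry("order.items", 1.0));
        System.out.println(orderedFieldPaths(pairs));
        // [order.id, order.items, order.status]
    }
}
```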
