
Implementation:Datahub project Datahub EntityClient Operations

From Leeroopedia
Revision as of 14:42, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Datahub_project_Datahub_EntityClient_Operations.md)


Knowledge Sources
Domains Java_SDK, Metadata_Management
Last Updated 2026-02-10 00:00 GMT

Overview

EntityClient is the primary operations class in the DataHub Java SDK V2. It provides create, read, update, and patch operations for DataHub entities, and emits metadata change proposals (MCPs) through a RestEmitter, applying version-aware patch transformation and optimistic-locking retry logic.

Description

The EntityClient class is the central coordinator for all entity CRUD operations in the SDK V2. It is not meant to be instantiated directly; instead, it is obtained via DataHubClientV2.entities().

Key characteristics:

  • Upsert orchestration: The upsert() method follows a strict emission order:
    1. Cached aspects (from builder) are emitted as full UPSERT MCPs
    2. Pending MCPs (from set*() methods) are emitted as full aspect replacements
    3. Full aspect writes are awaited to completion before proceeding
    4. Pending patches (from add*/remove* methods) are transformed via the VersionAwarePatchTransformer and emitted with retry logic
  • Version-aware patch transformation: Uses a PatchTransformer (specifically VersionAwarePatchTransformer) that consults the server configuration to determine whether to emit patches natively or convert them to full aspect replacements for older servers.
  • Optimistic locking: The emitWithRetry() method detects HTTP 422 version-conflict responses (message pattern: "Expected version X, actual version Y") and retries up to 3 times with exponential backoff (100ms, 200ms, 400ms), for at most 4 attempts in total. Each retry uses the TransformResult's retry function to fetch fresh data before re-emitting.
  • Aspect retrieval: The getAspect() method fetches a single aspect together with its system metadata (including version information) from the /entitiesV2/ endpoint. The returned version string can be used for optimistic locking.
  • Batch aspect fetch: The private fetchAspects() method retrieves multiple aspects in a single API call, parsing them from the /entitiesV2/ response.
  • Entity get: The get() methods retrieve entities by URN with either default or specified aspects. They use reflection-based createEntityInstance() to construct entity objects from URN and aspect maps.
  • Aspect name resolution: Uses a 3-tier strategy: (1) ASPECT_NAME static field, (2) Pegasus @Aspect.name schema annotation, (3) camelCase fallback.
  • Placeholder methods: delete() and exists() are declared but throw UnsupportedOperationException (not yet implemented).
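
The 3-tier aspect name resolution described above can be illustrated with a minimal sketch. It is not the SDK's implementation: the nested aspect classes and the SCHEMA_NAMES map are hypothetical stand-ins (the real aspects are Pegasus RecordTemplate classes, and tier 2 would read the @Aspect.name annotation from the Pegasus schema rather than a map).

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;

public class AspectNameResolverSketch {

    // Hypothetical stand-ins for aspect classes (assumption, not SDK types).
    static class Ownership { public static final String ASPECT_NAME = "ownership"; }
    static class DashboardInfo {}
    static class GlobalTags {}

    // Hypothetical stand-in for names declared via @Aspect.name in the schema.
    static final Map<Class<?>, String> SCHEMA_NAMES =
        Map.of(DashboardInfo.class, "dashboardInfo");

    static String resolve(Class<?> aspectClass) {
        // Tier 1: a static String field named ASPECT_NAME, read via reflection.
        try {
            Field field = aspectClass.getDeclaredField("ASPECT_NAME");
            if (Modifier.isStatic(field.getModifiers()) && field.getType() == String.class) {
                field.setAccessible(true);
                return (String) field.get(null);
            }
        } catch (ReflectiveOperationException ignored) {
            // No usable field; fall through to the next tier.
        }
        // Tier 2: name taken from schema metadata (modelled here as a map lookup).
        String schemaName = SCHEMA_NAMES.get(aspectClass);
        if (schemaName != null) {
            return schemaName;
        }
        // Tier 3: camelCase fallback derived from the simple class name.
        String simple = aspectClass.getSimpleName();
        return Character.toLowerCase(simple.charAt(0)) + simple.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(resolve(Ownership.class));     // tier 1
        System.out.println(resolve(DashboardInfo.class)); // tier 2
        System.out.println(resolve(GlobalTags.class));    // tier 3
    }
}
```

Ordering the tiers from most explicit to least explicit means the camelCase guess is used only when no declared name is available.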

Usage

Use EntityClient as the primary interface for all entity operations in the DataHub Java SDK V2. Access it via client.entities() from a DataHubClientV2 instance. It handles the complexity of patch transformation, version conflict resolution, and ordered emission behind a simple upsert(entity) API.

Code Reference

Source Location

Signature

@Slf4j
public class EntityClient {

    // Constructor (not intended for direct use; obtain an instance via DataHubClientV2.entities())
    public EntityClient(RestEmitter emitter, DataHubClientConfigV2 config, DataHubClientV2 client);

    // Write operations
    public <T extends Entity> void upsert(T entity)
        throws IOException, ExecutionException, InterruptedException;
    public void delete(String urn);                     // Not yet implemented
    public boolean exists(String urn);                  // Not yet implemented

    // Read operations
    public <T extends Entity> T get(String urn, Class<T> entityClass)
        throws IOException, ExecutionException, InterruptedException;
    public <T extends Entity> T get(String urn, Class<T> entityClass,
        List<Class<? extends RecordTemplate>> aspects)
        throws IOException, ExecutionException, InterruptedException;
    public <T extends RecordTemplate> AspectWithMetadata<T> getAspect(Urn urn, Class<T> aspectClass)
        throws IOException, ExecutionException, InterruptedException;
}

Import

import datahub.client.v2.operations.EntityClient;

I/O Contract

Upsert Input

Parameter | Type | Description
entity | T extends Entity | Any SDK V2 entity (Dataset, Chart, Dashboard, etc.) with cached aspects and/or pending patches

Upsert Emission Order

Step | Source | MCP Type | Description
1 | entity.toMCPs() | Full UPSERT | Cached aspects from the builder (initial creation)
2 | entity.getPendingMCPs() | Full UPSERT | Full aspect replacements from set*() methods
3 | await all | -- | Wait for steps 1-2 to complete successfully
4 | entity.getPendingPatches() | PATCH or UPSERT | Accumulated patches, possibly transformed for server compatibility
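
The emission order above can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the EntityClient source: the Emitter interface is a hypothetical stand-in for RestEmitter, MCPs are modelled as strings, and patch transformation and retry are omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class UpsertOrderSketch {

    /** Hypothetical asynchronous emitter; not the real RestEmitter API. */
    interface Emitter {
        CompletableFuture<Void> emit(String mcp);
    }

    static void upsert(List<String> cachedAspects,
                       List<String> pendingMcps,
                       List<String> pendingPatches,
                       Emitter emitter) {
        List<CompletableFuture<Void>> fullWrites = new ArrayList<>();
        // Step 1: cached aspects from the builder, emitted as full UPSERTs.
        for (String mcp : cachedAspects) {
            fullWrites.add(emitter.emit(mcp));
        }
        // Step 2: pending MCPs from set*() calls, also full replacements.
        for (String mcp : pendingMcps) {
            fullWrites.add(emitter.emit(mcp));
        }
        // Step 3: wait for every full write before any patch is sent.
        CompletableFuture.allOf(fullWrites.toArray(new CompletableFuture[0])).join();
        // Step 4: patches go last, one at a time.
        for (String patch : pendingPatches) {
            emitter.emit(patch).join();
        }
    }

    /** Runs upsert() against a recording emitter and returns the emission log. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Emitter recording = mcp -> {
            log.add(mcp);
            return CompletableFuture.completedFuture(null);
        };
        upsert(List.of("UPSERT datasetProperties"),
               List.of("UPSERT ownership"),
               List.of("PATCH globalTags"),
               recording);
        return log;
    }

    public static void main(String[] args) {
        // prints [UPSERT datasetProperties, UPSERT ownership, PATCH globalTags]
        System.out.println(demo());
    }
}
```

Awaiting the full writes before step 4 ensures a patch is never applied to an aspect whose full replacement is still in flight.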

Get Outputs

Method | Return Type | Description
get(urn, class) | T extends Entity | Entity with default aspects loaded from server (read-only)
get(urn, class, aspects) | T extends Entity | Entity with specified aspects loaded from server (read-only)
getAspect(urn, class) | AspectWithMetadata<T> | Single aspect with system metadata and version for optimistic locking

Retry Behavior

Attempt | Backoff Delay | Action
1 | 0ms | Initial emit
2 | 100ms | Re-read + re-apply patch + emit
3 | 200ms | Re-read + re-apply patch + emit
4 | 400ms | Re-read + re-apply patch + emit
Fail | -- | Throw IOException
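
The retry schedule in the table can be sketched as a small loop. This is a minimal illustration, not the actual emitWithRetry() code: the Response and Emit types are hypothetical, the regular expression is one plausible way to match the documented message pattern, and the re-read and re-apply step is assumed to happen inside the Emit callback.

```java
import java.io.IOException;
import java.util.regex.Pattern;

public class EmitWithRetrySketch {

    // Assumed regex for the documented "Expected version X, actual version Y" message.
    static final Pattern VERSION_CONFLICT =
        Pattern.compile("Expected version \\S+, actual version \\S+");
    static final int MAX_RETRIES = 3;
    static final long INITIAL_BACKOFF_MS = 100;

    /** Hypothetical minimal HTTP response holder. */
    static final class Response {
        final int status;
        final String body;
        Response(int status, String body) { this.status = status; this.body = body; }
    }

    /** Hypothetical emit callback; on retries it would re-read and re-apply the patch. */
    interface Emit {
        Response call(int attempt) throws IOException;
    }

    static boolean isVersionConflict(Response r) {
        return r.status == 422 && r.body != null && VERSION_CONFLICT.matcher(r.body).find();
    }

    /** Returns the 1-based attempt number that succeeded. */
    static int emitWithRetry(Emit emit) throws IOException, InterruptedException {
        long backoffMs = INITIAL_BACKOFF_MS;
        for (int attempt = 1; ; attempt++) {
            Response r = emit.call(attempt);
            if (!isVersionConflict(r)) {
                if (r.status >= 400) {
                    throw new IOException("Emit failed with HTTP " + r.status);
                }
                return attempt;
            }
            if (attempt > MAX_RETRIES) {
                throw new IOException("Version conflict persisted after " + MAX_RETRIES + " retries");
            }
            Thread.sleep(backoffMs); // 100ms, then 200ms, then 400ms
            backoffMs *= 2;
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulate two conflicts followed by a success.
        int attempts = emitWithRetry(attempt -> attempt < 3
            ? new Response(422, "Expected version 1, actual version 2")
            : new Response(200, "ok"));
        System.out.println("succeeded on attempt " + attempts); // succeeded on attempt 3
    }
}
```

Only version conflicts are retried in this sketch; any other error status fails immediately, since re-reading the aspect would not help.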

Usage Examples

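// Note: upsert(), get(), and getAspect() throw the checked exceptions IOException,
// ExecutionException, and InterruptedException; handling is omitted here for brevity.

// Obtain EntityClient from DataHubClientV2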
DataHubClientV2 client = DataHubClientV2.builder()
    .server("http://localhost:8080")
    .token("your-token")
    .build();
EntityClient entities = client.entities();

// Upsert an entity
Dataset dataset = Dataset.builder()
    .platform("snowflake")
    .name("users")
    .build();
dataset.addTag("pii");
dataset.addOwner("urn:li:corpuser:admin", OwnershipType.TECHNICAL_OWNER);
entities.upsert(dataset);

// Read an entity with default aspects
Chart chart = entities.get("urn:li:chart:(looker,my_chart)", Chart.class);
String title = chart.getTitle();

// Read an entity with specific aspects
Dashboard dashboard = entities.get(
    "urn:li:dashboard:(looker,my_dashboard)",
    Dashboard.class,
    List.of(DashboardInfo.class, Ownership.class)
);

// Read a single aspect with version metadata
AspectWithMetadata<Ownership> ownership = entities.getAspect(
    Urn.createFromString("urn:li:dataset:(urn:li:dataPlatform:snowflake,db.table,PROD)"),
    Ownership.class
);
String version = ownership.getVersion();

// Entities returned by get() are read-only; obtain a mutable copy before modifying
Chart mutableChart = chart.mutable();
mutableChart.addTag("verified");
entities.upsert(mutableChart);
