Implementation:Datahub project Datahub HdfsPlatform
| Knowledge Sources | |
|---|---|
| Domains | OpenLineage_Integration, Platform_Detection |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Description
HdfsPlatform is a Java enum in the OpenLineage converter module that maps filesystem URI scheme prefixes to DataHub platform identifiers and provides a method to check whether a given prefix corresponds to a known filesystem platform.
The enum defines seven platform categories:
- S3 -- Amazon S3 (s3, s3a, s3n)
- GCS -- Google Cloud Storage (gs, gcs)
- ABFS -- Azure Blob File System (abfs, abfss)
- WASB -- Azure Blob Storage, legacy (wasb, wasbs)
- DBFS -- Databricks File System (dbfs)
- FILE -- Local filesystem (file)
- HDFS -- Hadoop Distributed File System (default, no specific prefixes)
Each enum constant holds a list of recognized URI prefixes and the corresponding DataHub platform string.
Usage
The OpenLineage dataset resolution logic uses HdfsPlatform to determine whether a URI prefix represents a known filesystem platform, primarily through the isFsPlatformPrefix method.
Code Reference
Source Location
metadata-integration/java/openlineage-converter/src/main/java/io/datahubproject/openlineage/dataset/HdfsPlatform.java
Signature
public enum HdfsPlatform {
    S3(Arrays.asList("s3", "s3a", "s3n"), "s3"),
    GCS(Arrays.asList("gs", "gcs"), "gcs"),
    ABFS(Arrays.asList("abfs", "abfss"), "abs"),
    WASB(Arrays.asList("wasb", "wasbs"), "abs"),
    DBFS(Collections.singletonList("dbfs"), "dbfs"),
    FILE(Collections.singletonList("file"), "file"),
    HDFS(Collections.emptyList(), "hdfs");

    public final List<String> prefixes;
    public final String platform;

    HdfsPlatform(List<String> prefixes, String platform)
    public static boolean isFsPlatformPrefix(String prefix)
}
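The signature above lists the constructor and isFsPlatformPrefix without bodies. The following is a minimal sketch of how they could be filled in, assuming a linear scan over the enum constants. The enum name FsPlatformSketch is a hypothetical stand-in so the example compiles on its own, and the method bodies are an assumption rather than the DataHub source.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for HdfsPlatform. The constants mirror the signature
// above; the constructor and method bodies are assumed, not copied from DataHub.
enum FsPlatformSketch {
  S3(Arrays.asList("s3", "s3a", "s3n"), "s3"),
  GCS(Arrays.asList("gs", "gcs"), "gcs"),
  ABFS(Arrays.asList("abfs", "abfss"), "abs"),
  WASB(Arrays.asList("wasb", "wasbs"), "abs"),
  DBFS(Collections.singletonList("dbfs"), "dbfs"),
  FILE(Collections.singletonList("file"), "file"),
  HDFS(Collections.emptyList(), "hdfs");

  public final List<String> prefixes;
  public final String platform;

  FsPlatformSketch(List<String> prefixes, String platform) {
    this.prefixes = prefixes;
    this.platform = platform;
  }

  // Returns true if any constant lists the given prefix (assumed linear scan).
  static boolean isFsPlatformPrefix(String prefix) {
    for (FsPlatformSketch candidate : values()) {
      if (candidate.prefixes.contains(prefix)) {
        return true;
      }
    }
    return false;
  }
}
```

Under this sketch, the string "hdfs" is not reported as a known prefix, because the HDFS constant carries an empty prefix list and serves only as the default.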
Import
import io.datahubproject.openlineage.dataset.HdfsPlatform;
I/O Contract
Inputs
| Method | Parameter | Type | Description |
|---|---|---|---|
| isFsPlatformPrefix | prefix | String | A URI scheme prefix (e.g., "s3", "gs", "abfss") |
Outputs
| Method | Return Type | Description |
|---|---|---|
| isFsPlatformPrefix | boolean | true if the prefix matches any known filesystem platform, false otherwise |
Prefix to Platform Mapping:
| URI Prefix(es) | DataHub Platform String |
|---|---|
| s3, s3a, s3n | "s3" |
| gs, gcs | "gcs" |
| abfs, abfss | "abs" |
| wasb, wasbs | "abs" |
| dbfs | "dbfs" |
| file | "file" |
| (no prefix match) | "hdfs" (default) |
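The mapping above, including the "hdfs" fallback, can be illustrated with a small lookup. This is a sketch only: the class PrefixMappingSketch and the method platformForPrefix are hypothetical names introduced for illustration and are not part of HdfsPlatform, which exposes only the members shown in the signature.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper illustrating the prefix-to-platform table.
// Not part of the DataHub API.
final class PrefixMappingSketch {
  private static final Map<String, String> PREFIX_TO_PLATFORM = new HashMap<>();

  static {
    register("s3", "s3", "s3a", "s3n");
    register("gcs", "gs", "gcs");
    register("abs", "abfs", "abfss");
    register("abs", "wasb", "wasbs");
    register("dbfs", "dbfs");
    register("file", "file");
  }

  // Maps every listed prefix to the given platform string.
  private static void register(String platform, String... prefixes) {
    for (String prefix : prefixes) {
      PREFIX_TO_PLATFORM.put(prefix, platform);
    }
  }

  // Returns the platform for a prefix, falling back to "hdfs" when nothing matches.
  static String platformForPrefix(String prefix) {
    return PREFIX_TO_PLATFORM.getOrDefault(prefix, "hdfs");
  }
}
```

Note that both the ABFS and WASB prefixes resolve to the same platform string, "abs", so the platform string alone does not identify which Azure scheme was used.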
Usage Examples
// Check if a prefix is a known filesystem platform
boolean isKnown = HdfsPlatform.isFsPlatformPrefix("s3a"); // true
boolean isUnknown = HdfsPlatform.isFsPlatformPrefix("ftp"); // false
// Access platform properties
String platform = HdfsPlatform.S3.platform; // "s3"
List<String> prefixes = HdfsPlatform.S3.prefixes; // ["s3", "s3a", "s3n"]
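Callers usually start from a full URI rather than a bare prefix. The sketch below shows one way the scheme could be extracted with java.net.URI before the check. SchemeCheckSketch and hasKnownFsScheme are hypothetical names, and the local prefix set stands in for HdfsPlatform.isFsPlatformPrefix so the example is self-contained. How the actual dataset resolution logic parses URIs is not shown in this page.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; in real code the membership test would be
// HdfsPlatform.isFsPlatformPrefix(scheme).
final class SchemeCheckSketch {
  // Prefixes copied from the prefix-to-platform mapping on this page.
  private static final Set<String> KNOWN_PREFIXES = new HashSet<>(Arrays.asList(
      "s3", "s3a", "s3n", "gs", "gcs", "abfs", "abfss", "wasb", "wasbs", "dbfs", "file"));

  // Returns true when the URI has a scheme that is a known filesystem prefix.
  // URI.create throws IllegalArgumentException for syntactically invalid input.
  static boolean hasKnownFsScheme(String uri) {
    String scheme = URI.create(uri).getScheme(); // null when the URI has no scheme
    return scheme != null && KNOWN_PREFIXES.contains(scheme);
  }
}
```

A path without a scheme, such as /tmp/data, yields a null scheme and is therefore treated as having no known prefix in this sketch.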
Related Pages
- Datahub_project_Datahub_HdfsPathDataset -- Uses HdfsPlatform for dataset path resolution and platform detection