Implementation:NVIDIA DALI URI Parser
| Knowledge Sources | |
|---|---|
| Domains | Utilities, File_IO |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Defines the URI class for parsing Uniform Resource Identifier strings into their component parts: scheme, authority, path, query, and fragment.
Description
The URI parser in dali/util/uri.h implements a standards-aware URI parser as the URI class within the dali namespace. The class stores the original URI string along with start and end offsets for each of the five standard URI components: scheme, authority, path, query, and fragment. Parsing is performed by the static factory method URI::Parse, which accepts the URI string and an optional ParseOpts bitmask to control parsing behavior.
The ParseOpts enum currently defines Default (strict parsing) and AllowNonEscaped (a permissive mode that accepts unescaped characters such as spaces, useful for pre-validation before percent-encoding). The parsed state includes a valid_ flag and an err_msg_ string that describes any parsing failure. The component accessors (scheme(), authority(), path(), query(), fragment()) return std::string_view views into the stored URI string and enforce validity on access. The composite accessors scheme_authority(), scheme_authority_path(), and path_and_query() return combined views for common access patterns.
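The bitmask style of ParseOpts can be illustrated with a small standalone sketch. The enum below copies only the shape shown in the Signature section; `HasOpt` and `SpacesAcceptable` are hypothetical helpers invented for illustration and are not part of DALI, and the real parser may test its options differently.

```cpp
#include <cstdint>
#include <string_view>

// Same shape as the enum in the Signature section; redeclared here so the
// sketch compiles on its own.
enum ParseOpts : std::uint32_t {
  Default = 0,
  AllowNonEscaped = 1 << 0,
};

// Hypothetical helper: true when `flag` is set in `opts`.
inline bool HasOpt(ParseOpts opts, ParseOpts flag) {
  return (static_cast<std::uint32_t>(opts) &
          static_cast<std::uint32_t>(flag)) != 0;
}

// Hypothetical check: text containing a space is acceptable only when the
// permissive flag is set.
inline bool SpacesAcceptable(std::string_view text, ParseOpts opts) {
  const bool has_space = text.find(' ') != std::string_view::npos;
  return !has_space || HasOpt(opts, AllowNonEscaped);
}
```

Because Default is 0, it never matches any flag test, which is what makes it behave as the strict mode in this sketch.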
The class uses offset-based storage rather than substring copies, so repeated component access needs no additional memory allocation. Every component accessor calls enforce_valid() internally and throws a std::runtime_error with a descriptive message if the URI was not successfully parsed. Because the returned views point into the string owned by the URI object, they remain usable only while that object is alive.
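A minimal sketch of the offset-based idea follows, assuming a much simpler rule than the real parser: the scheme is whatever precedes the first ':'. The class name `OffsetView` and its parsing rule are hypothetical; only the pattern (one owned string, stored offsets, views built on access, a validity check that throws) mirrors the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical illustration type; not the dali::URI implementation.
class OffsetView {
 public:
  // Stores the string once and records where the scheme ends.
  explicit OffsetView(std::string text) : text_(std::move(text)) {
    const std::size_t colon = text_.find(':');
    if (colon == std::string::npos || colon == 0) {
      err_msg_ = "missing scheme in \"" + text_ + "\"";
      return;  // valid_ stays false
    }
    scheme_start_ = 0;
    scheme_end_ = static_cast<std::ptrdiff_t>(colon);
    valid_ = true;
  }

  bool valid() const { return valid_; }

  // Builds a view into text_ from the stored offsets; no characters are copied.
  std::string_view scheme() const {
    enforce_valid();
    assert(scheme_start_ <= scheme_end_);
    return std::string_view(text_).substr(
        static_cast<std::size_t>(scheme_start_),
        static_cast<std::size_t>(scheme_end_ - scheme_start_));
  }

 private:
  // Throws if parsing failed, so callers never see a view built from
  // meaningless offsets.
  void enforce_valid() const {
    if (!valid_) throw std::runtime_error("invalid URI: " + err_msg_);
  }

  std::string text_;
  std::ptrdiff_t scheme_start_ = 0, scheme_end_ = 0;
  bool valid_ = false;
  std::string err_msg_;
};
```

Storing offsets instead of string_view members keeps the object safe to copy or move, since a view is only created from the current string at the moment an accessor is called.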
Usage
Use the URI class when you need to decompose a URI string into its standard components for routing, protocol selection, or path extraction. DALI's file I/O subsystem uses it internally to determine the storage backend (local filesystem, S3, GCS, etc.) from the URI scheme.
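Scheme-based backend selection might look roughly like the sketch below. The `Backend` enum, the `SelectBackend` function, and the scheme strings it matches (including "gs" for GCS and treating an empty scheme as a local path) are assumptions made for illustration; DALI's actual dispatch logic is not shown in this page and may differ.

```cpp
#include <string_view>

// Hypothetical backend identifiers; not taken from DALI.
enum class Backend { Local, S3, GCS, Unknown };

// Hypothetical dispatch on a scheme string such as the one returned by
// URI::scheme(). An empty scheme is treated here as a plain local path.
inline Backend SelectBackend(std::string_view scheme) {
  if (scheme.empty() || scheme == "file") return Backend::Local;
  if (scheme == "s3") return Backend::S3;
  if (scheme == "gs") return Backend::GCS;
  return Backend::Unknown;
}
```

Taking the scheme as a std::string_view lets the accessor result be passed straight through without copying.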
Code Reference
Source Location
- Repository: NVIDIA_DALI
- File: dali/util/uri.h
- Lines: 1-107
Signature
namespace dali {

class URI {
 public:
  enum ParseOpts : uint32_t {
    Default = 0,
    AllowNonEscaped = 1 << 0,
  };

  static DLL_PUBLIC URI Parse(std::string uri, ParseOpts opts = ParseOpts::Default);

  bool valid() const;
  std::string_view scheme() const;
  std::string_view authority() const;
  std::string_view scheme_authority() const;
  std::string_view path() const;
  std::string_view scheme_authority_path() const;
  std::string_view query() const;
  std::string_view path_and_query() const;
  std::string_view fragment() const;

 private:
  std::string uri_;
  std::ptrdiff_t scheme_start_, scheme_end_;
  std::ptrdiff_t authority_start_, authority_end_;
  std::ptrdiff_t path_start_, path_end_;
  std::ptrdiff_t query_start_, query_end_;
  std::ptrdiff_t fragment_start_, fragment_end_;
  bool valid_ = false;
  std::string err_msg_;
};

}  // namespace dali
Import
#include "dali/util/uri.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| uri | std::string | Yes (Parse) | URI string to parse (e.g., s3://bucket/path?query#frag) |
| opts | ParseOpts | No (Parse) | Parsing options bitmask; default is strict mode |
Outputs
| Name | Type | Description |
|---|---|---|
| return value (Parse) | URI | Parsed URI object with component accessors |
| valid() | bool | True if the URI was successfully parsed |
| scheme() | std::string_view | URI scheme (e.g., "s3", "https", "file") |
| authority() | std::string_view | URI authority component (e.g., hostname) |
| path() | std::string_view | URI path component |
| query() | std::string_view | URI query string (after '?') |
| fragment() | std::string_view | URI fragment (after '#') |
Usage Examples
Parsing an S3 URI
#include "dali/util/uri.h"
auto uri = dali::URI::Parse("s3://my-bucket/datasets/train/images/");
if (uri.valid()) {
  auto scheme = uri.scheme();        // "s3"
  auto authority = uri.authority();  // "my-bucket"
  auto path = uri.path();            // "/datasets/train/images/"
}
Parsing an HTTP URI with Query and Fragment
#include "dali/util/uri.h"
auto uri = dali::URI::Parse("https://example.com/api/data?format=json#section1");
if (uri.valid()) {
  auto scheme = uri.scheme();             // "https"
  auto authority = uri.authority();       // "example.com"
  auto path = uri.path();                 // "/api/data"
  auto query = uri.query();               // "format=json"
  auto fragment = uri.fragment();         // "section1"
  auto full_path = uri.path_and_query();  // "/api/data?format=json"
}
Using Permissive Parsing
#include "dali/util/uri.h"
// Allow unescaped characters for pre-validation
auto uri = dali::URI::Parse("file:///path/with spaces/file.txt",
                            dali::URI::AllowNonEscaped);
if (uri.valid()) {
  auto path = uri.path();  // "/path/with spaces/file.txt"
}
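Once a permissively parsed path has been accepted, a later step would typically percent-encode it. The sketch below shows one conservative way to do that, assuming the caller wants every byte other than RFC 3986 unreserved characters and '/' encoded. `PercentEncodePath` is a hypothetical helper written for this page, not a DALI function, and it deliberately encodes more than the RFC strictly requires for path segments.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, not part of DALI: percent-encodes every byte that is
// neither an RFC 3986 unreserved character nor '/'.
inline std::string PercentEncodePath(std::string_view path) {
  static constexpr char kHex[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(path.size());
  for (unsigned char c : path) {
    const bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                            (c >= '0' && c <= '9') || c == '-' || c == '.' ||
                            c == '_' || c == '~';
    if (unreserved || c == '/') {
      out.push_back(static_cast<char>(c));
    } else {
      // Emit '%' followed by the two uppercase hex digits of the byte.
      out.push_back('%');
      out.push_back(kHex[c >> 4]);
      out.push_back(kHex[c & 0x0F]);
    }
  }
  return out;
}
```

Note that an input which already contains percent-escapes would be encoded a second time by this sketch, so it is only suitable for raw, unescaped paths.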