Implementation:Lance format Lance Java OpenDatasetBuilder
| Knowledge Sources | |
|---|---|
| Domains | Java_SDK, Dataset_Management |
| Last Updated | 2026-02-08 19:33 GMT |
Overview
Description
OpenDatasetBuilder provides a fluent builder API for opening Lance datasets. It supports two mutually exclusive modes of dataset access: direct URI-based opening and namespace-based opening. When using a namespace, the builder automatically fetches table location and storage options via the LanceNamespace.describeTable() call. If no BufferAllocator is provided, the builder creates a self-managed RootAllocator with maximum capacity.
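The namespace-based lookup described above can be illustrated with a minimal, self-contained sketch. The `TableCatalog` interface and `TableDescription` class are hypothetical stand-ins for `LanceNamespace` and its `describeTable()` response, and the exception message is invented for illustration; the real Lance types and messages may differ. The sketch rejects both a null response and a null or empty location, which is one reading of the failure condition documented for the builder.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: these types are hypothetical stand-ins, not the Lance API.
class NamespaceResolutionSketch {

  /** Hypothetical shape of a describeTable() result: a location plus storage options. */
  static final class TableDescription {
    final String location;
    final Map<String, String> storageOptions;

    TableDescription(String location, Map<String, String> storageOptions) {
      this.location = location;
      this.storageOptions = storageOptions;
    }
  }

  /** Hypothetical stand-in for LanceNamespace; the real interface may differ. */
  interface TableCatalog {
    TableDescription describeTable(List<String> tableId);
  }

  /** Resolves a table id to a description with a non-empty location, or throws. */
  static TableDescription resolve(TableCatalog catalog, List<String> tableId) {
    TableDescription description = catalog.describeTable(tableId);
    if (description == null
        || description.location == null
        || description.location.isEmpty()) {
      throw new IllegalArgumentException(
          "Namespace returned no location for table " + tableId);
    }
    return description;
  }

  public static void main(String[] args) {
    // A toy catalog that maps every table id to a path under one bucket.
    TableCatalog catalog = id -> new TableDescription(
        "s3://bucket/" + String.join("/", id) + ".lance",
        Collections.singletonMap("region", "us-east-1"));

    TableDescription resolved = resolve(catalog, Arrays.asList("my_table"));
    System.out.println(resolved.location + " " + resolved.storageOptions);
  }
}
```

How the fetched storage options are combined with any options already present in ReadOptions is not specified here, so the sketch simply returns them alongside the location.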
Usage
The constructor is package-private, so instances are obtained through Dataset.open(). The builder validates that exactly one access mode is configured, either a URI or a namespace together with a tableId, and throws IllegalArgumentException with a descriptive message otherwise. The build() method returns a fully initialized Dataset instance.
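The validation rule described above can be sketched in isolation. The following is a minimal illustration, not the Lance source: the class name, the `Mode` enum, the error messages, and the order in which the checks run are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: mirrors the documented rules, not the actual Lance implementation.
class OpenModeValidationSketch {

  /** Hypothetical label for the two mutually exclusive access modes. */
  enum Mode { URI, NAMESPACE }

  /** Returns the selected mode, or throws if the configuration is invalid. */
  static Mode validate(String uri, Object namespace, List<String> tableId) {
    boolean hasUri = uri != null;
    boolean hasNamespace = namespace != null;
    boolean hasTableId = tableId != null;

    // namespace and tableId only make sense as a pair (check order is an assumption).
    if (hasNamespace != hasTableId) {
      throw new IllegalArgumentException("namespace and tableId must be set together");
    }
    if (hasUri && hasNamespace) {
      throw new IllegalArgumentException("Set either uri or namespace+tableId, not both");
    }
    if (!hasUri && !hasNamespace) {
      throw new IllegalArgumentException("Set either uri or namespace+tableId");
    }
    return hasUri ? Mode.URI : Mode.NAMESPACE;
  }

  public static void main(String[] args) {
    System.out.println(validate("s3://bucket/table.lance", null, null));
    System.out.println(validate(null, new Object(), Arrays.asList("my_table")));
    try {
      validate("s3://bucket/table.lance", new Object(), Arrays.asList("my_table"));
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Checking the namespace/tableId pairing first means a half-configured namespace is reported as such even when a URI is also present; the real builder may prioritize its messages differently.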
Code Reference
Source Location
java/src/main/java/org/lance/OpenDatasetBuilder.java
Signature
public class OpenDatasetBuilder {
    public OpenDatasetBuilder allocator(BufferAllocator allocator);
    public OpenDatasetBuilder uri(String uri);
    public OpenDatasetBuilder namespace(LanceNamespace namespace);
    public OpenDatasetBuilder tableId(List<String> tableId);
    public OpenDatasetBuilder readOptions(ReadOptions options);
    public Dataset build();
}
Import
import org.lance.OpenDatasetBuilder;
I/O Contract
| Method | Parameter Type | Description |
|---|---|---|
| allocator() | BufferAllocator | Sets the Arrow buffer allocator; if omitted, a RootAllocator is created |
| uri() | String | Sets the dataset URI (e.g., s3://bucket/table.lance); mutually exclusive with namespace+tableId |
| namespace() | LanceNamespace | Sets the namespace for location resolution; requires tableId |
| tableId() | List<String> | Sets the table identifier; requires namespace |
| readOptions() | ReadOptions | Sets read configuration (version, cache sizes, storage options) |

| Method | Return Type | Description |
|---|---|---|
| build() | Dataset | Opens and returns the dataset; throws IllegalArgumentException on invalid configuration |

| Exception | Condition |
|---|---|
| IllegalArgumentException | Both URI and namespace+tableId are specified |
| IllegalArgumentException | Neither URI nor namespace+tableId is specified |
| IllegalArgumentException | namespace is set without tableId, or vice versa |
| IllegalArgumentException | Namespace describeTable returns null or empty location |
Usage Examples
import org.lance.Dataset;
import org.lance.ReadOptions;
import org.apache.arrow.memory.RootAllocator;
import java.util.Arrays;

// Open a dataset by URI with default options
Dataset byUri = Dataset.open()
    .uri("s3://bucket/table.lance")
    .build();

// Open a dataset by URI with a custom allocator and read options
ReadOptions options = new ReadOptions.Builder()
    .setVersion(5)                                     // open dataset version 5
    .setIndexCacheSizeBytes(2L * 1024 * 1024 * 1024)   // 2 GiB index cache
    .build();
Dataset withOptions = Dataset.open()
    .allocator(new RootAllocator())
    .uri("file:///data/table.lance")
    .readOptions(options)
    .build();

// Open a dataset via namespace; myNamespace is an existing LanceNamespace instance
Dataset byNamespace = Dataset.open()
    .namespace(myNamespace)
    .tableId(Arrays.asList("my_table"))
    .build();
Related Pages
- Lance_format_Lance_Java_ReadOptions - Configuration options passed to the builder for controlling read behavior
- Lance_format_Lance_Java_Ref - Version references that can be used when setting read options