Implementation:Lance format Lance LanceCrateRoot
| Knowledge Sources | |
|---|---|
| Domains | Core, Infrastructure |
| Last Updated | 2026-02-08 19:33 GMT |
Overview
Description
The CrateRoot module (lib.rs) is the top-level entry point for the lance crate, the main Lance library. It provides:
- Crate documentation with examples for creating and scanning Lance datasets
- Module declarations exposing the public API surface: arrow, blob, datafusion, dataset, index, io, session, table, utils
- Key re-exports:
  - lance_core::datatypes, lance_core::Error, lance_core::Result
  - blob_field and BlobArrayBuilder from the blob module
  - Dataset from the dataset module
- Convenience function open_dataset for loading datasets from a URI
- DIST_FIELD -- A lazily initialized Arrow field for distance column metadata
- deps module -- Re-exports of arrow_array, arrow_schema, and datafusion to pin dependency versions for downstream users
The crate documentation describes Lance as providing 100x faster random access than Parquet, along with automatic versioning, Apache Arrow and DuckDB compatibility, and optimizations for computer vision, bioinformatics, spatial, and ML data.
Usage
This is the primary entry point for Rust consumers of the Lance format. Users typically start by calling Dataset::open or open_dataset and then use the scanner, index, or write APIs.
Code Reference
Source Location
rust/lance/src/lib.rs
Signature
pub use lance_core::datatypes;
pub use lance_core::{Error, Result};
pub use blob::{blob_field, BlobArrayBuilder};
pub use dataset::Dataset;
pub async fn open_dataset<T: AsRef<str>>(table_uri: T) -> Result<Dataset>;
pub static DIST_FIELD: LazyLock<arrow_schema::Field>;
pub mod deps {
pub use arrow_array;
pub use arrow_schema;
pub use datafusion;
}
Import
use lance::{Dataset, open_dataset, Error, Result};
use lance::blob::{blob_field, BlobArrayBuilder};
use lance::deps::{arrow_array, arrow_schema};
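The deps re-export pattern can be illustrated without the real crates. The following is a minimal sketch, assuming a std module as a stand-in for arrow_array, arrow_schema, and datafusion; the function count_names is hypothetical and not part of lance. It shows code reaching a dependency only through the deps path, so the crate that owns deps decides which version is in use.

```rust
/// Stand-in for `lance::deps`. In lance the re-exported items are
/// `arrow_array`, `arrow_schema`, and `datafusion`; a std module is
/// used here (an assumption) so the sketch compiles on its own.
pub mod deps {
    pub use std::collections;
}

/// Hypothetical consumer code: it names the dependency only through
/// `deps::`, never directly, so it always gets the re-exported version.
pub fn count_names(names: &[&str]) -> deps::collections::BTreeMap<String, u32> {
    let mut counts = deps::collections::BTreeMap::new();
    for name in names {
        // Increment the counter for this name, starting from zero.
        *counts.entry(name.to_string()).or_insert(0) += 1;
    }
    counts
}
```

In a real downstream crate the consumer would live in a separate package and write use lance::deps::arrow_array; instead of declaring its own arrow_array dependency.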
I/O Contract
Inputs
| Parameter | Type | Description |
|---|---|---|
| table_uri | T: AsRef<str> | URI or file path to a Lance dataset (supports local, S3, GCS, Azure) |
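The T: AsRef<str> bound means callers can pass a string literal, a String, or a reference to either. The following is a minimal sketch of that bound using a hypothetical helper, uri_scheme, which is not part of lance and does not reflect how Lance itself parses or resolves URIs.

```rust
/// Hypothetical helper (not a lance API): takes the same kinds of
/// arguments as `open_dataset` thanks to the `T: AsRef<str>` bound.
/// Returns the text before "://", or `None` for a plain file path.
pub fn uri_scheme<T: AsRef<str>>(table_uri: T) -> Option<String> {
    // Borrow the argument as `&str` regardless of its concrete type.
    let uri: &str = table_uri.as_ref();
    uri.split_once("://").map(|(scheme, _)| scheme.to_string())
}
```

For instance, uri_scheme("s3://bucket/data.lance") and uri_scheme(String::from("/tmp/my_dataset.lance")) both compile against the same signature, which is the convenience the bound on open_dataset provides.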
Outputs
| Type | Description |
|---|---|
| Result<Dataset> | An opened Lance Dataset ready for scanning, indexing, or writing |
| DIST_FIELD | A static Field::new("_distance", Float32, true) for distance column metadata |
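The lazily initialized static behind DIST_FIELD follows the standard std::sync::LazyLock pattern. Below is a minimal sketch of that pattern, assuming a simplified stand-in Field struct in place of arrow_schema::Field and a string in place of the Arrow DataType; only the field name, type name, and nullability are taken from the table above.

```rust
use std::sync::LazyLock;

/// Simplified stand-in for `arrow_schema::Field` (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: &'static str,
    pub nullable: bool,
}

impl Field {
    pub fn new(name: &str, data_type: &'static str, nullable: bool) -> Self {
        Self { name: name.to_string(), data_type, nullable }
    }
}

/// Constructed on first access, then shared for the rest of the program.
/// A plain `static` could not hold this value because building the
/// `String` requires a heap allocation at runtime.
pub static DIST_FIELD: LazyLock<Field> =
    LazyLock::new(|| Field::new("_distance", "Float32", true));
```

Dereferencing the static (for example DIST_FIELD.name or DIST_FIELD.clone()) triggers initialization once; later accesses reuse the same value.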
Usage Examples
use std::sync::Arc;

use arrow_array::{RecordBatch, RecordBatchIterator};
use arrow_schema::{DataType, Field, Schema};
use lance::{dataset::WriteParams, Dataset};

// The calls below are async, so they need a runtime; tokio is assumed here.
#[tokio::main]
async fn main() -> lance::Result<()> {
    // Create a dataset with a single non-nullable Int64 column
    let schema = Arc::new(Schema::new(vec![
        Field::new("id", DataType::Int64, false),
    ]));
    let batches = vec![RecordBatch::new_empty(schema.clone())];
    let reader = RecordBatchIterator::new(batches.into_iter().map(Ok), schema);
    Dataset::write(reader, "/tmp/my_dataset.lance", Some(WriteParams::default())).await?;

    // Open and scan
    let dataset = lance::open_dataset("/tmp/my_dataset.lance").await?;
    // Build a scanner; configure it and consume it to read record batches
    let _scanner = dataset.scan();
    Ok(())
}
Related Pages
- Lance_format_Lance_BlobBuilder -- Blob v2 builder re-exported from this crate root
- Lance_format_Lance_LanceTableProvider -- DataFusion integration module
- Lance_format_Lance_SessionCaches -- Session caching infrastructure
- Lance_format_Lance_LqCli -- CLI binary built from the same crate