Implementation:Lance format Lance PrimitiveBlobDecoding
| Knowledge Sources | |
|---|---|
| Domains | Encoding, Compression |
| Last Updated | 2026-02-08 19:33 GMT |
Overview
BlobDescriptionPageScheduler and BlobPageScheduler are decoding components that schedule and decode blob data stored out-of-line in Lance files, using inline descriptors (position, size) to locate blob payloads.
Description
This module contains the decoding side of blob encoding. The blob structural encoding stores actual blob values out-of-line in the file, with only descriptors (position and size as u64 pairs) stored in the page. The decoding pipeline has two key components:
- BlobDescriptionPageScheduler: Wraps an inner page scheduler to decode the descriptor struct into position/size pairs and rep/def information. It extracts position and size from the decoded struct data block and produces an intermediate decoded page.
- BlobPageScheduler: Uses the decoded descriptions to schedule I/O reads for the actual blob data from external file positions. It groups blob reads into shards of approximately 32 MiB (TARGET_SHARD_SIZE) to manage memory usage, then assembles the results into VariableWidthBlock data.
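The descriptor extraction and shard grouping described above can be sketched as follows. This is a minimal illustration, not the code in blob.rs: the BlobDescriptor type, the descriptors_from_pairs and group_into_shards helpers, the interleaved (position, size) layout, and the greedy grouping rule are all assumptions made for the example. Only the TARGET_SHARD_SIZE value comes from the source.

```rust
/// Target shard size in bytes (32 MiB), the value given in the text above.
const TARGET_SHARD_SIZE: u64 = 32 * 1024 * 1024;

/// Hypothetical descriptor: where one blob lives in the file and how long it is.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlobDescriptor {
    position: u64,
    size: u64,
}

/// Assumed layout for illustration: descriptors arrive as interleaved
/// (position, size) u64 pairs. A trailing unpaired value is ignored.
fn descriptors_from_pairs(values: &[u64]) -> Vec<BlobDescriptor> {
    values
        .chunks_exact(2)
        .map(|pair| BlobDescriptor { position: pair[0], size: pair[1] })
        .collect()
}

/// Greedily groups consecutive descriptors into shards. A shard is closed
/// when adding the next blob would push it past `target` bytes. A single
/// blob larger than `target` still gets a shard of its own.
fn group_into_shards(descs: &[BlobDescriptor], target: u64) -> Vec<Vec<BlobDescriptor>> {
    let mut shards = Vec::new();
    let mut current: Vec<BlobDescriptor> = Vec::new();
    let mut current_bytes = 0u64;
    for &d in descs {
        if !current.is_empty() && current_bytes.saturating_add(d.size) > target {
            shards.push(std::mem::take(&mut current));
            current_bytes = 0;
        }
        current_bytes = current_bytes.saturating_add(d.size);
        current.push(d);
    }
    if !current.is_empty() {
        shards.push(current);
    }
    shards
}
```

Under these assumptions, calling group_into_shards(&descs, TARGET_SHARD_SIZE) bounds the bytes held in memory per shard to roughly the target, except when a single blob is itself larger than the target.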
Usage
These components are used internally by the decoding framework when reading blob columns. They are not typically instantiated directly by users.
Code Reference
| Source Location | Repository: lance-format/lance, File: rust/lance-encoding/src/encodings/logical/primitive/blob.rs, Lines: 1-527 |
|---|---|
| Signature |
pub(super) struct BlobDescriptionPageScheduler {
    inner_scheduler: Box<dyn StructuralPageScheduler>,
    def_meaning: Arc<[DefinitionInterpretation]>,
}

impl BlobDescriptionPageScheduler {
    pub fn new(
        inner_scheduler: Box<dyn StructuralPageScheduler>,
        def_meaning: Arc<[DefinitionInterpretation]>,
    ) -> Self;
}

pub const TARGET_SHARD_SIZE: u64 = 32 * 1024 * 1024;
|
| Import | use lance_encoding::encodings::logical::primitive::blob::BlobDescriptionPageScheduler; (crate-internal) |
I/O Contract
| Direction | Type | Description |
|---|---|---|
| Input | Encoded page with blob descriptors | Struct data block containing position (u64) and size (u64) per blob |
| Input | EncodingsIo | I/O interface for reading blob data from external file positions |
| Output | DecodedPage | Intermediate decoded descriptions with positions and sizes |
| Output | DataBlock::VariableWidth | Final blob data assembled into a variable-width block |
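To illustrate the final output row, the sketch below assembles blob payloads into an offsets-plus-bytes layout, which is the usual shape of a variable-width block. It is a simplified model under stated assumptions: VariableWidthSketch and assemble_blobs are hypothetical names, an in-memory byte slice stands in for the EncodingsIo reads, and the real DataBlock::VariableWidth type may differ in its fields.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a variable-width block: `offsets` has one more
/// entry than there are values, and value `i` occupies
/// `data[offsets[i]..offsets[i + 1]]`.
#[derive(Debug, PartialEq, Eq)]
struct VariableWidthSketch {
    offsets: Vec<u64>,
    data: Vec<u8>,
}

/// Reads each (position, size) descriptor from `file` and concatenates the
/// payloads. Returns `None` if any descriptor points outside the file.
fn assemble_blobs(file: &[u8], descriptors: &[(u64, u64)]) -> Option<VariableWidthSketch> {
    let mut offsets = Vec::with_capacity(descriptors.len() + 1);
    let mut data = Vec::new();
    offsets.push(0u64);
    for &(position, size) in descriptors {
        let start = usize::try_from(position).ok()?;
        let len = usize::try_from(size).ok()?;
        let end = start.checked_add(len)?;
        // `get` performs the bounds check instead of panicking.
        let bytes = file.get(start..end)?;
        data.extend_from_slice(bytes);
        offsets.push(data.len() as u64);
    }
    Some(VariableWidthSketch { offsets, data })
}
```

In this model a zero-size descriptor contributes a repeated offset and no bytes, so empty blobs remain addressable by index.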
Usage Examples
// BlobDescriptionPageScheduler is used internally by the primitive field scheduler
// when it encounters a blob page layout:
//
// let scheduler = BlobDescriptionPageScheduler::new(
// inner_page_scheduler,
// def_meaning,
// );
// let tasks = scheduler.schedule_ranges(&ranges, &io)?;
Related Pages
- Lance_format_Lance_BlobEncoding - Encoding side of blob data
- Lance_format_Lance_PrimitiveEncoding - Parent primitive encoding framework
- Lance_format_Lance_PackedEncoding - Packed struct used for descriptors