Implementation:Lance format Lance PrimitiveBlobDecoding

From Leeroopedia


Knowledge Sources
Domains Encoding, Compression
Last Updated 2026-02-08 19:33 GMT

Overview

BlobDescriptionPageScheduler and BlobPageScheduler are decoding components that schedule and decode blob data stored out-of-line in Lance files, using inline descriptors (position, size) to locate blob payloads.

Description

This module contains the decoding side of blob encoding. The blob structural encoding stores actual blob values out-of-line in the file, with only descriptors (position and size as u64 pairs) stored in the page. The decoding pipeline has two key components:

  • BlobDescriptionPageScheduler: Wraps an inner page scheduler to decode the descriptor struct into position/size pairs and rep/def information. It extracts position and size from the decoded struct data block and produces an intermediate decoded page.
  • BlobPageScheduler: Uses the decoded descriptions to schedule I/O reads for the actual blob data from external file positions. It groups blob reads into shards of approximately 32 MiB (TARGET_SHARD_SIZE) to manage memory usage, then assembles the results into VariableWidthBlock data.
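The sharding step described above can be sketched as follows. This is a minimal illustration, not the code in blob.rs: the `BlobDescriptor` type, the `byte_range` helper, the `group_into_shards` function, and the greedy "close the shard once it reaches the target" policy are all assumptions made for this example. Only the 32 MiB constant comes from the source.

```rust
use std::ops::Range;

/// Mirrors the TARGET_SHARD_SIZE constant from the source (32 MiB).
const TARGET_SHARD_SIZE: u64 = 32 * 1024 * 1024;

/// Hypothetical descriptor type: where a blob starts in the file and its length.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlobDescriptor {
    position: u64,
    size: u64,
}

impl BlobDescriptor {
    /// Byte range that an I/O layer would be asked to read for this blob.
    fn byte_range(&self) -> Range<u64> {
        self.position..self.position + self.size
    }
}

/// Greedily groups descriptors into shards, preserving order.
/// A shard is closed as soon as its accumulated size reaches `target`,
/// so shards are only approximately `target` bytes (assumed policy).
fn group_into_shards(descs: &[BlobDescriptor], target: u64) -> Vec<Vec<BlobDescriptor>> {
    let mut shards = Vec::new();
    let mut current = Vec::new();
    let mut current_bytes = 0u64;
    for d in descs {
        current.push(*d);
        current_bytes += d.size;
        if current_bytes >= target {
            // Hand off the finished shard and start a fresh one.
            shards.push(std::mem::take(&mut current));
            current_bytes = 0;
        }
    }
    if !current.is_empty() {
        shards.push(current);
    }
    shards
}
```

Bounding each shard this way caps how much blob data is held in memory per batch of reads, which is the stated purpose of TARGET_SHARD_SIZE. A single blob larger than the target still ends up in a shard of its own size under this sketch.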

Usage

These components are used internally by the decoding framework when reading blob columns. They are not typically instantiated directly by users.

Code Reference

Source Location Repository: lance-format/lance, File: rust/lance-encoding/src/encodings/logical/primitive/blob.rs, Lines: 1-527
Signature
pub(super) struct BlobDescriptionPageScheduler {
    inner_scheduler: Box<dyn StructuralPageScheduler>,
    def_meaning: Arc<[DefinitionInterpretation]>,
}

impl BlobDescriptionPageScheduler {
    pub fn new(
        inner_scheduler: Box<dyn StructuralPageScheduler>,
        def_meaning: Arc<[DefinitionInterpretation]>,
    ) -> Self;
}

pub const TARGET_SHARD_SIZE: u64 = 32 * 1024 * 1024;
Import: use lance_encoding::encodings::logical::primitive::blob::BlobDescriptionPageScheduler; (crate-internal: the struct is declared pub(super), so it is visible only within the parent primitive module)

I/O Contract

Direction | Type | Description
Input | Encoded page with blob descriptors | Struct data block containing position (u64) and size (u64) per blob
Input | EncodingsIo | I/O interface for reading blob data from external file positions
Output | DecodedPage | Intermediate decoded descriptions with positions and sizes
Output | DataBlock::VariableWidth | Final blob data assembled into a variable-width block
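To make the final output row concrete, the sketch below shows one common way to assemble fetched blob payloads into a variable-width layout: a contiguous byte buffer plus an offsets array with one more entry than there are values. The `VariableWidth` struct and `assemble` function are hypothetical stand-ins for illustration. They are not the `DataBlock::VariableWidth` type from lance-encoding, whose exact fields are not shown in this page.

```rust
/// Hypothetical stand-in for a variable-width block:
/// value i occupies data[offsets[i]..offsets[i + 1]].
#[derive(Debug, PartialEq, Eq)]
struct VariableWidth {
    offsets: Vec<u64>, // length is num_values + 1, starting at 0
    data: Vec<u8>,     // all payloads concatenated in order
}

/// Concatenates payloads (in descriptor order) and records their boundaries.
fn assemble(payloads: &[Vec<u8>]) -> VariableWidth {
    let total: usize = payloads.iter().map(|p| p.len()).sum();
    let mut offsets = Vec::with_capacity(payloads.len() + 1);
    let mut data = Vec::with_capacity(total);
    offsets.push(0);
    for p in payloads {
        data.extend_from_slice(p);
        // The running end of the buffer is the end offset of this value.
        offsets.push(data.len() as u64);
    }
    VariableWidth { offsets, data }
}
```

An empty blob simply repeats the previous offset, so zero-length values need no special case in this layout.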

Usage Examples

// BlobDescriptionPageScheduler is used internally by the primitive field scheduler
// when it encounters a blob page layout:
//
// let scheduler = BlobDescriptionPageScheduler::new(
//     inner_page_scheduler,
//     def_meaning,
// );
// let tasks = scheduler.schedule_ranges(&ranges, &io)?;
