Implementation:Lance format Lance LegacyDecoder
| Knowledge Sources | |
|---|---|
| Domains | Encoding, Legacy_Format |
| Last Updated | 2026-02-08 19:33 GMT |
Overview
The legacy decoder module defines the core traits for scheduling I/O and decoding data written in the Lance v2.0 file format.
Description
⚠️ DEPRECATED: This is legacy code from the Lance v1/v2.0 format, retained only for backward compatibility. See Lance_format_Lance_Warning_Deprecated_Legacy_Encodings.
This module provides the foundational decoding infrastructure for the legacy (v2.0) Lance file format. It defines three primary traits: FieldScheduler for scheduling I/O at the field level, SchedulingJob for tracking the progress of scheduled I/O operations, and LogicalPageDecoder for decoding loaded data into Arrow arrays. The FieldScheduler is stateless and must be Send + Sync since it may be shared across tasks (e.g., list pages share item schedulers). The LogicalPageDecoder is stateful and only Send, managing the lifecycle of loaded data from waiting for I/O completion through draining decoded rows. The module also defines DecoderReady, which pairs a decoder with a path describing its position in the schema hierarchy.
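The split between a shared, stateless scheduler and an owned, stateful decoder can be illustrated with a minimal standard-library sketch. Everything below (`ToyScheduler`, `ToyDecoder`, `ItemScheduler`, `CountingDecoder`, `share_and_move`) is a hypothetical stand-in invented for illustration and is not part of lance_encoding; it only mirrors the `Send + Sync` versus `Send` bounds described above.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for `FieldScheduler`: read-only after construction,
/// so every method takes `&self` and the type can be `Send + Sync`.
trait ToyScheduler: Send + Sync {
    fn num_rows(&self) -> u64;
}

/// Hypothetical stand-in for `LogicalPageDecoder`: it mutates its own state
/// while draining, so it takes `&mut self` and only needs to be `Send`.
trait ToyDecoder: Send {
    fn drain(&mut self, num_rows: u64) -> u64;
    fn rows_drained(&self) -> u64;
}

struct ItemScheduler {
    rows: u64,
}

impl ToyScheduler for ItemScheduler {
    fn num_rows(&self) -> u64 {
        self.rows
    }
}

struct CountingDecoder {
    total: u64,
    drained: u64,
}

impl ToyDecoder for CountingDecoder {
    fn drain(&mut self, num_rows: u64) -> u64 {
        // Never hand out more rows than remain
        let taken = num_rows.min(self.total - self.drained);
        self.drained += taken;
        taken
    }

    fn rows_drained(&self) -> u64 {
        self.drained
    }
}

/// Shares one scheduler between two threads, then moves one decoder into a
/// single thread. Returns (sum of num_rows seen by both threads, rows drained).
fn share_and_move() -> (u64, u64) {
    let scheduler: Arc<dyn ToyScheduler> = Arc::new(ItemScheduler { rows: 10 });
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let shared = Arc::clone(&scheduler);
            thread::spawn(move || shared.num_rows())
        })
        .collect();
    let shared_total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();

    let mut decoder: Box<dyn ToyDecoder> = Box::new(CountingDecoder { total: 10, drained: 0 });
    let drained = thread::spawn(move || {
        decoder.drain(4);
        decoder.drain(100); // clamped to the 6 rows that remain
        decoder.rows_drained()
    })
    .join()
    .unwrap();

    (shared_total, drained)
}
```

Calling `share_and_move()` from a `main` function exercises both patterns: the scheduler is cloned by reference count and read concurrently, while the decoder has exactly one owner at a time.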
Usage
Use this module when reading Lance files written in the v2.0 format. A FieldScheduler is created per output field and calculates the necessary I/O. Calling schedule_ranges on it returns a SchedulingJob, and each call to schedule_next on that job emits LogicalPageDecoder instances in row-major order. Consumers call wait_for_loaded to ensure data is available, then drain to extract decoded Arrow arrays.
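The wait-then-drain lifecycle can be sketched with a simplified, synchronous model. This is an illustration under stated assumptions, not the module's implementation: `ToyPageDecoder` and `read_in_batches` are hypothetical names, a channel stands in for asynchronous I/O, and plain `Vec<i32>` batches stand in for Arrow arrays. The real `wait_for_loaded` returns a future, and the real `drain` returns a `NextDecodeTask`.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;

/// Hypothetical, synchronous stand-in for a `LogicalPageDecoder`.
struct ToyPageDecoder {
    incoming: Receiver<Vec<i32>>, // chunks arriving from simulated I/O
    loaded: Vec<i32>,             // rows buffered so far
    drained: usize,               // rows already handed to the consumer
    num_rows: usize,              // total rows this decoder will produce
}

impl ToyPageDecoder {
    /// Blocks until at least `loaded_need` rows are buffered.
    fn wait_for_loaded(&mut self, loaded_need: usize) -> Result<(), String> {
        while self.loaded.len() < loaded_need {
            let chunk = self
                .incoming
                .recv()
                .map_err(|_| "I/O ended before enough rows arrived".to_string())?;
            self.loaded.extend(chunk);
        }
        Ok(())
    }

    fn rows_loaded(&self) -> usize {
        self.loaded.len()
    }

    fn rows_drained(&self) -> usize {
        self.drained
    }

    /// Hands out the next `num_rows` rows; fails if they are not loaded yet.
    fn drain(&mut self, num_rows: usize) -> Result<Vec<i32>, String> {
        let end = self.drained + num_rows;
        if end > self.rows_loaded() {
            return Err(format!("need {} rows, only {} loaded", end, self.rows_loaded()));
        }
        let batch = self.loaded[self.drained..end].to_vec();
        self.drained = end;
        Ok(batch)
    }
}

/// Drains a 10-row decoder in batches of at most `batch_size` rows.
fn read_in_batches(batch_size: usize) -> Result<Vec<Vec<i32>>, String> {
    let (tx, rx) = channel();
    // Simulated I/O: three chunks totalling 10 rows
    let io = thread::spawn(move || {
        for chunk in vec![vec![0, 1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]] {
            if tx.send(chunk).is_err() {
                break; // consumer went away
            }
        }
    });
    let mut decoder = ToyPageDecoder { incoming: rx, loaded: Vec::new(), drained: 0, num_rows: 10 };
    let mut batches = Vec::new();
    while decoder.rows_drained() < decoder.num_rows {
        let take = batch_size.min(decoder.num_rows - decoder.rows_drained());
        // Wait first, so the drain below cannot fail for lack of data
        decoder.wait_for_loaded(decoder.rows_drained() + take)?;
        batches.push(decoder.drain(take)?);
    }
    io.join().unwrap();
    Ok(batches)
}
```

In this sketch `loaded_need` is counted from the start of the decoder, so the consumer passes `rows_drained() + take` rather than `take`. Whether the real trait interprets `loaded_need` the same way should be checked against the source.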
Code Reference
Source Location
rust/lance-encoding/src/previous/decoder.rs
Signature
pub trait SchedulingJob: std::fmt::Debug {
    fn schedule_next(
        &mut self,
        context: &mut SchedulerContext,
        priority: &dyn PriorityRange,
    ) -> Result<ScheduledScanLine>;
    fn num_rows(&self) -> u64;
}

pub trait FieldScheduler: Send + Sync + std::fmt::Debug {
    fn initialize<'a>(
        &'a self,
        filter: &'a FilterExpression,
        context: &'a SchedulerContext,
    ) -> BoxFuture<'a, Result<()>>;
    fn schedule_ranges<'a>(
        &'a self,
        ranges: &[Range<u64>],
        filter: &FilterExpression,
    ) -> Result<Box<dyn SchedulingJob + 'a>>;
    fn num_rows(&self) -> u64;
}

pub trait LogicalPageDecoder: std::fmt::Debug + Send {
    fn accept_child(&mut self, child: DecoderReady) -> Result<()>;
    fn wait_for_loaded(&mut self, loaded_need: u64) -> BoxFuture<'_, Result<()>>;
    fn rows_loaded(&self) -> u64;
    fn num_rows(&self) -> u64;
    fn rows_drained(&self) -> u64;
    fn drain(&mut self, num_rows: u64) -> Result<NextDecodeTask>;
    fn data_type(&self) -> &DataType;
}
Import
use lance_encoding::previous::decoder::{
    FieldScheduler, LogicalPageDecoder, SchedulingJob, DecoderReady,
};
I/O Contract
| Input | Type | Description |
|---|---|---|
| ranges | &[Range<u64>] | Row ranges to schedule for reading (ordered, non-overlapping) |
| filter | &FilterExpression | Filter expression for predicate pushdown |
| loaded_need | u64 | Minimum number of rows that must be loaded before decoding |
| num_rows | u64 | Number of rows to drain from the decoder |
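
The "ordered, non-overlapping" requirement on ranges can be checked before scheduling. The helper below is a hypothetical sketch (`check_ranges` is not part of lance_encoding), and it assumes that adjacent ranges such as 0..10 followed by 10..20 are acceptable. Whether the legacy schedulers validate their input this way is not stated in the source.

```rust
use std::ops::Range;

/// Hypothetical helper: verifies that `ranges` are well-formed, sorted, and
/// non-overlapping within a field of `num_rows` rows. Returns the total number
/// of rows requested.
fn check_ranges(ranges: &[Range<u64>], num_rows: u64) -> Result<u64, String> {
    for r in ranges {
        if r.start > r.end {
            return Err(format!("range {}..{} is reversed", r.start, r.end));
        }
        if r.end > num_rows {
            return Err(format!("range {}..{} exceeds {} rows", r.start, r.end, num_rows));
        }
    }
    // Each range must end at or before the start of the next one
    for pair in ranges.windows(2) {
        if pair[0].end > pair[1].start {
            return Err(format!(
                "range {}..{} overlaps or precedes {}..{}",
                pair[1].start, pair[1].end, pair[0].start, pair[0].end
            ));
        }
    }
    Ok(ranges.iter().map(|r| r.end - r.start).sum())
}
```

For the ranges used in the usage example on this page, `check_ranges(&[0..100, 200..300], 1000)` accepts the input and reports 200 requested rows.
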
| Output | Type | Description |
|---|---|---|
| scheduling_job | Box<dyn SchedulingJob> | Job that emits decoders as I/O completes |
| scan_line | ScheduledScanLine | Collection of decoder-ready messages with row counts |
| decode_task | NextDecodeTask | Task containing the decode operation and row count |
Usage Examples
use std::sync::Arc;
use lance_encoding::decoder::FilterExpression; // assumed import path
use lance_encoding::previous::decoder::{FieldScheduler, LogicalPageDecoder};

// Given a field scheduler for a column
let scheduler: Arc<dyn FieldScheduler> = /* obtained from file reader */;
// Schedule ranges for reading (ordered, non-overlapping)
let ranges = vec![0..100, 200..300];
let filter = FilterExpression::no_filter();
let mut job = scheduler.schedule_ranges(&ranges, &filter)?;
// Process scheduled scan lines; `context` (a SchedulerContext) and `priority`
// (a PriorityRange implementation) are assumed to be in scope
let scan_line = job.schedule_next(&mut context, &priority)?;
for message in scan_line.decoders {
    let decoder_ready = message.into_legacy();
    let mut decoder = decoder_ready.decoder;
    // Read the row count once so the mutable calls below do not also borrow `decoder`
    let num_rows = decoder.num_rows();
    // Wait for data to load
    decoder.wait_for_loaded(num_rows).await?;
    // Drain decoded data
    let task = decoder.drain(num_rows)?;
    let array = task.task.decode()?;
}
Related Pages
- Lance_format_Lance_LegacyEncoder - Corresponding legacy encoding infrastructure
- Lance_format_Lance_LegacyPrimitiveEncoding - Primitive field scheduler/decoder
- Lance_format_Lance_LegacyStructEncoding - Struct field scheduler/decoder
- Lance_format_Lance_LegacyListEncoding - List field scheduler/decoder
- Lance_format_Lance_LegacyBlobEncoding - Blob field scheduler/decoder
- Heuristic:Lance_format_Lance_Warning_Deprecated_Legacy_Encodings