Implementation:Lance format Lance EncodingsIo

From Leeroopedia
Revision as of 15:26, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Lance_format_Lance_EncodingsIo.md)


Knowledge Sources
Domains Encoding, Columnar_Data
Last Updated 2026-02-08 19:33 GMT

Overview

The EncodingsIo module defines the I/O service trait that abstracts file access for Lance encoders and decoders, along with an in-memory BufferScheduler implementation and the crate's top-level module declarations.

Description

This module (lib.rs) serves as the root of the lance-encoding crate. It re-exports the public modules and defines the core I/O abstraction that decouples encoding logic from storage backends.

EncodingsIo Trait:

EncodingsIo is the central I/O abstraction; each instance acts as the reader/scheduler for a single file. Key design points:

  • Single-file scope -- Each EncodingsIo instance is bound to one file.
  • Batch requests -- submit_request accepts a Vec<Range<u64>> so the implementation can coalesce I/O.
  • Priority-based scheduling -- The priority parameter should be set to the lowest row number that the request delivers data for. This lets indirect I/O (e.g., loading list offsets before list data) be prioritized correctly, so decoding can proceed as quickly as possible.
  • Empty range handling -- Implementations must handle empty ranges and return an empty Bytes for each.
  • Convenience method -- submit_single wraps submit_request for the common single-range case.
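
The priority rule can be illustrated with a small queue that always hands out the request with the lowest priority value first. This is a minimal sketch, assuming that a smaller priority (an earlier row number) means "more urgent" and that ties are served in submission order; RequestQueue is a hypothetical name and is not part of lance-encoding, and a real scheduler may order or batch requests differently.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};
use std::ops::Range;

/// Hypothetical queue of pending batch requests (not a lance-encoding type).
pub struct RequestQueue {
    // Reverse turns the max-heap into a min-heap on (priority, sequence).
    heap: BinaryHeap<Reverse<(u64, u64)>>,
    // Ranges for each queued request, keyed by its sequence number.
    pending: HashMap<u64, Vec<Range<u64>>>,
    next_seq: u64,
}

impl RequestQueue {
    pub fn new() -> Self {
        Self {
            heap: BinaryHeap::new(),
            pending: HashMap::new(),
            next_seq: 0,
        }
    }

    /// Queues a batch of byte ranges. `priority` is assumed to be the lowest
    /// row number that the requested bytes help deliver.
    pub fn push(&mut self, ranges: Vec<Range<u64>>, priority: u64) {
        let seq = self.next_seq;
        self.next_seq += 1;
        self.heap.push(Reverse((priority, seq)));
        self.pending.insert(seq, ranges);
    }

    /// Removes and returns the request with the smallest priority value;
    /// requests with equal priority come out in submission order.
    pub fn pop(&mut self) -> Option<(u64, Vec<Range<u64>>)> {
        let Reverse((priority, seq)) = self.heap.pop()?;
        let ranges = self.pending.remove(&seq)?;
        Some((priority, ranges))
    }
}
```

With this ordering, a request for list offsets tagged with row 0 is popped before a data request tagged with row 500, even if the data request was queued first.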

The trait is defined in lance-encoding itself so that the crate does not need to depend on lance-io, which keeps the dependency boundary clean.

BufferScheduler:

A simple in-memory implementation of EncodingsIo that serves data from a bytes::Bytes buffer. It slices the buffer according to requested ranges and returns immediately. This is used in tests and for in-memory file operations.
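
The relationship between the trait and its in-memory implementation can be sketched with a simplified, standard-library-only model. This is an assumption-laden stand-in, not the crate's code: the real trait returns a BoxFuture of bytes::Bytes, whereas the hypothetical RangeIo and InMemoryIo below are synchronous, copy into Vec<u8>, and use String as the error type so the example has no external dependencies.

```rust
use std::ops::Range;

/// Hypothetical synchronous stand-in for the shape of EncodingsIo.
pub trait RangeIo {
    /// Reads every range from the one file this reader is bound to and
    /// returns exactly one buffer per requested range, in request order.
    fn submit_request(
        &self,
        ranges: Vec<Range<u64>>,
        priority: u64,
    ) -> Result<Vec<Vec<u8>>, String>;

    /// Convenience wrapper for the common single-range case.
    fn submit_single(&self, range: Range<u64>, priority: u64) -> Result<Vec<u8>, String> {
        let mut buffers = self.submit_request(vec![range], priority)?;
        buffers
            .pop()
            .ok_or_else(|| "no buffer returned".to_string())
    }
}

/// Hypothetical in-memory reader that serves ranges by slicing a buffer.
pub struct InMemoryIo {
    data: Vec<u8>,
}

impl InMemoryIo {
    pub fn new(data: Vec<u8>) -> Self {
        Self { data }
    }
}

impl RangeIo for InMemoryIo {
    fn submit_request(
        &self,
        ranges: Vec<Range<u64>>,
        _priority: u64, // nothing to schedule: every read completes at once
    ) -> Result<Vec<Vec<u8>>, String> {
        ranges
            .into_iter()
            .map(|r| {
                // An empty range yields an empty buffer rather than an error.
                if r.is_empty() {
                    return Ok(Vec::new());
                }
                self.data
                    .get(r.start as usize..r.end as usize)
                    .map(|bytes| bytes.to_vec())
                    .ok_or_else(|| format!("range {}..{} is out of bounds", r.start, r.end))
            })
            .collect()
    }
}
```

Because callers only see the trait, the same decoding code could in principle be driven by this in-memory reader in tests and by a storage-backed reader elsewhere.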

Crate Modules:

The lib.rs declares the following public modules:

  • buffer, compression, compression_config, constants, data, decoder, encoder, encodings, format, previous, repdef, statistics, utils, version
  • testing (conditionally compiled under #[cfg(test)])

Platform Constraint:

The crate includes a compile-time assertion that only little-endian systems are supported. Big-endian support would require extensive testing of encoding correctness.
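
A compile-time platform guard of this kind is commonly written with cfg and compile_error!. The sketch below is illustrative only: the exact attribute and message used in lance-encoding are not shown on this page, and the two helper functions are hypothetical, included to show why native-order and little-endian reads only agree on little-endian hosts.

```rust
// Refuse to build on big-endian targets. The message text is illustrative.
#[cfg(not(target_endian = "little"))]
compile_error!("this sketch assumes a little-endian target");

/// Interprets four stored bytes as a u32 in the host's native byte order.
pub fn read_u32_native(bytes: [u8; 4]) -> u32 {
    u32::from_ne_bytes(bytes)
}

/// Interprets four stored bytes as a little-endian u32 on any host.
pub fn read_u32_le(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}
```

On a little-endian host both helpers return the same value for the same bytes; on a big-endian host they would differ, which is the class of bug such a guard rules out at build time.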

Usage

Use EncodingsIo when:

  • Implementing a new storage backend for Lance (S3, GCS, Azure, local filesystem)
  • Writing tests that need simulated file I/O via BufferScheduler
  • Building custom file readers that feed data into Lance decoders

Code Reference

Source Location: rust/lance-encoding/src/lib.rs
Key Trait: EncodingsIo
Key Struct: BufferScheduler
Import: use lance_encoding::{EncodingsIo, BufferScheduler};

I/O Contract

EncodingsIo Trait:

Method | Input | Output | Description
submit_request | Vec<Range<u64>>, u64 (priority) | BoxFuture<'static, Result<Vec<Bytes>>> | Submit a batch I/O request; returns one Bytes per range
submit_single | Range<u64>, u64 (priority) | BoxFuture<'static, Result<Bytes>> | Convenience wrapper for a single range
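
The batch contract (many ranges in, one buffer per range out) is what allows a backend to coalesce reads. The following is a minimal sketch of one possible strategy, assuming ranges are merged only when they overlap or touch; coalesce and read_coalesced are hypothetical helpers, the "file" is a byte slice standing in for real storage, and an actual backend may also merge ranges separated by small gaps.

```rust
use std::ops::Range;

/// Merges overlapping or touching ranges into the fewest spans covering them.
/// Empty ranges are dropped because they require no I/O.
pub fn coalesce(ranges: &[Range<u64>]) -> Vec<Range<u64>> {
    let mut sorted: Vec<Range<u64>> =
        ranges.iter().filter(|r| !r.is_empty()).cloned().collect();
    sorted.sort_by_key(|r| r.start);

    let mut merged: Vec<Range<u64>> = Vec::new();
    for r in sorted {
        if let Some(last) = merged.last_mut() {
            if r.start <= last.end {
                // Extend the current span instead of starting a new read.
                last.end = last.end.max(r.end);
                continue;
            }
        }
        merged.push(r);
    }
    merged
}

/// Reads each merged span once from `file`, then cuts out one buffer per
/// originally requested range, preserving the request order.
/// Panics if a range extends past the end of `file`.
pub fn read_coalesced(file: &[u8], ranges: &[Range<u64>]) -> Vec<Vec<u8>> {
    // One "physical read" per merged span.
    let spans: Vec<(Range<u64>, &[u8])> = coalesce(ranges)
        .into_iter()
        .map(|s| {
            let bytes = &file[s.start as usize..s.end as usize];
            (s, bytes)
        })
        .collect();

    ranges
        .iter()
        .map(|r| {
            if r.is_empty() {
                return Vec::new(); // empty range -> empty buffer
            }
            let (span, bytes) = spans
                .iter()
                .find(|(s, _)| s.start <= r.start && r.end <= s.end)
                .expect("every non-empty range lies inside a merged span");
            let offset = (r.start - span.start) as usize;
            let len = (r.end - r.start) as usize;
            bytes[offset..offset + len].to_vec()
        })
        .collect()
}
```

The caller still receives results in the order it asked for them, so coalescing stays an internal optimization of the backend.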

BufferScheduler:

Method | Input | Output | Description
new(data) | Bytes | BufferScheduler | Create from an in-memory buffer
submit_request | Vec<Range<u64>>, u64 (priority) | BoxFuture<'static, Result<Vec<Bytes>>> | Slices the buffer; the returned future resolves immediately

Usage Examples

use bytes::Bytes;
use lance_encoding::{BufferScheduler, EncodingsIo};
use std::sync::Arc;

// `.await` needs an async context. Tokio is assumed here as the runtime,
// but any executor can drive these futures.
#[tokio::main]
async fn main() {
    // Create an in-memory I/O scheduler from a 1 KiB byte buffer
    let data = Bytes::from(vec![0u8; 1024]);
    let scheduler = Arc::new(BufferScheduler::new(data));

    // Submit a batch request for two ranges; one Bytes comes back per range
    let ranges = vec![0..100, 200..300];
    let result = scheduler.submit_request(ranges, 0).await.unwrap();
    assert_eq!(result.len(), 2);
    assert_eq!(result[0].len(), 100);
    assert_eq!(result[1].len(), 100);

    // Submit a single range request
    let single = scheduler.submit_single(0..50, 0).await.unwrap();
    assert_eq!(single.len(), 50);
}
