Implementation:Lance format Lance LegacyFixedSizeBinaryEncoding

From Leeroopedia
Revision as of 15:28, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Lance_format_Lance_LegacyFixedSizeBinaryEncoding.md)


Knowledge Sources
Domains Encoding, Legacy_Format
Last Updated 2026-02-08 19:33 GMT

Overview

In the Lance v2.0 file format, the legacy fixed-size binary encoding stores binary and string data whose values all have the same length, treating each value as a byte sequence of uniform width.

Description

⚠️ DEPRECATED: This is legacy code from the Lance v1/v2.0 format, retained only for backward compatibility. See Lance_format_Lance_Warning_Deprecated_Legacy_Encodings.

This module implements fixed-size binary encoding for the legacy (v2.0) Lance file format. On the read path, FixedSizeBinaryPageScheduler uses the known byte_width to expand row ranges into byte ranges, delegates those ranges to an inner bytes scheduler, and constructs a FixedSizeBinaryDecoder. The decoder generates synthetic offsets during decode and emits a VariableWidthBlock; the offsets are 32-bit or 64-bit depending on the data type, which keeps the output compatible with Arrow's binary and UTF-8 array layouts. On the write path, FixedSizeBinaryEncoder converts variable-width input to a fixed-width representation by stripping the offsets and encoding the raw bytes with a nested ArrayEncoder. The encoding is an optimization for pages in which every binary or string value has the same length.
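The read-path arithmetic described above can be illustrated with a small standalone sketch. It uses only the Rust standard library and is not the Lance implementation; the function names rows_to_byte_ranges and synthetic_offsets_i32 are hypothetical, and the sketch assumes that offsets start at 0 and that n values need n + 1 offsets, as in Arrow's variable-width layout.

```rust
use std::ops::Range;

/// Hypothetical helper: expand row ranges into byte ranges when every
/// value is exactly `byte_width` bytes wide.
fn rows_to_byte_ranges(rows: &[Range<u64>], byte_width: u64) -> Vec<Range<u64>> {
    rows.iter()
        .map(|r| (r.start * byte_width)..(r.end * byte_width))
        .collect()
}

/// Hypothetical helper: build synthetic 32-bit offsets for `num_rows`
/// values, i.e. 0, w, 2w, ..., num_rows * w (num_rows + 1 entries).
/// A 64-bit variant would do the same with i64.
fn synthetic_offsets_i32(num_rows: usize, byte_width: usize) -> Vec<i32> {
    (0..=num_rows).map(|i| (i * byte_width) as i32).collect()
}
```

With a byte_width of 16, rows 0..2 map to bytes 0..32, and three decoded values get the offsets 0, 16, 32, 48. Because the offsets are a pure function of the row count and the width, they never need to be stored in the page.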

Usage

Use this encoding when binary or string data has a uniform byte width per value. The CoreArrayEncodingStrategy selects this encoding for FixedSizeBinary types, and the physical dispatch creates FixedSizeBinaryPageScheduler when encountering a FixedSizeBinary protobuf encoding. It supports both regular and large binary/UTF-8 types via configurable bytes_per_offset (4 or 8).
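To make the uniform-width precondition concrete, the following standalone sketch checks whether a set of 32-bit offsets describes values of equal length and, if so, strips the offsets. It is an illustration under stated assumptions, not the logic used by CoreArrayEncodingStrategy or FixedSizeBinaryEncoder: the names uniform_width and strip_offsets are hypothetical, and the offsets are assumed to be valid Arrow offsets (non-negative and non-decreasing).

```rust
/// Hypothetical helper: return the common value width if every value
/// described by `offsets` has the same length. Returns None for an
/// empty array (fewer than two offsets) or for mixed lengths.
fn uniform_width(offsets: &[i32]) -> Option<usize> {
    let mut widths = offsets.windows(2).map(|w| (w[1] - w[0]) as usize);
    let first = widths.next()?;
    widths.all(|w| w == first).then_some(first)
}

/// Hypothetical helper: drop the offsets and keep only the bytes they
/// reference, together with the width needed to rebuild them later.
fn strip_offsets<'a>(offsets: &[i32], data: &'a [u8]) -> Option<(usize, &'a [u8])> {
    let width = uniform_width(offsets)?;
    let start = *offsets.first()? as usize;
    let end = *offsets.last()? as usize;
    Some((width, &data[start..end]))
}
```

For the offsets 0, 2, 4 over the bytes "aabb", the sketch yields a width of 2 and the four raw bytes. Offsets such as 0, 4, 7 describe values of different lengths, so the fixed-size representation does not apply.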

Code Reference

Source Location

rust/lance-encoding/src/previous/encodings/physical/fixed_size_binary.rs

Signature

pub struct FixedSizeBinaryPageScheduler {
    bytes_scheduler: Box<dyn PageScheduler>,
    byte_width: u32,
    bytes_per_offset: u32,
}

impl FixedSizeBinaryPageScheduler {
    pub fn new(
        bytes_scheduler: Box<dyn PageScheduler>,
        byte_width: u32,
        bytes_per_offset: u32,
    ) -> Self;
}

pub struct FixedSizeBinaryEncoder {
    bytes_encoder: Box<dyn ArrayEncoder>,
    byte_width: usize,
}

impl FixedSizeBinaryEncoder {
    pub fn new(bytes_encoder: Box<dyn ArrayEncoder>, byte_width: usize) -> Self;
}

Import

use lance_encoding::previous::encodings::physical::fixed_size_binary::{
    FixedSizeBinaryPageScheduler, FixedSizeBinaryEncoder,
};

I/O Contract

| Input | Type | Description |
|---|---|---|
| bytes_scheduler | Box<dyn PageScheduler> | Scheduler for the raw bytes buffer |
| byte_width | u32 | Fixed number of bytes per value |
| bytes_per_offset | u32 | 4 for Binary/Utf8, 8 for LargeBinary/LargeUtf8 |
| data | DataBlock | Variable-width data block to encode |

| Output | Type | Description |
|---|---|---|
| decoded | DataBlock::VariableWidth | Variable-width block with synthetic offsets and raw bytes |
| encoded | EncodedArray | Fixed-width bytes with encoding descriptor |

Usage Examples

use lance_encoding::decoder::PageScheduler;
// Assumed import path for the legacy ArrayEncoder trait; adjust it to
// match the lance-encoding version in use.
use lance_encoding::previous::encoder::ArrayEncoder;
use lance_encoding::previous::encodings::physical::fixed_size_binary::{
    FixedSizeBinaryEncoder, FixedSizeBinaryPageScheduler,
};

// Create a scheduler for 16-byte fixed-size binary values.
// `bytes_scheduler` comes from the inner encoding.
fn make_scheduler(bytes_scheduler: Box<dyn PageScheduler>) -> FixedSizeBinaryPageScheduler {
    FixedSizeBinaryPageScheduler::new(
        bytes_scheduler,
        16, // byte_width: each value is 16 bytes
        4,  // bytes_per_offset: 32-bit offsets for Binary/Utf8
    )
}

// Create an encoder for 16-byte fixed-size binary values.
// `bytes_encoder` is the inner value encoder.
fn make_encoder(bytes_encoder: Box<dyn ArrayEncoder>) -> FixedSizeBinaryEncoder {
    FixedSizeBinaryEncoder::new(bytes_encoder, 16)
}
