Implementation:Lance format Lance LegacyFixedSizeBinaryEncoding
| Knowledge Sources | |
|---|---|
| Domains | Encoding, Legacy_Format |
| Last Updated | 2026-02-08 19:33 GMT |
Overview
The legacy fixed-size binary encoding stores fixed-width binary and string data by treating each value as a uniform-length byte sequence in the Lance v2.0 format.
Description
⚠️ DEPRECATED: This is legacy code from the Lance v1/v2.0 format, retained only for backward compatibility. See Lance_format_Lance_Warning_Deprecated_Legacy_Encodings.
This module implements fixed-size binary encoding for the legacy (v2.0) Lance file format. FixedSizeBinaryPageScheduler expands row ranges into byte ranges based on a known byte_width, delegates to an inner bytes scheduler, and constructs a FixedSizeBinaryDecoder that generates synthetic offsets during decode. The decoder produces VariableWidthBlock output with computed offsets (either 32-bit or 64-bit depending on the data type), making the output compatible with Arrow's binary and UTF-8 array layouts. The FixedSizeBinaryEncoder converts variable-width input to fixed-width representation by stripping offsets and encoding the raw bytes with a nested ArrayEncoder. This encoding is used as an optimization when all binary or string values in a page have the same length.
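The two pieces of arithmetic described above, expanding row ranges into byte ranges and generating synthetic offsets, can be sketched with the standard library alone. This is a minimal illustration, not the Lance implementation: `rows_to_byte_ranges` and `synthetic_offsets` are hypothetical helper names, and the offsets are kept as plain `u64` values rather than a packed buffer.

```rust
use std::ops::Range;

/// Hypothetical helper: expand row ranges into byte ranges when every
/// value occupies exactly `byte_width` bytes.
fn rows_to_byte_ranges(rows: &[Range<u64>], byte_width: u64) -> Vec<Range<u64>> {
    rows.iter()
        .map(|r| (r.start * byte_width)..(r.end * byte_width))
        .collect()
}

/// Hypothetical helper: build Arrow-style offsets for `num_values` values
/// of `byte_width` bytes each. The result has `num_values + 1` entries,
/// starting at 0 and increasing by `byte_width`.
fn synthetic_offsets(num_values: u64, byte_width: u64) -> Vec<u64> {
    (0..=num_values).map(|i| i * byte_width).collect()
}
```

With a byte width of 16, rows 0..2 and 5..6 map to bytes 0..32 and 80..96, and three decoded values receive the offsets 0, 16, 32, 48.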
Usage
Use this encoding when every binary or string value in a page has the same byte width. The CoreArrayEncodingStrategy selects it for binary and string data that meets this condition, and the physical dispatch creates a FixedSizeBinaryPageScheduler when it encounters a FixedSizeBinary protobuf encoding. Both regular and large binary/UTF-8 types are supported through the bytes_per_offset parameter: 4 for Binary/Utf8, 8 for LargeBinary/LargeUtf8.
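The uniform-width condition and the offset stripping done on the encode side can be illustrated as follows. This is a sketch under stated assumptions, not the check Lance actually runs: `uniform_byte_width` and `strip_offsets` are hypothetical names, the input uses 32-bit offsets, and zero-length values are treated as not eligible.

```rust
/// Hypothetical helper: return the shared value length if every value
/// delimited by `offsets` has the same, non-zero length.
fn uniform_byte_width(offsets: &[i32]) -> Option<usize> {
    let mut lengths = offsets.windows(2).map(|w| w[1] - w[0]);
    let first = lengths.next()?; // None when there are no values at all
    // Assumption: zero-length (or malformed, negative) values are rejected.
    if first > 0 && lengths.all(|len| len == first) {
        Some(first as usize)
    } else {
        None
    }
}

/// Hypothetical helper: drop the offsets and keep only the raw bytes,
/// together with the byte width needed to rebuild the offsets later.
fn strip_offsets<'a>(offsets: &[i32], data: &'a [u8]) -> Option<(usize, &'a [u8])> {
    let width = uniform_byte_width(offsets)?;
    let start = usize::try_from(*offsets.first()?).ok()?;
    let end = usize::try_from(*offsets.last()?).ok()?;
    Some((width, data.get(start..end)?))
}
```

For the strings "foo", "bar", "baz" (offsets 0, 3, 6, 9) the sketch yields a width of 3 and the nine raw bytes. As soon as one value has a different length, it returns `None` and a variable-length encoding would be needed instead.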
Code Reference
Source Location
rust/lance-encoding/src/previous/encodings/physical/fixed_size_binary.rs
Signature
pub struct FixedSizeBinaryPageScheduler {
    bytes_scheduler: Box<dyn PageScheduler>,
    byte_width: u32,
    bytes_per_offset: u32,
}
impl FixedSizeBinaryPageScheduler {
    pub fn new(
        bytes_scheduler: Box<dyn PageScheduler>,
        byte_width: u32,
        bytes_per_offset: u32,
    ) -> Self;
}
pub struct FixedSizeBinaryEncoder {
    bytes_encoder: Box<dyn ArrayEncoder>,
    byte_width: usize,
}
impl FixedSizeBinaryEncoder {
    pub fn new(bytes_encoder: Box<dyn ArrayEncoder>, byte_width: usize) -> Self;
}
Import
use lance_encoding::previous::encodings::physical::fixed_size_binary::{
    FixedSizeBinaryPageScheduler, FixedSizeBinaryEncoder,
};
I/O Contract
| Input | Type | Description |
|---|---|---|
| bytes_scheduler | Box<dyn PageScheduler> | Scheduler for the raw bytes buffer |
| byte_width | u32 | Fixed number of bytes per value |
| bytes_per_offset | u32 | 4 for Binary/Utf8, 8 for LargeBinary/LargeUtf8 |
| data | DataBlock | Variable-width data block to encode |
| Output | Type | Description |
|---|---|---|
| decoded | DataBlock::VariableWidth | Variable-width block with synthetic offsets and raw bytes |
| encoded | EncodedArray | Fixed-width bytes with encoding descriptor |
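To make the role of bytes_per_offset in the decoded output concrete, the sketch below packs the synthetic offsets into a byte buffer using either 4 or 8 bytes per entry. It is an illustration only: `DecodedBlock` is a hypothetical stand-in for the real variable-width block, `decode_fixed` is a hypothetical function, and little-endian packing is an assumption made so the result is deterministic.

```rust
/// Hypothetical stand-in for a decoded variable-width block.
#[derive(Debug, PartialEq)]
struct DecodedBlock {
    bits_per_offset: u8,
    offsets: Vec<u8>, // packed offsets (little-endian is an assumption)
    data: Vec<u8>,    // raw value bytes, unchanged
}

/// Hypothetical decode step: rebuild offsets for fixed-width values.
/// Returns None for an unsupported offset width, a zero byte width,
/// a buffer that is not a whole number of values, or an offset overflow.
fn decode_fixed(data: Vec<u8>, byte_width: usize, bytes_per_offset: u32) -> Option<DecodedBlock> {
    if byte_width == 0 || data.len() % byte_width != 0 {
        return None;
    }
    let num_values = data.len() / byte_width;
    let mut offsets = Vec::with_capacity((num_values + 1) * bytes_per_offset as usize);
    for i in 0..=num_values {
        let offset = i * byte_width;
        match bytes_per_offset {
            // 32-bit offsets, as used by Binary/Utf8
            4 => offsets.extend_from_slice(&i32::try_from(offset).ok()?.to_le_bytes()),
            // 64-bit offsets, as used by LargeBinary/LargeUtf8
            8 => offsets.extend_from_slice(&i64::try_from(offset).ok()?.to_le_bytes()),
            _ => return None,
        }
    }
    Some(DecodedBlock {
        bits_per_offset: (bytes_per_offset * 8) as u8,
        offsets,
        data,
    })
}
```

Six bytes with a byte width of 3 produce three offsets (0, 3, 6), which occupy 12 bytes with 32-bit offsets and 24 bytes with 64-bit offsets. The value bytes are the same in both cases.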
Usage Examples
use lance_encoding::previous::encodings::physical::fixed_size_binary::{
    FixedSizeBinaryPageScheduler, FixedSizeBinaryEncoder,
};
use lance_encoding::decoder::PageScheduler;
// Assumed module path for the legacy ArrayEncoder trait; adjust to your Lance version
use lance_encoding::previous::encoder::ArrayEncoder;

// Create a scheduler for 16-byte fixed-size binary values
let bytes_scheduler: Box<dyn PageScheduler> = todo!("obtain from the inner encoding");
let scheduler = FixedSizeBinaryPageScheduler::new(
    bytes_scheduler,
    16, // byte_width: each value is 16 bytes
    4,  // bytes_per_offset: 32-bit offsets for Binary/Utf8
);

// Create an encoder for fixed-size binary; byte_width must match the scheduler's
let bytes_encoder: Box<dyn ArrayEncoder> = todo!("supply the inner value encoder");
let encoder = FixedSizeBinaryEncoder::new(bytes_encoder, 16);
Related Pages
- Lance_format_Lance_LegacyPhysicalDispatch - Creates FixedSizeBinaryPageScheduler from protobuf
- Lance_format_Lance_LegacyBinaryEncoding - Alternative encoding for variable-length binary
- Lance_format_Lance_LegacyValueEncoding - Inner encoder for the bytes buffer
- Lance_format_Lance_LegacyEncoder - Strategy that selects fixed-size binary encoding
- Heuristic:Lance_format_Lance_Warning_Deprecated_Legacy_Encodings