Implementation:Lance format Lance ValueEncoding
| Knowledge Sources | |
|---|---|
| Domains | Encoding, Compression |
| Last Updated | 2026-02-08 19:33 GMT |
Overview
ValueEncoder is the default physical encoding that writes fixed-width data as-is without compression, chunking it into mini-block format with power-of-2 sized chunks targeting approximately 4 KiB each.
Description
The value encoder is the simplest and most fundamental physical encoding. It takes fixed-width data and divides it into mini-block chunks without applying any compression. This serves as both the baseline encoding and the building block for more complex encodings.
Key implementation details:
- Chunks target approximately 4 KiB, using the largest power-of-2 chunk size that fits within MAX_MINIBLOCK_BYTES (8186 bytes).
- Handles sub-byte data types (booleans, FSL<boolean>) by calculating chunk boundaries at 8-value word boundaries to avoid splitting bytes.
- Supports nested FSL (FixedSizeList) data with inner validity buffers, producing multiple buffers per chunk (data + validity per FSL layer).
- The last chunk may contain fewer values than the power-of-2 target.
- Also implements PerValueCompressor for the full-zip path and BlockCompressor for the block path.
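The chunk-sizing rule in the list above can be sketched as follows. This is a minimal illustration, not the code in value.rs: the helper names `values_per_chunk` and `chunk_value_counts` are hypothetical, and the only figure taken from the text is the 8186-byte limit.

```rust
/// Limit quoted in the text above; treated here as an assumption.
const MAX_MINIBLOCK_BYTES: u64 = 8186;

/// Hypothetical helper: the largest power-of-2 number of values whose
/// encoded size (in bits) still fits within MAX_MINIBLOCK_BYTES.
fn values_per_chunk(bits_per_value: u64) -> u64 {
    assert!(bits_per_value > 0, "bits_per_value must be non-zero");
    let max_values = (MAX_MINIBLOCK_BYTES * 8) / bits_per_value;
    assert!(max_values >= 1, "a single value does not fit in one chunk");
    // Walk up the powers of two until the next one would overflow the limit.
    let mut n = 1u64;
    while n * 2 <= max_values {
        n *= 2;
    }
    n
}

/// Hypothetical helper: number of values in each chunk for a page.
/// Every chunk is full except possibly the last one.
fn chunk_value_counts(num_values: u64, bits_per_value: u64) -> Vec<u64> {
    let per_chunk = values_per_chunk(bits_per_value);
    let mut counts = Vec::new();
    let mut remaining = num_values;
    while remaining > 0 {
        let take = remaining.min(per_chunk);
        counts.push(take);
        remaining -= take;
    }
    counts
}
```

Under these assumptions, widths of 1, 8, 16, 32, and 64 bits all yield exactly 4096 bytes per full chunk, which is consistent with the "approximately 4 KiB" target. For 1-bit values the chunk holds 32768 values, a multiple of 8, so a full chunk always ends on a byte boundary.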
The ValueDecompressor handles the inverse: it reads chunks and reconstructs FixedWidthDataBlock or NullableDataBlock (for FSL with validity).
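Because no compression is applied, decoding flat data amounts to putting the chunk bytes back together. The sketch below illustrates that round trip on a plain byte buffer. The `Chunk` type and both functions are hypothetical stand-ins, not Lance types, and FSL validity buffers are left out.

```rust
/// Hypothetical stand-in for one mini-block chunk of flat data.
struct Chunk {
    bytes: Vec<u8>,
    num_values: usize,
}

/// Split a fixed-width byte buffer into chunks holding `values_per_chunk`
/// values each; the last chunk may be shorter.
fn chunk_fixed_width(data: &[u8], bytes_per_value: usize, values_per_chunk: usize) -> Vec<Chunk> {
    assert!(bytes_per_value > 0 && values_per_chunk > 0);
    assert_eq!(data.len() % bytes_per_value, 0, "buffer must hold whole values");
    data.chunks(bytes_per_value * values_per_chunk)
        .map(|c| Chunk {
            bytes: c.to_vec(),
            num_values: c.len() / bytes_per_value,
        })
        .collect()
}

/// Inverse step: concatenate chunk bytes and sum the value counts.
fn reassemble(chunks: &[Chunk]) -> (Vec<u8>, usize) {
    let mut bytes = Vec::new();
    let mut num_values = 0;
    for chunk in chunks {
        bytes.extend_from_slice(&chunk.bytes);
        num_values += chunk.num_values;
    }
    (bytes, num_values)
}
```

The chunk size is passed in as a parameter here so the example stays independent of any particular sizing rule.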
Usage
The value encoder is used as the default for all fixed-width data when no specialized compression (bitpacking, RLE, BSS) is selected. It is also used as the inner encoder within packed struct encoding and as the baseline comparison for compression effectiveness.
Code Reference
| Source Location | Repository: lance-format/lance, File: rust/lance-encoding/src/encodings/physical/value.rs, Lines: 1-1209 |
|---|---|
| Signature |
#[derive(Debug, Default)]
pub struct ValueEncoder {}
impl MiniBlockCompressor for ValueEncoder {
    fn compress(&self, page: DataBlock) -> Result<(MiniBlockCompressed, CompressiveEncoding)>;
}
impl PerValueCompressor for ValueEncoder {
    fn compress(&self, data: DataBlock) -> Result<(PerValueDataBlock, CompressiveEncoding)>;
}
impl BlockCompressor for ValueEncoder {
    fn compress(&self, data: DataBlock) -> Result<(DataBlock, CompressiveEncoding)>;
}
#[derive(Debug)]
pub struct ValueDecompressor {
    // Handles flat, FSL, and nullable data decompression
}
|
| Import | use lance_encoding::encodings::physical::value::{ValueEncoder, ValueDecompressor}; |
I/O Contract
| Direction | Type | Description |
|---|---|---|
| Input | DataBlock::FixedWidth | Fixed-width data block (integers, floats, booleans, FSL) |
| Input | DataBlock::FixedSizeList | FSL data with optional inner validity |
| Output | MiniBlockCompressed | Uncompressed data divided into ~4 KiB chunks |
| Output | CompressiveEncoding | Flat encoding description with bits_per_value |
| Output (decompress) | DataBlock::FixedWidth | Reconstructed fixed-width data block |
Usage Examples
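use lance_encoding::encodings::physical::value::ValueEncoder;
use lance_encoding::encodings::logical::primitive::miniblock::MiniBlockCompressor;
// `data_block` is assumed to be an existing fixed-width DataBlock built elsewhere.
let encoder = ValueEncoder::default();
// Only MiniBlockCompressor is in scope, so `compress` resolves to the mini-block path.
let (compressed, encoding) = encoder.compress(data_block)?;
// The data is chunked into mini-blocks but not compressed.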
Related Pages
- Lance_format_Lance_BitpackingEncoding - Compressed alternative for integer data
- Lance_format_Lance_RleEncoding - Compressed alternative for repeated values
- Lance_format_Lance_PackedEncoding - Uses ValueEncoder for interleaved struct data
- Lance_format_Lance_GeneralCompressor - Can add LZ4/Zstd compression on top
- Lance_format_Lance_MiniBlockCompressor - Trait implemented by ValueEncoder