Implementation:Lance format Lance ValueEncoding

From Leeroopedia
Revision as of 15:29, 16 February 2026 by Admin (auto-imported from implementations/Lance_format_Lance_ValueEncoding.md)


Knowledge Sources
Domains: Encoding, Compression
Last Updated: 2026-02-08 19:33 GMT

Overview

ValueEncoder is the default physical encoding for fixed-width data. It writes values as-is, without compression, and splits them into mini-block chunks that each hold a power-of-2 number of values and target approximately 4 KiB.

Description

The value encoder is the simplest and most fundamental physical encoding. It takes fixed-width data and divides it into mini-block chunks without applying any compression. This serves as both the baseline encoding and the building block for more complex encodings.

Key implementation details:

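  • Chunks target approximately 4 KiB: the encoder uses the largest power-of-2 number of values whose total size fits within MAX_MINIBLOCK_BYTES (8186 bytes). For 4-byte values, for example, that is 1024 values (4096 bytes), because doubling to 8192 bytes would exceed the limit.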
  • Handles sub-byte data types (booleans, FSL<boolean>) by calculating chunk boundaries at 8-value word boundaries to avoid splitting bytes.
  • Supports nested FSL (FixedSizeList) data with inner validity buffers, producing multiple buffers per chunk (data + validity per FSL layer).
  • The last chunk may contain fewer values than the power-of-2 target.
  • Also implements PerValueCompressor for the full-zip path and BlockCompressor for the block path.
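
The chunk-size rule described in the list above can be sketched as a small standalone function. This is a minimal illustration, not the code in value.rs: the helper names `word_layout`, `values_per_chunk`, and `chunk_value_counts` are hypothetical, the 8186-byte limit is taken from the figure quoted above, and the real encoder may apply further limits (for instance on the number of values per chunk) that this sketch ignores.

```rust
/// Assumed limit, copied from the figure quoted in the text above.
const MAX_MINIBLOCK_BYTES: u64 = 8186;

/// Returns (values_per_word, bytes_per_word) for a given bit width.
/// Byte-aligned widths use one value per word. Sub-byte or unaligned widths
/// are grouped into words of 8 values so a word always ends on a byte boundary.
fn word_layout(bits_per_value: u64) -> (u64, u64) {
    if bits_per_value % 8 == 0 {
        (1, bits_per_value / 8)
    } else {
        // 8 values of `bits_per_value` bits occupy exactly `bits_per_value` bytes.
        (8, bits_per_value)
    }
}

/// Largest power-of-two number of values whose byte size stays within the limit.
fn values_per_chunk(bits_per_value: u64) -> u64 {
    let (values_per_word, bytes_per_word) = word_layout(bits_per_value);
    assert!(bytes_per_word <= MAX_MINIBLOCK_BYTES, "value too wide for one chunk");
    let mut num_values = values_per_word;
    let mut size_bytes = bytes_per_word;
    // Keep doubling while the doubled chunk would still fit.
    while size_bytes * 2 <= MAX_MINIBLOCK_BYTES {
        num_values *= 2;
        size_bytes *= 2;
    }
    num_values
}

/// Splits `num_values` into per-chunk value counts; only the last may be short.
fn chunk_value_counts(num_values: u64, bits_per_value: u64) -> Vec<u64> {
    let per_chunk = values_per_chunk(bits_per_value);
    let mut counts = Vec::new();
    let mut remaining = num_values;
    while remaining > 0 {
        let n = remaining.min(per_chunk);
        counts.push(n);
        remaining -= n;
    }
    counts
}
```

Under these assumptions, `values_per_chunk(32)` gives 1024 values (4096 bytes), and a 96-bit value (12 bytes) gives 512 values (6144 bytes), which is why the chunk size is only approximately 4 KiB. Calling `chunk_value_counts(2500, 32)` yields two full chunks of 1024 values followed by a short final chunk of 452.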

The ValueDecompressor handles the inverse: it reads chunks and reconstructs FixedWidthDataBlock or NullableDataBlock (for FSL with validity).
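
Because no compression is applied, the encode and decode directions are essentially a split and a concatenation of the raw bytes. The following sketch shows that round trip on a plain byte buffer. It deliberately avoids the lance_encoding types: `chunk_bytes` and `reassemble` are hypothetical helpers, and validity buffers for FSL data are not modelled.

```rust
/// Splits raw fixed-width bytes into chunks of at most `values_per_chunk` values.
/// The final chunk may hold fewer values than the others.
fn chunk_bytes(data: &[u8], bytes_per_value: usize, values_per_chunk: usize) -> Vec<Vec<u8>> {
    assert!(bytes_per_value > 0 && values_per_chunk > 0);
    assert_eq!(data.len() % bytes_per_value, 0, "buffer must hold whole values");
    data.chunks(bytes_per_value * values_per_chunk)
        .map(|chunk| chunk.to_vec())
        .collect()
}

/// Inverse of `chunk_bytes`: concatenates the chunks into one contiguous buffer.
fn reassemble(chunks: &[Vec<u8>]) -> Vec<u8> {
    chunks.concat()
}
```

Splitting ten 4-byte values into chunks of four values produces chunks of 16, 16, and 8 bytes, and `reassemble` returns the original 40 bytes unchanged.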

Usage

The value encoder is the default for all fixed-width data when no specialized compression (bitpacking, run-length encoding (RLE), or byte stream split (BSS)) is selected. It also serves as the inner encoder within packed struct encoding and as the baseline against which compression effectiveness is compared.

Code Reference

Source Location
  • Repository: lance-format/lance
  • File: rust/lance-encoding/src/encodings/physical/value.rs
  • Lines: 1-1209
Signature
#[derive(Debug, Default)]
pub struct ValueEncoder {}

impl MiniBlockCompressor for ValueEncoder {
    fn compress(&self, page: DataBlock) -> Result<(MiniBlockCompressed, CompressiveEncoding)>;
}

impl PerValueCompressor for ValueEncoder {
    fn compress(&self, data: DataBlock) -> Result<(PerValueDataBlock, CompressiveEncoding)>;
}

impl BlockCompressor for ValueEncoder {
    fn compress(&self, data: DataBlock) -> Result<(DataBlock, CompressiveEncoding)>;
}

#[derive(Debug)]
pub struct ValueDecompressor {
    // Handles flat, FSL, and nullable data decompression
}
Import
use lance_encoding::encodings::physical::value::{ValueEncoder, ValueDecompressor};
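
Since ValueEncoder implements three traits that each declare a method named `compress`, a plain `encoder.compress(...)` call only compiles when exactly one of those traits is in scope. The sketch below shows how Rust's fully qualified syntax selects a path explicitly. The traits and struct here are local stand-ins with simplified signatures, defined only for this illustration. They are not the lance_encoding types.

```rust
// Local stand-ins that mirror only the shape of the real traits (assumption).
trait MiniBlockCompressor {
    fn compress(&self, data: Vec<u8>) -> String;
}

trait PerValueCompressor {
    fn compress(&self, data: Vec<u8>) -> String;
}

#[derive(Debug, Default)]
struct ValueEncoder {}

impl MiniBlockCompressor for ValueEncoder {
    fn compress(&self, data: Vec<u8>) -> String {
        format!("miniblock:{}", data.len())
    }
}

impl PerValueCompressor for ValueEncoder {
    fn compress(&self, data: Vec<u8>) -> String {
        format!("per-value:{}", data.len())
    }
}

fn compress_both(encoder: &ValueEncoder, data: Vec<u8>) -> (String, String) {
    // `encoder.compress(data)` would be rejected here as ambiguous,
    // because both traits are in scope and both provide `compress`.
    let mini = MiniBlockCompressor::compress(encoder, data.clone());
    let per_value = <ValueEncoder as PerValueCompressor>::compress(encoder, data);
    (mini, per_value)
}
```

Either form of qualification works: naming the trait (`MiniBlockCompressor::compress(encoder, ...)`) or naming both type and trait (`<ValueEncoder as PerValueCompressor>::compress(encoder, ...)`).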

I/O Contract

| Direction | Type | Description |
|---|---|---|
| Input | DataBlock::FixedWidth | Fixed-width data block (integers, floats, booleans, FSL) |
| Input | DataBlock::FixedSizeList | FSL data with optional inner validity |
| Output | MiniBlockCompressed | Uncompressed data divided into ~4 KiB chunks |
| Output | CompressiveEncoding | Flat encoding description with bits_per_value |
| Output (decompress) | DataBlock::FixedWidth | Reconstructed fixed-width data block |

Usage Examples

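use lance_encoding::encodings::physical::value::ValueEncoder;
use lance_encoding::encodings::logical::primitive::miniblock::MiniBlockCompressor;

// `data_block` is a fixed-width DataBlock prepared by the caller, and this
// snippet runs inside a function that returns Result so that `?` is valid.
// Only MiniBlockCompressor is in scope, so `compress` resolves to the mini-block path.
let encoder = ValueEncoder::default();
let (compressed, encoding) = encoder.compress(data_block)?;
// `compressed` holds the original bytes split into chunks; nothing is compressed.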
