Implementation:Lance format Lance FullZipCompressor

From Leeroopedia


Knowledge Sources
Domains Encoding, Compression
Last Updated 2026-02-08 19:33 GMT

Overview

PerValueCompressor is the trait behind the full-zip structural encoding path in Lance 2.1. In this path, compressed buffers are zipped together so that all parts of a value are stored contiguously.

Description

Full-zip is one of the two structural encoding strategies in Lance 2.1 (the other being miniblock). In the full-zip approach, data is compressed using per-value compressors that produce either fixed-width or variable-width output. The compressed parts of each value are then stored contiguously, enabling efficient sequential access.
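To make the "zipped" layout concrete, here is a minimal sketch of interleaving two fixed-width buffers so that the parts of each value sit next to each other. The function names and the choice of two parts (a one-byte control part plus a payload, in the test data) are assumptions made for illustration; this is not the Lance implementation.

```rust
/// Zip two fixed-width buffers so that the parts of each value are adjacent.
/// `a_width` and `b_width` are the bytes per value in each input buffer.
/// Hypothetical helper for illustration only.
pub fn zip_fixed(a: &[u8], a_width: usize, b: &[u8], b_width: usize) -> Vec<u8> {
    assert!(a_width > 0 && b_width > 0, "widths must be non-zero");
    assert_eq!(a.len() % a_width, 0, "buffer a is not a whole number of values");
    assert_eq!(b.len() % b_width, 0, "buffer b is not a whole number of values");
    assert_eq!(a.len() / a_width, b.len() / b_width, "value counts differ");

    let mut out = Vec::with_capacity(a.len() + b.len());
    for (part_a, part_b) in a.chunks_exact(a_width).zip(b.chunks_exact(b_width)) {
        out.extend_from_slice(part_a);
        out.extend_from_slice(part_b);
    }
    out
}

/// Slice out value `i` from a zipped buffer whose values are `stride` bytes wide.
/// No other value is read, which is what random access relies on.
pub fn zipped_value(zipped: &[u8], stride: usize, i: usize) -> &[u8] {
    &zipped[i * stride..(i + 1) * stride]
}
```

With fixed-width parts, the position of value i is simply i times the stride, so a reader can fetch one value with a single contiguous read.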

The module defines two key types:

  • PerValueDataBlock: An enum with variants Fixed(FixedWidthDataBlock) and Variable(VariableWidthBlock), representing the two valid output formats for per-value compression.
  • PerValueCompressor trait: Requires implementing a compress method that takes a DataBlock and returns a PerValueDataBlock plus a CompressiveEncoding description. The compression must support random-access decompression -- any value must be decompressible without decompressing preceding values.
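The shape of these two types, and the random-access requirement, can be sketched with simplified stand-ins. Everything below (FixedBlock, VariableBlock, PerValueBlock, PerValueCompress, PassThrough) is hypothetical and uses only the standard library; the real Lance types carry more information, and the real compress method takes a DataBlock and also returns a CompressiveEncoding.

```rust
/// Stand-in for a fixed-width block: every value uses `bytes_per_value` bytes.
#[derive(Debug, PartialEq)]
pub struct FixedBlock {
    pub data: Vec<u8>,
    pub bytes_per_value: usize,
}

/// Stand-in for a variable-width block: value i spans offsets[i]..offsets[i + 1].
#[derive(Debug, PartialEq)]
pub struct VariableBlock {
    pub data: Vec<u8>,
    pub offsets: Vec<u32>,
}

/// The two output layouts a per-value compressor may produce.
#[derive(Debug, PartialEq)]
pub enum PerValueBlock {
    Fixed(FixedBlock),
    Variable(VariableBlock),
}

impl PerValueBlock {
    /// Bytes held by the block, as defined for this sketch (data plus offsets).
    pub fn data_size(&self) -> u64 {
        match self {
            Self::Fixed(f) => f.data.len() as u64,
            Self::Variable(v) => (v.data.len() + v.offsets.len() * 4) as u64,
        }
    }

    /// Random access: read value `i` without decoding any preceding value.
    pub fn value(&self, i: usize) -> &[u8] {
        match self {
            Self::Fixed(f) => &f.data[i * f.bytes_per_value..(i + 1) * f.bytes_per_value],
            Self::Variable(v) => &v.data[v.offsets[i] as usize..v.offsets[i + 1] as usize],
        }
    }
}

/// Simplified analogue of the trait: one method, same marker bounds.
pub trait PerValueCompress: std::fmt::Debug + Send + Sync {
    fn compress(&self, values: &[Vec<u8>]) -> Result<PerValueBlock, String>;
}

/// A compressor that stores values unchanged and only picks the layout.
#[derive(Debug, Default)]
pub struct PassThrough;

impl PerValueCompress for PassThrough {
    fn compress(&self, values: &[Vec<u8>]) -> Result<PerValueBlock, String> {
        let Some(first) = values.first() else {
            return Err("no values to compress".to_string());
        };
        let width = first.len();
        if width > 0 && values.iter().all(|v| v.len() == width) {
            // All values share one width: emit the fixed-width layout.
            Ok(PerValueBlock::Fixed(FixedBlock {
                data: values.concat(),
                bytes_per_value: width,
            }))
        } else {
            // Widths differ: record where each value starts and ends.
            let mut data = Vec::new();
            let mut offsets = vec![0u32];
            for v in values {
                data.extend_from_slice(v);
                offsets.push(data.len() as u32);
            }
            Ok(PerValueBlock::Variable(VariableBlock { data, offsets }))
        }
    }
}
```

In both layouts, value(i) touches only the bytes of value i. A scheme whose decoding of value i depended on values before it would not satisfy the random-access requirement stated above.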

Full-zip is most suitable for large data types where the overhead of per-value compression metadata is small relative to the value size.

Usage

Full-zip encoding is selected explicitly by setting the field metadata entry keyed by STRUCTURAL_ENCODING_META_KEY to the value STRUCTURAL_ENCODING_FULLZIP. It may also be selected automatically for certain data types. Implementors of PerValueCompressor include ValueEncoder, VariableEncoder, PackedStructEncoder, and the per-value wrappers around block compression.
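The explicit selection amounts to one key/value pair in the field's string metadata. The sketch below models that metadata as a plain HashMap and defines local stand-ins for the two constants. The string values shown are assumptions for illustration and the helper names are hypothetical; real code should import the constants from the lance-encoding crate rather than hard-code them.

```rust
use std::collections::HashMap;

// Local stand-ins for the crate constants. The string values are assumed
// for this sketch and are not confirmed by this page.
pub const STRUCTURAL_ENCODING_META_KEY: &str = "lance-encoding:structural-encoding";
pub const STRUCTURAL_ENCODING_FULLZIP: &str = "fullzip";

/// Mark a field's metadata as requesting the full-zip structural encoding.
pub fn request_fullzip(metadata: &mut HashMap<String, String>) {
    metadata.insert(
        STRUCTURAL_ENCODING_META_KEY.to_string(),
        STRUCTURAL_ENCODING_FULLZIP.to_string(),
    );
}

/// Check whether a field's metadata explicitly requests full-zip.
pub fn requests_fullzip(metadata: &HashMap<String, String>) -> bool {
    metadata.get(STRUCTURAL_ENCODING_META_KEY).map(String::as_str)
        == Some(STRUCTURAL_ENCODING_FULLZIP)
}
```

A field without this entry is left to the automatic choice mentioned above, so requests_fullzip returning false does not by itself mean miniblock will be used.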

Code Reference

Source Location
Repository: lance-format/lance
File: rust/lance-encoding/src/encodings/logical/primitive/fullzip.rs
Lines: 1-54

Signature
#[derive(Debug)]
pub enum PerValueDataBlock {
    Fixed(FixedWidthDataBlock),
    Variable(VariableWidthBlock),
}

impl PerValueDataBlock {
    pub fn data_size(&self) -> u64;
}

pub trait PerValueCompressor: std::fmt::Debug + Send + Sync {
    fn compress(&self, data: DataBlock) -> Result<(PerValueDataBlock, CompressiveEncoding)>;
}

Import
use lance_encoding::encodings::logical::primitive::fullzip::{PerValueCompressor, PerValueDataBlock};

I/O Contract

Direction | Type | Description
Input | DataBlock | Any data block to compress on a per-value basis
Output | PerValueDataBlock | Fixed-width or variable-width compressed result
Output | CompressiveEncoding | Encoding description for decompression

Usage Examples

use lance_encoding::encodings::logical::primitive::fullzip::{PerValueCompressor, PerValueDataBlock};
use lance_encoding::encodings::physical::value::ValueEncoder;

// ValueEncoder implements PerValueCompressor.
// `data_block` is a DataBlock prepared by the caller.
let compressor = ValueEncoder::default();
let (compressed_block, encoding) = compressor.compress(data_block)?;

// The result is one of the two per-value layouts.
match compressed_block {
    PerValueDataBlock::Fixed(fixed) => { /* fixed-width result */ }
    PerValueDataBlock::Variable(var) => { /* variable-width result */ }
}
