Implementation:Lance format Lance FullZipCompressor
| Knowledge Sources | |
|---|---|
| Domains | Encoding, Compression |
| Last Updated | 2026-02-08 19:33 GMT |
Overview
PerValueCompressor is the trait defining the full-zip structural encoding path in Lance 2.1, where compressed buffers are zipped together so all parts of a value are stored contiguously.
Description
Full-zip is one of the two structural encoding strategies in Lance 2.1 (the other being miniblock). In the full-zip approach, data is compressed using per-value compressors that produce either fixed-width or variable-width output. The compressed parts of each value are then stored contiguously, enabling efficient sequential access.
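The contiguous layout described above can be sketched with plain byte buffers. This is a minimal illustration, not the Lance implementation: it assumes each value has exactly two parts, a one-byte control part and a fixed-width payload, and the helper names `zip_fixed` and `read_zipped` are hypothetical.

```rust
/// Zip a one-byte-per-value control buffer with a fixed-width value buffer so
/// that both parts of each value sit next to each other in the output.
/// Hypothetical helper for illustration; not part of lance-encoding.
fn zip_fixed(control: &[u8], values: &[u8], bytes_per_value: usize) -> Vec<u8> {
    assert!(bytes_per_value > 0);
    assert_eq!(values.len(), control.len() * bytes_per_value);
    let mut out = Vec::with_capacity(control.len() + values.len());
    for (ctrl, value) in control.iter().zip(values.chunks_exact(bytes_per_value)) {
        out.push(*ctrl); // control part of this value
        out.extend_from_slice(value); // payload of the same value, stored right after it
    }
    out
}

/// Read value `index` back. The stride is constant, so locating a value is a
/// single multiplication and no preceding value has to be decoded.
fn read_zipped(zipped: &[u8], bytes_per_value: usize, index: usize) -> (u8, &[u8]) {
    let stride = 1 + bytes_per_value;
    let start = index * stride;
    (zipped[start], &zipped[start + 1..start + stride])
}
```

With control bytes `[0, 1, 0]` and two-byte payloads, `zip_fixed` yields one nine-byte buffer in which value 1 occupies bytes 3 through 5, so a reader fetches everything it needs for that value in one contiguous read.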
The module defines two key types:
- PerValueDataBlock: An enum with variants Fixed(FixedWidthDataBlock) and Variable(VariableWidthBlock), representing the two valid output formats for per-value compression.
- PerValueCompressor trait: Requires implementing a compress method that takes a DataBlock and returns a PerValueDataBlock plus a CompressiveEncoding description. The compression must support random-access decompression: any value must be decompressible without decompressing preceding values.
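The random-access requirement can be made concrete with a toy model. The types below are simplified stand-ins whose field layouts are assumptions, not the real FixedWidthDataBlock or VariableWidthBlock, and `ToyPerValueCompressor` only mirrors the general shape of the trait (it omits the encoding description). `TrimZeros` is a made-up compressor that handles each value independently.

```rust
/// Simplified stand-in for a fixed-width block (assumed layout).
#[derive(Debug)]
struct FixedBlock {
    data: Vec<u8>,
    bytes_per_value: usize,
}

/// Simplified stand-in for a variable-width block (assumed layout):
/// value `i` spans `data[offsets[i]..offsets[i + 1]]`.
#[derive(Debug)]
struct VariableBlock {
    data: Vec<u8>,
    offsets: Vec<u32>,
}

#[derive(Debug)]
enum PerValueBlock {
    Fixed(FixedBlock),
    Variable(VariableBlock),
}

impl PerValueBlock {
    /// Fetch value `i` without reading any other value.
    fn value(&self, i: usize) -> &[u8] {
        match self {
            PerValueBlock::Fixed(b) => &b.data[i * b.bytes_per_value..(i + 1) * b.bytes_per_value],
            PerValueBlock::Variable(b) => &b.data[b.offsets[i] as usize..b.offsets[i + 1] as usize],
        }
    }
}

/// Toy trait with the same bounds as PerValueCompressor but a simpler signature.
trait ToyPerValueCompressor: std::fmt::Debug + Send + Sync {
    fn compress(&self, values: &[&[u8]]) -> PerValueBlock;
}

/// Hypothetical compressor: drops trailing zero bytes of each value on its own.
#[derive(Debug, Default)]
struct TrimZeros;

impl ToyPerValueCompressor for TrimZeros {
    fn compress(&self, values: &[&[u8]]) -> PerValueBlock {
        let mut data = Vec::new();
        let mut offsets = vec![0u32];
        for v in values {
            // Keep everything up to and including the last non-zero byte.
            let len = v.iter().rposition(|b| *b != 0).map_or(0, |p| p + 1);
            data.extend_from_slice(&v[..len]);
            offsets.push(data.len() as u32);
        }
        PerValueBlock::Variable(VariableBlock { data, offsets })
    }
}
```

Because every value is compressed without reference to its neighbours, `value(i)` needs only two offsets (or one multiplication in the fixed-width case) to locate the bytes it must decode.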
Full-zip is most suitable for large data types where the overhead of per-value compression metadata is small relative to the value size.
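A rough way to see why value size matters is to compute the share of stored bytes taken by per-value overhead. The function below is an illustrative calculation only; `overhead_bytes` is a hypothetical fixed cost per value (for example an offset or length entry), not a figure taken from Lance.

```rust
/// Fraction of stored bytes spent on per-value overhead.
/// `overhead_bytes` is an assumed fixed cost per value, used only for illustration.
fn overhead_fraction(overhead_bytes: u64, value_bytes: u64) -> f64 {
    let total = overhead_bytes + value_bytes;
    if total == 0 {
        return 0.0; // nothing stored, so no overhead to report
    }
    overhead_bytes as f64 / total as f64
}
```

Under this assumption, a 4-byte overhead is half of the stored bytes for a 4-byte value but under 0.1% for a 4096-byte value.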
Usage
Full-zip encoding is selected by setting the STRUCTURAL_ENCODING_META_KEY field metadata to STRUCTURAL_ENCODING_FULLZIP, or it may be selected automatically for certain data types. Implementors include ValueEncoder, VariableEncoder, PackedStructEncoder, and block compression per-value wrappers.
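The metadata-based selection can be sketched with a plain string map, which is the shape Arrow field metadata takes. The two literal values below are assumptions made for this sketch; in real code, import the constants from lance-encoding instead of redefining them. `fullzip_metadata` and `requests_fullzip` are hypothetical helpers.

```rust
use std::collections::HashMap;

// Assumed literal values, redefined locally so the sketch is self-contained.
// Use the constants exported by lance-encoding in real code.
const STRUCTURAL_ENCODING_META_KEY: &str = "lance-encoding:structural-encoding";
const STRUCTURAL_ENCODING_FULLZIP: &str = "fullzip";

/// Build field metadata that requests the full-zip structural encoding.
fn fullzip_metadata() -> HashMap<String, String> {
    let mut metadata = HashMap::new();
    metadata.insert(
        STRUCTURAL_ENCODING_META_KEY.to_string(),
        STRUCTURAL_ENCODING_FULLZIP.to_string(),
    );
    metadata
}

/// Hypothetical check of the kind a writer could run before picking an encoding.
fn requests_fullzip(metadata: &HashMap<String, String>) -> bool {
    metadata.get(STRUCTURAL_ENCODING_META_KEY).map(String::as_str)
        == Some(STRUCTURAL_ENCODING_FULLZIP)
}
```

The resulting map would then be attached to the field definition in the schema; a field without the key leaves the choice of structural encoding to the writer.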
Code Reference
| Source Location | Repository: lance-format/lance, File: rust/lance-encoding/src/encodings/logical/primitive/fullzip.rs, Lines: 1-54 |
|---|---|
| Signature |
#[derive(Debug)]
pub enum PerValueDataBlock {
    Fixed(FixedWidthDataBlock),
    Variable(VariableWidthBlock),
}
impl PerValueDataBlock {
    pub fn data_size(&self) -> u64;
}
pub trait PerValueCompressor: std::fmt::Debug + Send + Sync {
    fn compress(&self, data: DataBlock) -> Result<(PerValueDataBlock, CompressiveEncoding)>;
}
|
| Import | use lance_encoding::encodings::logical::primitive::fullzip::{PerValueCompressor, PerValueDataBlock}; |
I/O Contract
| Direction | Type | Description |
|---|---|---|
| Input | DataBlock | Any data block to compress on a per-value basis |
| Output | PerValueDataBlock | Fixed-width or variable-width compressed result |
| Output | CompressiveEncoding | Encoding description for decompression |
Usage Examples
use lance_encoding::encodings::logical::primitive::fullzip::{PerValueCompressor, PerValueDataBlock};
use lance_encoding::encodings::physical::value::ValueEncoder;
// ValueEncoder implements PerValueCompressor.
// `data_block` is a DataBlock built elsewhere; `?` assumes the enclosing function returns a Result.
let compressor = ValueEncoder::default();
let (compressed_block, encoding) = compressor.compress(data_block)?;
match compressed_block {
    PerValueDataBlock::Fixed(fixed) => { /* fixed-width result */ }
    PerValueDataBlock::Variable(var) => { /* variable-width result */ }
}
Related Pages
- Lance_format_Lance_MiniBlockCompressor - Alternative structural encoding (miniblock)
- Lance_format_Lance_PrimitiveEncoding - Orchestrates full-zip encoding
- Lance_format_Lance_ValueEncoding - Implements PerValueCompressor
- Lance_format_Lance_BinaryEncoding - Implements PerValueCompressor for variable-width
- Lance_format_Lance_BlockCompression - Per-value block compression wrappers