

Implementation:Lance format Lance LegacyBitpackEncoding

From Leeroopedia


Knowledge Sources
Domains: Encoding, Legacy_Format
Last Updated: 2026-02-08 19:33 GMT

Overview

The legacy bitpack encoding compresses integer arrays in the Lance v2.0 file format by storing each value in fewer bits than its data type's full width, using the FastLanes bitpacking algorithm.

Description

⚠️ DEPRECATED: This is legacy code from the Lance v1/v2.0 format, retained only for backward compatibility. See Lance_format_Lance_Warning_Deprecated_Legacy_Encodings.

This module implements bitpacking for the legacy (v2.0) Lance file format and is gated behind the bitpacking feature flag. It provides two encoder/scheduler pairs: BitpackedForNonNegArrayEncoder / BitpackedForNonNegScheduler for non-negative integers (unsigned types, and signed types whose values are known to be non-negative), and BitpackedArrayEncoder / BitpackedScheduler for general signed integers. Supported data types range from Int8/UInt8 through Int64/UInt64.

The encoder first computes the minimum number of bits needed to represent every value in a page (using compute_compressed_bit_width_for_non_neg), then packs the values in chunks of 1024 elements with the FastLanes bitpacking algorithm. If the number of values is not a multiple of 1024, the last chunk is zero-padded. Nullable data blocks are handled by encoding their validity bitmaps separately. On read, the decoders unpack the compressed data back to full-width values.
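The bit width and padding arithmetic described above can be sketched in plain Rust. This is a minimal illustration using only the standard library, not the module's actual code: the helper names (min_bit_width, padded_len, packed_size_bytes) are hypothetical, and the choice to return a width of at least 1 for an all-zero page is an assumption of this sketch.

```rust
/// Number of values per bitpacked chunk, as stated in the description.
const CHUNK_LEN: usize = 1024;

/// Minimum number of bits needed to represent every value in `values`.
/// Hypothetical helper. Returning at least 1 (so an all-zero or empty
/// input still has a non-zero width) is an assumption of this sketch.
fn min_bit_width(values: &[u64]) -> u32 {
    let max = values.iter().copied().max().unwrap_or(0);
    (u64::BITS - max.leading_zeros()).max(1)
}

/// Number of values after zero-padding the last chunk up to CHUNK_LEN.
fn padded_len(num_values: usize) -> usize {
    (num_values + CHUNK_LEN - 1) / CHUNK_LEN * CHUNK_LEN
}

/// Size in bytes of the packed buffer for `num_values` values.
/// A full chunk of 1024 values always occupies a whole number of bytes.
fn packed_size_bytes(num_values: usize, bit_width: u32) -> usize {
    padded_len(num_values) * bit_width as usize / 8
}
```

For example, the values 0, 5, 10, 15, 20 have a maximum of 20, which needs 5 bits. Under these assumptions, five such values are padded to one chunk of 1024 values and occupy 1024 × 5 / 8 = 640 bytes, so the padding overhead dominates for very small pages.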

Usage

This encoding suits integer columns whose actual value range is much smaller than the full range of the data type. The CoreArrayEncodingStrategy selects bitpacking when it determines that the compressed bit width provides meaningful savings. The bitpacking feature must be enabled at compile time.
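To make "meaningful savings" concrete, the sketch below compares the compressed and uncompressed bit widths. The source does not state the threshold that CoreArrayEncodingStrategy applies, so both helper names and the "strictly narrower" rule are assumptions made only for illustration.

```rust
/// Fraction of the original size that remains after bitpacking,
/// ignoring chunk padding and metadata. Hypothetical helper.
fn compression_ratio(compressed_bits: u64, uncompressed_bits: u64) -> f64 {
    compressed_bits as f64 / uncompressed_bits as f64
}

/// Hypothetical selection rule: bitpack only when the compressed width is
/// strictly narrower than the original. The real strategy's criterion is
/// not given in the source and may differ.
fn should_bitpack(compressed_bits: u64, uncompressed_bits: u64) -> bool {
    compressed_bits < uncompressed_bits
}
```

With these definitions, a UInt32 column whose values fit in 5 bits keeps 5/32 of its original size (about 16%), while a column that already uses all 32 bits gains nothing.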

Code Reference

Source Location

rust/lance-encoding/src/previous/encodings/physical/bitpack.rs

Signature

pub fn compute_compressed_bit_width_for_non_neg(arrays: &[ArrayRef]) -> u64;

pub struct BitpackedForNonNegArrayEncoder {
    pub compressed_bit_width: usize,
    pub original_data_type: DataType,
}

impl BitpackedForNonNegArrayEncoder {
    pub fn new(compressed_bit_width: usize, data_type: DataType) -> Self;
}

impl ArrayEncoder for BitpackedForNonNegArrayEncoder { /* ... */ }

pub struct BitpackedForNonNegScheduler { /* fields omitted */ }

impl BitpackedForNonNegScheduler {
    pub fn new(
        compressed_bits_per_value: u64,
        uncompressed_bits_per_value: u64,
        buffer_offset: u64,
    ) -> Self;
}

pub struct BitpackedScheduler { /* fields omitted */ }

impl BitpackedScheduler {
    pub fn new(
        compressed_bits_per_value: u64,
        uncompressed_bits_per_value: u64,
        buffer_offset: u64,
        signed: bool,
    ) -> Self;
}

Import

use lance_encoding::previous::encodings::physical::bitpack::{
    compute_compressed_bit_width_for_non_neg,
    BitpackedForNonNegArrayEncoder,
    BitpackedForNonNegScheduler,
    BitpackedScheduler,
};

I/O Contract

Input | Type | Description
arrays | &[ArrayRef] | Integer arrays to analyze for bit width computation
data | DataBlock | Fixed-width or nullable data block to encode
compressed_bits_per_value | u64 | Target bits per value after compression
uncompressed_bits_per_value | u64 | Original bits per value of the data type
buffer_offset | u64 | Position of the bitpacked buffer in the file

Output | Type | Description
bit_width | u64 | Computed minimum bits needed to represent all values
encoded | EncodedArray | Bitpacked data with encoding descriptor
decoded | DataBlock | Unpacked fixed-width data block
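The encode/decode relationship in the contract above (full-width values in, packed bytes out, and back again) can be illustrated with a simple round trip. Note that this sketch uses plain sequential, least-significant-bit-first packing, not the interleaved FastLanes layout the module uses, and it does not pad to 1024-value chunks. The pack and unpack functions are hypothetical and are not part of lance_encoding.

```rust
/// Pack each value into `bit_width` bits, least-significant bit first.
/// Assumes 1 <= bit_width <= 64 and that every value fits in `bit_width` bits.
fn pack(values: &[u64], bit_width: u32) -> Vec<u8> {
    let total_bits = values.len() * bit_width as usize;
    let mut out = vec![0u8; (total_bits + 7) / 8];
    let mut bit_pos = 0usize;
    for &v in values {
        for i in 0..bit_width {
            // Copy bit `i` of the value into the next free output bit.
            if (v >> i) & 1 == 1 {
                out[bit_pos / 8] |= 1u8 << (bit_pos % 8);
            }
            bit_pos += 1;
        }
    }
    out
}

/// Unpack `num_values` values of `bit_width` bits each back to u64.
fn unpack(packed: &[u8], bit_width: u32, num_values: usize) -> Vec<u64> {
    let mut out = Vec::with_capacity(num_values);
    let mut bit_pos = 0usize;
    for _ in 0..num_values {
        let mut v = 0u64;
        for i in 0..bit_width {
            let bit = (packed[bit_pos / 8] >> (bit_pos % 8)) & 1;
            v |= (bit as u64) << i;
            bit_pos += 1;
        }
        out.push(v);
    }
    out
}
```

Packing five 5-bit values this way needs 25 bits, which rounds up to 4 bytes, and unpacking with the same bit width and value count restores the original values. The decoder therefore needs both widths and the value count, which is why the schedulers take compressed_bits_per_value and uncompressed_bits_per_value as parameters.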

Usage Examples

// Requires lance-encoding to be built with the `bitpacking` feature enabled.
use lance_encoding::previous::encodings::physical::bitpack::{
    compute_compressed_bit_width_for_non_neg,
    BitpackedForNonNegArrayEncoder,
};
use arrow_array::{ArrayRef, UInt32Array};
use arrow_schema::DataType;
use std::sync::Arc;

fn main() {
    // Compute the minimum bit width that can represent every value in the array.
    let array: ArrayRef = Arc::new(UInt32Array::from(vec![0, 5, 10, 15, 20]));
    let bit_width = compute_compressed_bit_width_for_non_neg(&[array.clone()]);

    // Create an encoder with the computed bit width.
    // The function returns u64, while the constructor takes usize.
    let _encoder = BitpackedForNonNegArrayEncoder::new(
        bit_width as usize,
        DataType::UInt32,
    );
}
