Principle:Bitsandbytes foundation Bitsandbytes 4bit Quantization Lookup Tables

Knowledge Sources	QLoRA
Domains	Quantization, Backend_Infrastructure
Last Updated	2026-02-07 13:31 GMT

Overview

Pre-computed lookup tables that map 4-bit quantized codes to their dequantized floating-point values for NF4 and FP4 data types.

Description

4-bit quantization maps continuous floating-point values to one of 16 discrete levels. The mapping from each 4-bit code to its float value is defined by a lookup table with 16 entries. Two standard tables are used. NF4 (4-bit NormalFloat) places its levels at quantiles of the standard normal distribution, which the QLoRA paper argues is information-theoretically optimal for normally distributed weights. FP4 (4-bit floating point) uses a conventional floating-point encoding with sign, exponent, and mantissa bits. These tables are shared across CPU, XPU, and other non-CUDA backends as the authoritative source of dequantization values.

Usage

Apply this principle in any backend that needs to dequantize 4-bit tensors. The lookup table approach provides a simple, hardware-agnostic dequantization mechanism: each packed byte holds two 4-bit codes, so for each byte, extract both codes, look each one up in the table, and multiply the result by the scaling factor (absmax) of the block the element belongs to.
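The lookup procedure described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name `dequantize_4bit_sketch`, its parameters, and the nibble order (high nibble first) are assumptions made for the example. The table values are the NF4 levels as published with QLoRA.

```python
# NF4 levels as published with QLoRA; the list index is the 4-bit code.
NF4_TABLE = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]


def dequantize_4bit_sketch(packed, absmax, blocksize, table=NF4_TABLE):
    """Dequantize packed 4-bit codes (hypothetical helper, for illustration).

    packed    -- bytes object, two 4-bit codes per byte
    absmax    -- one scaling factor per block of `blocksize` elements
    blocksize -- number of dequantized elements sharing one scaling factor
    """
    out = []
    for byte in packed:
        # Assumption: the high nibble is the first element, the low nibble the second.
        for code in (byte >> 4, byte & 0x0F):
            block = len(out) // blocksize  # block index of the element being written
            out.append(table[code] * absmax[block])
    return out


# Two bytes hold four codes: 0, 15, 7, 15. With blocksize=2 there are two blocks.
values = dequantize_4bit_sketch(bytes([0x0F, 0x7F]), absmax=[2.0, 4.0], blocksize=2)
```

Passing a different 16-entry list as `table` reuses the same loop for another 4-bit data type, which is the point of keeping the tables separate from the dequantization logic.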

Theoretical Basis

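For NF4, the 16 quantization levels are derived from quantiles of the standard normal distribution and normalized into the range [-1, 1]. In idealized form:

$q_i = \Phi^{-1}\left(\frac{i + 0.5}{16}\right), \quad i \in \{0, 1, \dots, 15\}$

where $\Phi^{-1}$ is the inverse CDF of the standard normal distribution, and the $q_i$ are rescaled so that the minimum is -1.0 and the maximum is 1.0. The table published with QLoRA refines this construction: the negative and positive halves are built separately so that one code maps exactly to 0, which makes the levels slightly asymmetric.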
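As a worked check of the quantile formula, the following sketch evaluates it with the standard library's `statistics.NormalDist` and rescales the result to [-1, 1]. The function name `normal_quantile_levels` is hypothetical. The output illustrates the idealized formula only; it is symmetric and contains no exact zero, so it should not be expected to match a shipped table value for value.

```python
from statistics import NormalDist


def normal_quantile_levels(bits=4):
    """Evaluate q_i = Phi^{-1}((i + 0.5) / 2^bits) and rescale to [-1, 1]."""
    n = 2 ** bits
    std_normal = NormalDist()  # mean 0, standard deviation 1
    raw = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    scale = max(abs(v) for v in raw)  # largest magnitude maps to 1.0
    return [v / scale for v in raw]


levels = normal_quantile_levels()
```

Because the midpoints (i + 0.5) / 16 never equal 0.5, none of the resulting levels is zero, which is one motivation for adjusting the construction in practice.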

Dequantization is then:

$x = \mathrm{table}[\mathrm{code}] \times \mathrm{absmax}_{\mathrm{block}}$

Related Pages

Implementation:Bitsandbytes_foundation_Bitsandbytes_Backend_Quantization_Tables
