Implementation:Bitsandbytes foundation Bitsandbytes Backend Quantization Tables
| Knowledge Sources | |
|---|---|
| Domains | Quantization, Backend_Infrastructure |
| Last Updated | 2026-02-07 13:31 GMT |
Overview
Shared backend utilities providing NF4 and FP4 quantization lookup tables, Triton availability detection, and Gaudi software version detection used by CPU and XPU backends.
Description
This module provides infrastructure shared across non-CUDA backends. The NF4 quantization table holds 16 float values derived from normal distribution quantiles (as described in the QLoRA paper); each 4-bit index maps to its dequantized float value. The FP4 quantization table likewise holds 16 float values, one for each 4-bit FP4 code. Both tables are created on the XPU device if one is available, and on the CPU otherwise. The module also provides get_gaudi_sw_version(), which detects the installed Habana Gaudi software version, and a triton_available flag that records whether Triton can be imported.
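A flag such as triton_available is typically computed once at import time. The sketch below shows one common way to probe for an optional dependency using only the standard library. The helper name is_importable and the variable triton_importable are hypothetical, and this is not necessarily how the module itself performs the check.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken parent packages or modules
        # whose __spec__ is unset; treat those cases as unavailable.
        return False


# Hypothetical flag illustrating the idea behind `triton_available`.
triton_importable = is_importable("triton")
```

Note that find_spec only checks that the module can be found; a module that is found can still fail during the actual import, which is why some code bases prefer a try/except around the import statement itself.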
Usage
These tables are used by the CPU and XPU backends for 4-bit dequantization lookup. The CODE dictionary provides a unified interface for selecting the quantization table by name ("nf4" or "fp4").
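To make the lookup concrete, here is a minimal pure-Python sketch of table-based 4-bit dequantization. It reuses the rounded NF4 values listed in the Signature section. The function name dequantize_4bit, the nibble order (high nibble first), and the single absmax scale factor are assumptions made for illustration; the real backends operate on tensors and may pack and scale values differently.

```python
from typing import List

# Rounded NF4 values as listed in the Signature section of this page.
NF4_TABLE: List[float] = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848,
    -0.0911, 0.0, 0.0796, 0.1609, 0.2461, 0.3379,
    0.4407, 0.5626, 0.7230, 1.0,
]


def dequantize_4bit(packed: bytes, absmax: float, table: List[float]) -> List[float]:
    """Unpack two 4-bit codes per byte and map each one through `table`.

    Assumption: the high nibble holds the first value of each pair,
    and every value in the block shares one `absmax` scale.
    """
    out: List[float] = []
    for byte in packed:
        high, low = byte >> 4, byte & 0x0F
        out.append(table[high] * absmax)
        out.append(table[low] * absmax)
    return out


# 0xF0 holds codes 15 and 0; 0x7F holds codes 7 and 15.
values = dequantize_4bit(bytes([0xF0, 0x7F]), absmax=2.0, table=NF4_TABLE)
print(values)  # [2.0, -2.0, 0.0, 2.0]
```

Because the table has only 16 entries, dequantization reduces to an index operation followed by a multiply, which is why a shared lookup table is enough for both the CPU and XPU paths.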
Code Reference
Source Location
- Repository: bitsandbytes
- File: bitsandbytes/backends/utils.py
- Lines: 1-84
Signature
_NF4_QUANT_TABLE = torch.tensor([
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848,
    -0.0911, 0.0, 0.0796, 0.1609, 0.2461, 0.3379,
    0.4407, 0.5626, 0.7230, 1.0,
], dtype=torch.float32)

_FP4_QUANT_TABLE = torch.tensor([...], dtype=torch.float32)

CODE = {"nf4": _NF4_QUANT_TABLE, "fp4": _FP4_QUANT_TABLE}

def get_gaudi_sw_version() -> Optional[version.Version]:
    """Returns installed Gaudi SW version or None."""

GAUDI_SW_VER = get_gaudi_sw_version()
Import
from bitsandbytes.backends.utils import CODE, GAUDI_SW_VER, triton_available
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| quant_type | str | Yes | Key into CODE dict: "nf4" or "fp4" |
Outputs
| Name | Type | Description |
|---|---|---|
| CODE[quant_type] | torch.Tensor | 16-element float32 lookup table mapping 4-bit codes to float values |
| GAUDI_SW_VER | Optional[Version] | Installed Habana Gaudi SW version or None |
| triton_available | bool | Whether Triton is importable |
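Since GAUDI_SW_VER may be None, callers need to handle the missing case explicitly. The sketch below illustrates that pattern with a generic, standard-library version lookup. The helper installed_version and the distribution name are hypothetical, and it returns a plain string rather than a packaging Version object; it is not the implementation of get_gaudi_sw_version().

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string of `dist_name`, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical distribution name, chosen only to exercise the None branch.
sw_ver = installed_version("example-accelerator-plugin-not-installed")
if sw_ver is None:
    status = "not installed"
else:
    status = f"installed: {sw_ver}"
```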
Usage Examples
Using Quantization Tables
from bitsandbytes.backends.utils import CODE
# Get the NF4 lookup table
nf4_table = CODE["nf4"]
# Dequantize a 4-bit code: index 15 -> 1.0, index 0 -> -1.0
dequantized_val = nf4_table[15].item()  # 1.0 (indexing alone returns a 0-dim tensor)
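The page describes only the dequantization direction. As a complement, the following sketch shows how a float could be mapped to its nearest table entry and then restored, using the rounded NF4 values from the Signature section. The helpers nearest_code and quantize_block are hypothetical, and scaling by the block's absolute maximum is an assumption made for illustration rather than a description of the library's kernels.

```python
from typing import List, Tuple

# Rounded NF4 values as listed in the Signature section of this page.
NF4_TABLE: List[float] = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848,
    -0.0911, 0.0, 0.0796, 0.1609, 0.2461, 0.3379,
    0.4407, 0.5626, 0.7230, 1.0,
]


def nearest_code(x: float, table: List[float]) -> int:
    """Return the index of the table entry closest to `x`."""
    return min(range(len(table)), key=lambda i: abs(table[i] - x))


def quantize_block(block: List[float], table: List[float]) -> Tuple[List[int], float]:
    """Scale `block` into [-1, 1] by its absmax, then pick nearest codes."""
    absmax = max(abs(v) for v in block) or 1.0  # avoid dividing by zero
    codes = [nearest_code(v / absmax, table) for v in block]
    return codes, absmax


codes, absmax = quantize_block([0.5, -2.0, 0.0, 2.0], NF4_TABLE)
restored = [NF4_TABLE[c] * absmax for c in codes]
print(codes)  # [10, 0, 7, 15]
```

Here 0.5 is normalized to 0.25, which lands on code 10 (0.2461) and is restored as roughly 0.4922; the small difference is the quantization error, while the extremes and zero round-trip exactly.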