Implementation:Bitsandbytes foundation Bitsandbytes Backend Quantization Tables

From Leeroopedia


Knowledge Sources
Domains: Quantization, Backend_Infrastructure
Last Updated: 2026-02-07 13:31 GMT

Overview

Shared backend utilities providing NF4 and FP4 quantization lookup tables, Triton availability detection, and Gaudi software version detection used by CPU and XPU backends.

Description

This module provides infrastructure shared across non-CUDA backends. The NF4 quantization table holds 16 float32 values derived from quantiles of the normal distribution (as described in the QLoRA paper) and normalized to the range [-1.0, 1.0]; each 4-bit index maps to its dequantized float value. The FP4 quantization table holds the 16 float32 values of the FP4 data type. Both tables are placed on the XPU device if one is available, otherwise on the CPU. The module also provides get_gaudi_sw_version(), which detects the installed Habana Gaudi software version, and a triton_available flag that records whether Triton can be imported.
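To illustrate how a 16-level table can be derived from normal distribution quantiles, the following plain-Python sketch rebuilds values close to the NF4 entries listed in the Signature section. The helper names (linspace, build_nf4_like_table) are hypothetical. The tail probability 0.9677083 and the split into 8 positive and 7 negative levels are assumptions modeled on the create_normal_map helper in bitsandbytes; the module described here stores the finished constants rather than computing them.

```python
from statistics import NormalDist
from typing import List


def linspace(start: float, stop: float, num: int) -> List[float]:
    """Evenly spaced values from start to stop inclusive (hypothetical helper)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def build_nf4_like_table(offset: float = 0.9677083) -> List[float]:
    """Sketch: 16 levels from standard-normal quantiles, scaled to [-1, 1].

    The offset and the 8/7 positive/negative split are assumptions,
    not values stated on this page.
    """
    inv_cdf = NormalDist().inv_cdf
    # 8 positive levels: quantiles at probabilities from `offset` down toward 0.5
    positive = [inv_cdf(p) for p in linspace(offset, 0.5, 9)[:-1]]
    # 7 negative levels: mirrored quantiles on a slightly coarser grid
    negative = [-inv_cdf(p) for p in linspace(offset, 0.5, 8)[:-1]]
    # An exact zero is added so that 0.0 is representable without error
    levels = sorted(positive + [0.0] + negative)
    peak = max(levels)
    return [v / peak for v in levels]


if __name__ == "__main__":
    for index, level in enumerate(build_nf4_like_table()):
        print(index, round(level, 4))
```

The asymmetric split is what leaves room for an exact zero among the 16 codes, which is why index 7 maps to 0.0 in the table.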

Usage

These tables are used by the CPU and XPU backends for 4-bit dequantization lookup. The CODE dictionary provides a unified interface for selecting the quantization table by name ("nf4" or "fp4").
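As a minimal sketch of what a 4-bit dequantization lookup involves, the plain-Python example below unpacks two 4-bit codes from each byte and maps them through a 16-entry table, then rescales by a per-block absolute maximum. It uses a list copy of the rounded NF4 values from this page instead of the torch tensor so that it runs without bitsandbytes installed. The function names, the high-nibble-first packing order, and the absmax scaling step are assumptions for illustration, not the backends' actual kernels.

```python
from typing import List, Sequence

# Rounded NF4 values as listed on this page (list stand-in for the tensor)
NF4_TABLE = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848,
    -0.0911, 0.0, 0.0796, 0.1609, 0.2461, 0.3379,
    0.4407, 0.5626, 0.7230, 1.0,
]


def unpack_nibbles(packed: bytes) -> List[int]:
    """Split each byte into two 4-bit codes (high nibble first: an assumption)."""
    codes = []
    for byte in packed:
        codes.append(byte >> 4)
        codes.append(byte & 0x0F)
    return codes


def dequantize_block(
    packed: bytes, absmax: float, table: Sequence[float] = NF4_TABLE
) -> List[float]:
    """Look up each code in the table and rescale by the block's absmax."""
    return [table[code] * absmax for code in unpack_nibbles(packed)]


if __name__ == "__main__":
    # Bytes 0xF0 and 0x7F hold the codes 15, 0, 7, 15
    print(dequantize_block(bytes([0xF0, 0x7F]), absmax=2.0))
    # [2.0, -2.0, 0.0, 2.0]
```

Because the lookup is a plain index operation, the same function works for FP4 by passing a different 16-entry table.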

Code Reference

Source Location

Signature

_NF4_QUANT_TABLE = torch.tensor([
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848,
    -0.0911, 0.0, 0.0796, 0.1609, 0.2461, 0.3379,
    0.4407, 0.5626, 0.7230, 1.0,
], dtype=torch.float32)

_FP4_QUANT_TABLE = torch.tensor([...], dtype=torch.float32)

CODE = {"nf4": _NF4_QUANT_TABLE, "fp4": _FP4_QUANT_TABLE}

def get_gaudi_sw_version() -> Optional[version.Version]:
    """Returns installed Gaudi SW version or None."""

GAUDI_SW_VER = get_gaudi_sw_version()

Import

from bitsandbytes.backends.utils import CODE, GAUDI_SW_VER, triton_available

I/O Contract

Inputs

Name | Type | Required | Description
quant_type | str | Yes | Key into CODE dict: "nf4" or "fp4"

Outputs

Name | Type | Description
CODE[quant_type] | torch.Tensor | 16-element float32 lookup table mapping 4-bit codes to float values
GAUDI_SW_VER | Optional[Version] | Installed Habana Gaudi SW version or None
triton_available | bool | Whether Triton is importable
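The two non-table outputs are both "optional capability" signals: a boolean that is false when a package cannot be imported, and a version that is None when nothing is installed. The sketch below shows one common way to produce and consume such signals. The function names are hypothetical, and a tuple of integers stands in for the Version object so the example needs only the standard library; it does not reproduce how the module itself performs detection.

```python
import importlib
from typing import Optional, Tuple


def module_available(name: str) -> bool:
    """Return True if `name` can be imported, False on ImportError."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def meets_minimum(
    installed: Optional[Tuple[int, ...]], minimum: Tuple[int, ...]
) -> bool:
    """Treat a missing version (None) as 'feature not supported'."""
    return installed is not None and installed >= minimum


if __name__ == "__main__":
    # A flag like triton_available could be computed as module_available("triton")
    print(module_available("json"))           # standard-library module
    print(meets_minimum(None, (1, 20)))       # nothing installed
    print(meets_minimum((1, 21, 0), (1, 20)))
```

Checking for None before comparing matters because callers on machines without Gaudi software will see GAUDI_SW_VER as None.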

Usage Examples

Using Quantization Tables

from bitsandbytes.backends.utils import CODE

# Get the NF4 lookup table
nf4_table = CODE["nf4"]
# Dequantize a 4-bit code: index 15 -> 1.0, index 0 -> -1.0
# Indexing returns a 0-dim tensor; .item() converts it to a Python float
dequantized_val = nf4_table[15].item()  # 1.0
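For context on where the 4-bit indices come from, the sketch below goes in the opposite direction: it scales a block of floats by its absolute maximum and picks the index of the closest table entry for each value. The names nearest_code and quantize_block are hypothetical, and the brute-force nearest-entry search is an assumption chosen for clarity; it is not presented as how bitsandbytes quantizes in practice. A plain list of the rounded NF4 values from this page replaces the tensor so the example runs without torch.

```python
from typing import List, Sequence, Tuple

# Rounded NF4 values as listed on this page (list stand-in for the tensor)
NF4_TABLE = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848,
    -0.0911, 0.0, 0.0796, 0.1609, 0.2461, 0.3379,
    0.4407, 0.5626, 0.7230, 1.0,
]


def nearest_code(value: float, table: Sequence[float] = NF4_TABLE) -> int:
    """Index of the table entry closest to `value` (brute-force search)."""
    return min(range(len(table)), key=lambda i: abs(table[i] - value))


def quantize_block(values: Sequence[float]) -> Tuple[List[int], float]:
    """Scale values into [-1, 1] by their absmax, then map each to a 4-bit code."""
    absmax = max(abs(v) for v in values) or 1.0  # avoid dividing by zero
    codes = [nearest_code(v / absmax) for v in values]
    return codes, absmax


if __name__ == "__main__":
    codes, absmax = quantize_block([2.0, -2.0, 0.0, 0.6])
    print(codes, absmax)  # [15, 0, 7, 11] 2.0
```

Multiplying the looked-up table entry by absmax recovers an approximation of each original value, which is the step the lookup tables serve.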
