Environment: Bitsandbytes HPU Gaudi Runtime



Knowledge Sources

Domains: Infrastructure, HPU_Backend, Dequantization
Last Updated: 2026-02-07 14:00 GMT

Overview

The Habana Gaudi HPU runtime environment runs bitsandbytes 4-bit NF4 dequantization on Intel Gaudi accelerators.

Description

This environment provides the HPU (Habana Processing Unit) accelerated context for running bitsandbytes 4-bit NF4 dequantization on Intel Gaudi hardware. It requires the Habana software stack (habana_frameworks) with the habana-torch-plugin package. The backend uses the native torch.ops.hpu.dequantize_nf4 operation provided by the Habana PyTorch integration. Backward compatibility handling exists for Gaudi software versions prior to 1.22, which use a different 4-bit compression format.

Usage

Use this environment for 4-bit NF4 dequantization on Habana Gaudi accelerators. The HPU backend is automatically detected when habana_frameworks is importable and torch.hpu is available. Currently only NF4 quantization type is supported on HPU.
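The detection logic described above can be sketched as follows. This is an illustrative helper following the page's description (importable `habana_frameworks` plus a working `torch.hpu` device), not the exact bitsandbytes internals:

```python
import importlib.util

def hpu_available() -> bool:
    """Return True when the HPU backend described above would be usable:
    habana_frameworks is importable and torch exposes a working hpu device.
    Illustrative sketch; names are not the bitsandbytes internals."""
    if importlib.util.find_spec("habana_frameworks") is None:
        return False
    import habana_frameworks.torch  # registers the hpu device with torch
    import torch
    return hasattr(torch, "hpu") and torch.hpu.is_available()
```

On a machine without the Habana software stack this simply returns False, which is why `python -m bitsandbytes` falls back to other backends there.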

System Requirements

  • Hardware: Intel Gaudi / Gaudi2 / Gaudi3 accelerator (Habana-designed AI accelerator)
  • OS: Linux (Ubuntu recommended); the primary supported platform
  • Gaudi Software: >= 1.21 (1.22+ recommended for the current compression format)
  • Python: >= 3.10 (from pyproject.toml)
  • PyTorch: >= 2.3, < 3 (must include the Habana PyTorch plugin)
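The version ranges above can be checked programmatically with the `packaging` library (itself a listed dependency). This is an illustrative check, not bitsandbytes code:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# Requirement ranges taken from the table above
TORCH_SPEC = SpecifierSet(">=2.3,<3")
GAUDI_MIN = Version("1.21")

def meets_requirements(torch_version: str, gaudi_sw_version: str) -> bool:
    """Return True when both version strings satisfy the table's ranges."""
    return (Version(torch_version) in TORCH_SPEC
            and Version(gaudi_sw_version) >= GAUDI_MIN)
```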

Dependencies

System Packages

  • Habana software stack (habana_frameworks Python package)
  • habana-torch-plugin (detected via pip list)
  • Gaudi driver and runtime

Python Packages

  • `torch` >= 2.3, < 3
  • `habana_frameworks`
  • `habana_frameworks.torch`
  • `numpy` >= 1.17
  • `packaging` >= 20.9

Credentials

No secrets or credentials required. The Gaudi software stack is detected via standard Python imports.

Quick Install

# Install with Habana software stack (follow Habana documentation)
# Then install bitsandbytes:
pip install bitsandbytes

# Verify HPU detection (importing habana_frameworks.torch registers the hpu device)
python -c "import habana_frameworks.torch; import torch; print(torch.hpu.is_available())"
python -m bitsandbytes

Code Evidence

Gaudi SW version detection from `bitsandbytes/backends/utils.py`:

import subprocess
from packaging import version

def get_gaudi_sw_version():
    # The habana-torch-plugin version tracks the Gaudi SW release
    output = subprocess.run(
        "pip list | grep habana-torch-plugin",
        shell=True,
        text=True,
        capture_output=True,
    )
    if output.returncode == 0 and output.stdout:
        return version.parse(output.stdout.split("\n")[0].split()[-1])

Backward compatibility check from `bitsandbytes/backends/hpu/ops.py`:

# Version check for compression format compatibility
if GAUDI_SW_VER.major < 1 or GAUDI_SW_VER.minor < 22:
    # Use reversed compression format for older Gaudi SW
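To illustrate what a "reversed" 4-bit compression format means: two 4-bit codes share each storage byte, and the older format stores them in the opposite nibble order. Converting between layouts is a per-byte nibble swap. This is an illustration of the idea only; the actual byte layout and conversion are internal to the HPU backend:

```python
def swap_nibbles(packed: bytes) -> bytes:
    """Swap the high and low 4-bit codes within each byte.

    Hypothetical helper for illustration: shows how a reversed 4-bit
    packing differs from the current one, not the backend's real code.
    """
    return bytes(((b & 0x0F) << 4) | (b >> 4) for b in packed)
```

Applying the swap twice is the identity, which is why the compatibility shim can be applied transparently in either direction.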

HPU kernel dispatch from `bitsandbytes/backends/hpu/ops.py`:

@register_kernel("bitsandbytes::dequantize_4bit", "hpu")
def _(A, absmax, blocksize, quant_type, shape, dtype):
    # Delegates to torch.ops.hpu.dequantize_nf4
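The `register_kernel` decorator above maps an (op name, device) pair to a kernel function. A minimal sketch of that dispatch mechanism, with illustrative names rather than the actual bitsandbytes implementation:

```python
# Minimal op/device kernel registry in the spirit of register_kernel above
_KERNELS = {}

def register_kernel(op_name, device):
    """Register the decorated function as the kernel for (op_name, device)."""
    def decorator(fn):
        _KERNELS[(op_name, device)] = fn
        return fn
    return decorator

def dispatch(op_name, device, *args, **kwargs):
    """Look up and call the kernel registered for (op_name, device)."""
    return _KERNELS[(op_name, device)](*args, **kwargs)

@register_kernel("bitsandbytes::dequantize_4bit", "hpu")
def _dequantize_4bit_hpu(A, absmax, blocksize, quant_type, shape, dtype):
    # A real HPU kernel would delegate to torch.ops.hpu.dequantize_nf4 here;
    # this stub just echoes its arguments for demonstration.
    return ("hpu-dequantize", quant_type, shape)
```

With this registry, calling the op on an HPU tensor routes to the HPU-specific kernel while other devices keep their own registrations.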

Common Errors

  • `habana_frameworks` not found: the Habana software stack is not installed. Install the full Habana software stack per the Habana documentation.
  • `torch.hpu` not available: PyTorch was not built with Habana support. Install the Habana-compatible PyTorch build.
  • NF4-only error: an FP4 quant_type was passed on HPU. Only NF4 is supported on HPU; use quant_type="nf4".

Compatibility Notes

  • Quantization types: Only NF4 is supported on HPU. FP4 is not available.
  • Storage formats: Supports both uint8 and bfloat16 quant_storage.
  • Gaudi SW < 1.22: Uses reversed 4-bit compression format; handled automatically by the backend.
  • Gaudi SW >= 1.22: Uses current compression format with direct dequantize_nf4 dispatch.
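The NF4-only restriction in the notes above amounts to a guard like the following. This is an illustrative helper reflecting the documented behavior, not a bitsandbytes API:

```python
def check_hpu_quant_type(quant_type: str) -> None:
    """Raise if the quant_type is not supported on HPU.

    Hypothetical guard mirroring the compatibility note: only NF4 is
    supported on HPU; FP4 is not available.
    """
    if quant_type.lower() != "nf4":
        raise NotImplementedError(
            f"quant_type={quant_type!r} is not supported on HPU; use 'nf4'"
        )
```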
