Environment: Hugging Face PEFT Optional Quantization Backends
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Quantization |
| Last Updated | 2026-02-07 06:44 GMT |
Overview
Optional quantization backend packages (GPTQModel, TorchAO, AQLM, EETQ, HQQ, INC) that enable attaching PEFT adapters to models quantized with methods beyond bitsandbytes.
Description
PEFT supports attaching adapter layers (LoRA, OFT, etc.) to models quantized by several different quantization frameworks. Each backend is detected at import time via `importlib.util.find_spec()` and, when available, registers specialized layer classes. Some backends have strict minimum version requirements that are enforced at import time. All backends are optional and independent of each other.
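The import-time detection pattern can be sketched with the standard library (a minimal stand-in, not PEFT's actual code; `aqlm` is used as the example because it has no minimum-version requirement):

```python
from functools import lru_cache
import importlib.util

@lru_cache
def is_aqlm_available() -> bool:
    # Version-agnostic backends only need the package to be importable;
    # find_spec() returns None when no such module can be found.
    return importlib.util.find_spec("aqlm") is not None

print(is_aqlm_available())  # False unless aqlm happens to be installed
```

The `lru_cache` decorator means the filesystem probe runs at most once per process, which is why installing a backend mid-session requires a restart before PEFT notices it.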
Usage
Use these backends when:
- You have a GPTQ-quantized model and want to add LoRA adapters (GPTQModel + optimum)
- You want to use PyTorch Architecture Optimization quantization (TorchAO)
- You have AQLM, EETQ, HQQ, or INC quantized models
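The first workflow above could look roughly like this (a hypothetical helper, not PEFT's API surface; `model_id`, the function name, and the `target_modules` choice are placeholders for illustration):

```python
from typing import Any

def add_lora_to_gptq_model(model_id: str) -> Any:
    """Hypothetical helper: load a GPTQ-quantized checkpoint and attach LoRA.

    Assumes `transformers`, `peft`, `gptqmodel`, and `optimum` are installed
    and a CUDA GPU is available; `model_id` is a placeholder, not a real
    checkpoint name.
    """
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    # For a checkpoint saved with GPTQ quantization, from_pretrained picks up
    # the quantization config stored alongside the weights.
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    lora_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
    return get_peft_model(model, lora_config)
```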
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Hardware | GPU with CUDA support | Most quantization backends require CUDA |
| Python | >= 3.10 | Same as core PEFT |
Dependencies
GPTQModel
- `gptqmodel` >= 5.6.12
- `optimum` >= 1.24.0
TorchAO
- `torchao` >= 0.4.0
AQLM
- `aqlm` (any version)
EETQ
- `eetq` (any version)
HQQ
- `hqq` (any version)
INC (Intel Neural Compressor)
- `neural_compressor` (any version)
Diffusers (for DreamBooth workflows)
- `diffusers` (any version)
Credentials
No additional credentials required.
Quick Install
```bash
# GPTQModel quantization support (quote specs so the shell does not treat ">" as a redirect)
pip install "gptqmodel>=5.6.12" "optimum>=1.24.0"

# TorchAO quantization support
pip install "torchao>=0.4.0"

# AQLM quantization support
pip install aqlm

# EETQ quantization support
pip install eetq

# HQQ quantization support
pip install hqq

# Intel Neural Compressor support
pip install neural-compressor

# Diffusers for DreamBooth LoRA
pip install diffusers
```
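After installing, a quick way to check which optional backends the current environment actually provides (a convenience sketch; the import names follow the dependency list above):

```python
import importlib.util

# Import names of the optional backend packages. Note that the INC pip
# package "neural-compressor" is imported as "neural_compressor".
OPTIONAL_BACKENDS = [
    "gptqmodel", "optimum", "torchao", "aqlm",
    "eetq", "hqq", "neural_compressor", "diffusers",
]

for name in OPTIONAL_BACKENDS:
    status = "installed" if importlib.util.find_spec(name) else "missing"
    print(f"{name}: {status}")
```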
Code Evidence
GPTQModel version enforcement from `src/peft/import_utils.py:39-62`:
```python
@lru_cache
def is_gptqmodel_available():
    if importlib.util.find_spec("gptqmodel") is not None:
        GPTQMODEL_MINIMUM_VERSION = packaging.version.parse("5.6.12")
        OPTIMUM_MINIMUM_VERSION = packaging.version.parse("1.24.0")
        version_gptqmodel = packaging.version.parse(
            importlib_metadata.version("gptqmodel")
        )
        if GPTQMODEL_MINIMUM_VERSION <= version_gptqmodel:
            if is_optimum_available():
                version_optimum = packaging.version.parse(
                    importlib_metadata.version("optimum")
                )
                if OPTIMUM_MINIMUM_VERSION <= version_optimum:
                    return True
                else:
                    raise ImportError(
                        f"gptqmodel requires optimum version "
                        f"`{OPTIMUM_MINIMUM_VERSION}` or higher."
                    )
```
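PEFT compares versions with `packaging.version`, which orders release segments numerically. A stdlib-only stand-in (`parse_version` is a toy for plain `X.Y.Z` strings, not a replacement for `packaging`) shows why that matters for a minimum like 5.6.12:

```python
def parse_version(v: str) -> tuple[int, ...]:
    # Minimal stand-in for packaging.version.parse; handles plain "X.Y.Z"
    # strings only (no pre-release or local-version segments).
    return tuple(int(part) for part in v.split("."))

MINIMUM = parse_version("5.6.12")

# Numeric comparison: 5.6.2 is below the 5.6.12 minimum even though the
# string "5.6.2" sorts after "5.6.12" lexicographically.
print(parse_version("5.6.2") >= MINIMUM)   # False
print(parse_version("5.6.12") >= MINIMUM)  # True
print("5.6.2" >= "5.6.12")                 # True (string comparison misleads)
```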
TorchAO version enforcement from `src/peft/import_utils.py:108-128`:
```python
@lru_cache
def is_torchao_available():
    if importlib.util.find_spec("torchao") is None:
        return False

    TORCHAO_MINIMUM_VERSION = packaging.version.parse("0.4.0")
    try:
        torchao_version = packaging.version.parse(
            importlib_metadata.version("torchao")
        )
    except importlib_metadata.PackageNotFoundError:
        return False

    if torchao_version < TORCHAO_MINIMUM_VERSION:
        raise ImportError(
            f"Found an incompatible version of torchao. "
            f"Found version {torchao_version}, "
            f"but only versions above {TORCHAO_MINIMUM_VERSION} are supported"
        )
    return True
```
Quantization method detection from `src/peft/utils/other.py:150-154`:
```python
is_gptq_quantized = getattr(model, "quantization_method", None) == "gptq"
is_aqlm_quantized = getattr(model, "quantization_method", None) == "aqlm"
is_eetq_quantized = getattr(model, "quantization_method", None) == "eetq"
is_torchao_quantized = getattr(model, "quantization_method", None) == "torchao"
is_hqq_quantized = getattr(model, "quantization_method", None) == "hqq" or getattr(
    model, "hqq_quantized", False
)
```
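The same `getattr`-based checks can be exercised against stand-in objects (toy classes, not real transformers models, which expose `quantization_method` through their quantization integration):

```python
class FakeGPTQModel:
    # Toy stand-in: on real models this attribute comes from the
    # quantization config integration.
    quantization_method = "gptq"

class FakeLegacyHQQModel:
    # Older HQQ integrations set a boolean flag instead of
    # quantization_method, hence the two-part check below.
    hqq_quantized = True

for model in (FakeGPTQModel(), FakeLegacyHQQModel()):
    is_gptq = getattr(model, "quantization_method", None) == "gptq"
    is_hqq = getattr(model, "quantization_method", None) == "hqq" or getattr(
        model, "hqq_quantized", False
    )
    print(type(model).__name__, is_gptq, is_hqq)
```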
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `Found an incompatible version of gptqmodel` | GPTQModel < 5.6.12 | `pip install "gptqmodel>=5.6.12"` |
| `gptqmodel requires optimum version 1.24.0 or higher` | Optimum too old for GPTQModel | `pip install "optimum>=1.24.0"` |
| `gptqmodel requires optimum ... to be installed` | Optimum not installed | `pip install "optimum>=1.24.0"` |
| `Found an incompatible version of torchao` | TorchAO < 0.4.0 | `pip install "torchao>=0.4.0"` |
Compatibility Notes
- GPTQModel requires both `gptqmodel` AND `optimum` packages with specific minimum versions.
- TorchAO has an edge case where `find_spec("torchao")` returns non-None but `importlib_metadata.version("torchao")` raises `PackageNotFoundError`. PEFT handles this gracefully.
- HQQ detection is unique: it checks both `quantization_method == "hqq"` and `getattr(model, "hqq_quantized", False)` for backward compatibility.
- All backends register their own adapter layer subclasses in tuner-specific `bnb.py`, `gptq.py`, `aqlm.py`, `eetq.py`, `hqq.py`, `inc.py`, or `torchao.py` files.