Principle:Huggingface Optimum GPTQ Quantizer Configuration
Overview
Configuration schema for GPTQ post-training quantization that defines bit-width, group size, dampening, and quantization strategy parameters.
Description
GPTQ (Generative Pre-trained Transformer Quantization) reduces model weights to low-bit representations (2-8 bits) using Hessian-based calibration. The quantizer configuration defines the key parameters that control the quantization process:
- Bit-width (`bits`) controls the precision/compression trade-off. Supported values are 2, 3, 4, and 8 bits.
- Group size (`group_size`) determines how many weights share quantization parameters. The default is 128; setting it to -1 enables per-column quantization.
- Dampening percent (`damp_percent`) stabilizes the Hessian inverse computation. The recommended default is 0.1.
- Activation ordering (`desc_act`) can improve quantization quality by quantizing columns in order of decreasing activation magnitude, at the cost of slower inference.
- Symmetric quantization (`sym`) toggles between symmetric and asymmetric quantization modes. Asymmetric quantization requires `gptqmodel`.
- Weight format (`format`) selects between `gptq` (v1) and `gptq_v2` formats. The v2 format is used internally by `gptqmodel` for asymmetric support.
- True sequential (`true_sequential`) enables layer-wise quantization within a single Transformer block, so each layer is quantized using inputs that have passed through previously quantized layers.
The configuration is validated at initialization: bits must be in [2, 3, 4, 8], group_size must be greater than 0 or equal to -1, and damp_percent must be strictly between 0 and 1.
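These validation rules can be expressed as a standalone check. The helper below is a sketch mirroring the constraints stated above, not the actual Optimum initialization code:

```python
def validate_gptq_config(bits: int, group_size: int, damp_percent: float) -> None:
    """Sketch of the GPTQ configuration checks described above (hypothetical helper)."""
    if bits not in (2, 3, 4, 8):
        raise ValueError(f"bits must be one of 2, 3, 4, 8; got {bits}")
    if group_size != -1 and group_size <= 0:
        raise ValueError(f"group_size must be > 0 or exactly -1; got {group_size}")
    if not (0.0 < damp_percent < 1.0):
        raise ValueError(f"damp_percent must be strictly between 0 and 1; got {damp_percent}")
```

A failing value raises immediately, so an invalid configuration never reaches the calibration loop.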
Usage
Use when setting up GPTQ quantization for any large language model to reduce memory footprint. The configuration parameters are passed to the GPTQQuantizer constructor and determine all aspects of the quantization behavior.
```python
from optimum.gptq import GPTQQuantizer

quantizer = GPTQQuantizer(
    bits=4,                 # 4-bit weights
    dataset="wikitext2",    # calibration dataset
    group_size=128,         # weights per quantization group
    damp_percent=0.1,       # Hessian dampening fraction
    desc_act=False,         # no activation-order reordering
    sym=True,               # symmetric quantization
    true_sequential=True,   # layer-wise quantization within each block
    format="gptq",          # v1 weight format
)
```
Theoretical Basis
GPTQ is based on the OBQ (Optimal Brain Quantization) framework. For each weight column w, GPTQ solves:
argmin_q (w - q)^T H (w - q)
where H is the Hessian of the layer loss with respect to weights. The dampening parameter adds λI to H for numerical stability, where λ = damp_percent × mean(diag(H)). Group-wise quantization applies separate scale and zero-point parameters per group_size consecutive weights, allowing finer-grained quantization at the cost of additional storage for the quantization parameters.
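The dampening step can be illustrated numerically. The sketch below (NumPy, not the Optimum implementation) builds the Hessian proxy H = 2XXᵀ from calibration inputs and adds λI with λ = damp_percent × mean(diag(H)):

```python
import numpy as np

def dampened_hessian(X: np.ndarray, damp_percent: float = 0.1) -> np.ndarray:
    """Hessian proxy H = 2 X X^T for the layer's quadratic loss,
    dampened with lambda * I, lambda = damp_percent * mean(diag(H))."""
    H = 2.0 * X @ X.T
    lam = damp_percent * np.mean(np.diag(H))
    return H + lam * np.eye(H.shape[0])

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 64))   # 8 input features, 64 calibration samples
Hd = dampened_hessian(X, damp_percent=0.1)
# Dampening makes the matrix positive definite, hence safely invertible
assert np.all(np.linalg.eigvalsh(Hd) > 0)
```

Without the λI term, a rank-deficient calibration set can leave H singular and the Cholesky-based inverse used by GPTQ would fail.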
The desc_act option (also known as act-order) reorders columns by decreasing activation magnitude before quantization. This ensures that the most important weights (those multiplied by the largest activations) are quantized first, when the accumulated error is smallest.
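The act-order permutation can be sketched by sorting columns by the Hessian diagonal, since diag(XXᵀ) is the squared activation energy of each input feature. This is a simplification: the real implementation also un-permutes the weights after quantization.

```python
import numpy as np

def act_order_permutation(X: np.ndarray) -> np.ndarray:
    """Column indices sorted by decreasing activation magnitude.
    diag(X @ X.T)[i] is the squared activation energy of feature i."""
    diag_H = np.sum(X * X, axis=1)   # equivalent to np.diag(X @ X.T)
    return np.argsort(-diag_H)       # largest energy first

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 32))
X[2] *= 10.0                         # make feature 2 dominate
perm = act_order_permutation(X)
assert perm[0] == 2                  # the most active column is quantized first
```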
Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `bits` | `int` | (required) | Number of bits for quantization. Must be 2, 3, 4, or 8. |
| `dataset` | `str` or `List[str]` | `None` | Calibration dataset name or list of strings. |
| `group_size` | `int` | `128` | Number of weights sharing quantization parameters; `-1` for per-column. |
| `damp_percent` | `float` | `0.1` | Dampening as a fraction of the average Hessian diagonal. |
| `desc_act` | `bool` | `False` | Quantize columns in decreasing activation order. |
| `act_group_aware` | `bool` | `True` | Use GAR (group-aware activation reordering). Only applies when `desc_act=False`. |
| `sym` | `bool` | `True` | Use symmetric quantization. |
| `true_sequential` | `bool` | `True` | Layer-wise quantization within blocks. |
| `format` | `str` | `"gptq"` | Weight format: `gptq` (v1) or `gptq_v2`. |
| `backend` | `str` | `None` | GPTQ inference kernel backend selection. |
Metadata
| Key | Value |
|---|---|
| source Paper | GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers |
| source Repo | optimum |
| domains | Quantization, NLP, Optimization |
Related
- implemented_by → Implementation:Huggingface_Optimum_GPTQQuantizer_Init
- Heuristic:Huggingface_Optimum_GPTQ_Quantization_Defaults