Principle: Symmetric Quantization
| Knowledge Sources | |
|---|---|
| Domains | Deep_Learning, Model_Compression, Inference |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A model compression technique that maps floating-point weight values to lower-precision integers using a linear scale factor; in the symmetric case the zero point is fixed at zero.
Description
Symmetric Quantization reduces the model memory footprint by representing float32 weights as lower-precision integers (typically int8). The general affine scheme uses two parameters, a scale factor (S) and a zero point (Z), to map between the float and integer domains; in the symmetric variant the float range is centered on zero, so Z = 0 and only S must be stored. Quantized values are clamped to the representable range of the target bit width.
Dequantization reverses the process to approximate the original float values, introducing a small quantization error.
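The quantize/dequantize round trip described above can be sketched as follows. This is a minimal illustration, not the source's implementation; the function names (`quantize_symmetric`, `dequantize_symmetric`) are made up for this example.

```python
import numpy as np

def quantize_symmetric(x: np.ndarray, bits: int = 8):
    """Symmetric quantization: zero point is 0, scale taken from max |x|."""
    qmax = 2 ** (bits - 1) - 1           # 127 for int8
    scale = float(np.max(np.abs(x))) / qmax
    x_q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return x_q, scale

def dequantize_symmetric(x_q: np.ndarray, scale: float) -> np.ndarray:
    """Reverse mapping: x ≈ scale * x_q, with a small rounding error."""
    return x_q.astype(np.float32) * scale

w = np.array([-1.5, -0.2, 0.0, 0.7, 1.5], dtype=np.float32)
w_q, s = quantize_symmetric(w)
w_hat = dequantize_symmetric(w_q, s)
```

The reconstruction `w_hat` differs from `w` by at most half a quantization step (`s / 2`), which is the quantization error mentioned above.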
Usage
Use this principle when you need to reduce model size for deployment on memory-constrained devices. 8-bit quantization typically preserves model quality with minimal degradation while cutting memory usage to a quarter of float32 (or half of float16).
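The memory saving is easy to verify directly: each float32 weight takes 4 bytes while its int8 counterpart takes 1. A quick sketch (array shape and names are arbitrary):

```python
import numpy as np

# A toy float32 weight matrix.
w = np.random.randn(1000, 1000).astype(np.float32)

# Symmetric int8 quantization: scale from max |w|, zero point implicitly 0.
scale = float(np.max(np.abs(w))) / 127
w_q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

# float32 stores 4 bytes per element, int8 stores 1: a 4x reduction.
print(w.nbytes, w_q.nbytes)
```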
Theoretical Basis
Given a float range [f_min, f_max] and integer range [q_min, q_max]:
Scale factor: S = (f_max - f_min) / (q_max - q_min)
Zero point: Z = round(q_min - f_min / S)
Quantization: x_q = clamp(round(x / S) + Z, q_min, q_max)
Dequantization: x ≈ S · (x_q - Z)
In the symmetric case the float range is centered on zero (f_max = -f_min = max|x|), so Z = 0 and the scale simplifies to S = max|x| / q_max, with q_max = 2^(b-1) - 1 (127 for int8).
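The formulas above can be checked numerically. The sketch below implements the general affine mapping and shows that for a zero-centered range the zero point comes out as 0, which is exactly the symmetric case (function names `affine_params`, `quantize`, `dequantize` are illustrative):

```python
def affine_params(f_min: float, f_max: float, q_min: int, q_max: int):
    """Compute scale and zero point from the float and integer ranges."""
    scale = (f_max - f_min) / (q_max - q_min)
    zero_point = round(q_min - f_min / scale)
    return scale, zero_point

def quantize(x: float, scale: float, zero_point: int,
             q_min: int, q_max: int) -> int:
    """x_q = clamp(round(x / S) + Z, q_min, q_max)"""
    return max(q_min, min(q_max, round(x / scale) + zero_point))

def dequantize(x_q: int, scale: float, zero_point: int) -> float:
    """x ≈ S * (x_q - Z)"""
    return scale * (x_q - zero_point)

# Zero-centered float range -> zero point is 0 (the symmetric case).
s, z = affine_params(-1.5, 1.5, -127, 127)
x_q = quantize(0.7, s, z, -127, 127)
x_hat = dequantize(x_q, s, z)
```

With `s = 3.0 / 254`, the round trip recovers 0.7 to within half a quantization step, matching the error bound stated in the Description.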