Implementation:Turboderp org Exllamav2 Ext Hadamard
| Knowledge Sources | |
|---|---|
| Domains | Linear_Algebra, Quantization, C_Extension |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
C++ extension implementing Hadamard matrix construction via the Paley method, generating FP16 tensors filled with +1/-1 values for use in quantization-aware transformations.
Description
ext_hadamard.cpp provides two functions for constructing Hadamard matrices using Paley's quadratic residue method:
had_paley(h) constructs a Type-I Paley Hadamard matrix of order n (where n-1 is an odd prime). The algorithm:
- Sets the first row to all +1 values.
- For rows 1..n-1, sets column 0 to -1 (HALF_N).
- For diagonal elements (i==j), sets +1.
- For off-diagonal elements, computes the modular difference (i-j) mod p and tests if it is a quadratic residue using is_quadratic_residue. Residues map to +1, non-residues to -1.
had_paley2(h) constructs a Type-II Paley Hadamard matrix of order 2(p+1) using a block structure. The matrix is filled as a 2x2 block arrangement where each block is (n/2) x (n/2). Diagonal blocks use paired +1/-1 values (HALF_PN, HALF_NN), while off-diagonal blocks use quadratic residue tests on the prime p = n/2 - 1. The function writes pairs of FP16 values simultaneously using uint32_t pointers for efficiency.
Helper functions include:
- pmod(a, b) -- Positive modulo ensuring non-negative results.
- modular_pow(base, exp, mod) -- Efficient modular exponentiation via repeated squaring.
- is_quadratic_residue(a, p) -- Tests whether a is a quadratic residue modulo p using Euler's criterion.
FP16 constants are defined as raw bit patterns: HALF_P (0x3C00 = +1.0), HALF_N (0xBC00 = -1.0), and combined pairs for Type-II construction.
Usage
These functions are called during the quantization process to generate Hadamard matrices used in rotation-based weight transformations. Hadamard matrices help distribute quantization error more uniformly across weight dimensions, improving the quality of low-bit quantized models.
Code Reference
Source Location
- Repository: Turboderp_org_Exllamav2
- File: exllamav2/exllamav2_ext/ext_hadamard.cpp
- Lines: 1-118
Signature
void had_paley(torch::Tensor h);
void had_paley2(torch::Tensor h);
// Helpers
inline int pmod(int a, int b);
inline int modular_pow(int base, int exp, int mod);
inline bool is_quadratic_residue(int a, int p);
Import
# Called via the compiled C++ extension (pybind11)
from exllamav2 import exllamav2_ext as ext_c
ext_c.had_paley(h_tensor)
ext_c.had_paley2(h_tensor)
I/O Contract
| Function | Parameter | Type | Direction | Description |
|---|---|---|---|---|
| had_paley | h | torch.Tensor (FP16) | in/out | Square tensor of shape (n, n) where n-1 is an odd prime; filled in-place with +1/-1 values |
| had_paley2 | h | torch.Tensor (FP16) | in/out | Square tensor of shape (n, n) where n = 2(p+1) for odd prime p; filled in-place with +1/-1 block structure |
| Constraint | Description |
|---|---|
| Dtype | Input tensor must be kHalf (FP16) |
| Shape | Tensor must be square (dim 0 == dim 1) |
| Contiguity | Tensor must be contiguous in memory (had_paley only) |
Usage Examples
import torch
from exllamav2 import exllamav2_ext as ext_c
# Create a 12x12 Hadamard matrix (11 is prime)
h = torch.zeros((12, 12), dtype=torch.float16, device="cpu")
ext_c.had_paley(h)
# h now contains +1.0 and -1.0 values
# Create a Type-II Paley Hadamard matrix of order 28
# p = 13 (prime), n = 2*(13+1) = 28
h2 = torch.zeros((28, 28), dtype=torch.float16, device="cpu")
ext_c.had_paley2(h2)
Related Pages
- Turboderp_org_Exllamav2_FPx_Quantization -- Sub-byte FPx quantization utilities
- Turboderp_org_Exllamav2_Ext_Norm -- GPU normalization operations used alongside quantization