Implementation:Turboderp org Exllamav2 Ext Hadamard

Knowledge Sources	Turboderp_org_Exllamav2
Domains	Linear_Algebra, Quantization, C_Extension
Last Updated	2026-02-15 00:00 GMT

Overview

C++ extension implementing Hadamard matrix construction via the Paley method, generating FP16 tensors filled with +1/-1 values for use in quantization-aware transformations.

Description

ext_hadamard.cpp provides two functions for constructing Hadamard matrices using Paley's quadratic residue method:

had_paley(h) constructs a Type-I Paley Hadamard matrix of order n (where n-1 is an odd prime). The algorithm:

Sets the first row to all +1 values.
For rows 1..n-1, sets column 0 to -1 (HALF_N).
For diagonal elements (i==j), sets +1.
For off-diagonal elements, computes the modular difference (i-j) mod p and tests if it is a quadratic residue using is_quadratic_residue. Residues map to +1, non-residues to -1.

had_paley2(h) constructs a Type-II Paley Hadamard matrix of order 2(p+1) using a block structure. The matrix is filled as a 2x2 block arrangement where each block is (n/2) x (n/2). Diagonal blocks use paired +1/-1 values (HALF_PN, HALF_NN), while off-diagonal blocks use quadratic residue tests on the prime p = n/2 - 1. The function writes pairs of FP16 values simultaneously using uint32_t pointers for efficiency.

Helper functions include:

pmod(a, b) -- Positive modulo ensuring non-negative results.
modular_pow(base, exp, mod) -- Efficient modular exponentiation via repeated squaring.
is_quadratic_residue(a, p) -- Tests whether a is a quadratic residue modulo p using Euler's criterion.

FP16 constants are defined as raw bit patterns: HALF_P (0x3C00 = +1.0), HALF_N (0xBC00 = -1.0), and combined pairs for Type-II construction.

Usage

These functions are called during the quantization process to generate Hadamard matrices used in rotation-based weight transformations. Hadamard matrices help distribute quantization error more uniformly across weight dimensions, improving the quality of low-bit quantized models.

Code Reference

Source Location

Repository: Turboderp_org_Exllamav2
File: exllamav2/exllamav2_ext/ext_hadamard.cpp
Lines: 1-118

Signature

void had_paley(torch::Tensor h);

void had_paley2(torch::Tensor h);

// Helpers
inline int pmod(int a, int b);
inline int modular_pow(int base, int exp, int mod);
inline bool is_quadratic_residue(int a, int p);

Import

# Called via the compiled C++ extension (pybind11)
from exllamav2 import exllamav2_ext as ext_c
ext_c.had_paley(h_tensor)
ext_c.had_paley2(h_tensor)

I/O Contract

Function	Parameter	Type	Direction	Description
had_paley	h	torch.Tensor (FP16)	in/out	Square tensor of shape (n, n) where n-1 is an odd prime; filled in-place with +1/-1 values
had_paley2	h	torch.Tensor (FP16)	in/out	Square tensor of shape (n, n) where n = 2(p+1) for odd prime p; filled in-place with +1/-1 block structure

Constraint	Description
Dtype	Input tensor must be kHalf (FP16)
Shape	Tensor must be square (dim 0 == dim 1)
Contiguity	Tensor must be contiguous in memory (had_paley only)

Usage Examples

import torch
from exllamav2 import exllamav2_ext as ext_c

# Create a 12x12 Hadamard matrix (11 is prime)
h = torch.zeros((12, 12), dtype=torch.float16, device="cpu")
ext_c.had_paley(h)
# h now contains +1.0 and -1.0 values

# Create a Type-II Paley Hadamard matrix of order 28
# p = 13 (prime), n = 2*(13+1) = 28
h2 = torch.zeros((28, 28), dtype=torch.float16, device="cpu")
ext_c.had_paley2(h2)

Related Pages

Turboderp_org_Exllamav2_FPx_Quantization -- Sub-byte FPx quantization utilities
Turboderp_org_Exllamav2_Ext_Norm -- GPU normalization operations used alongside quantization

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment