Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Turboderp org Exllamav2 Ext Hadamard

From Leeroopedia
Knowledge Sources
Domains Linear_Algebra, Quantization, C_Extension
Last Updated 2026-02-15 00:00 GMT

Overview

C++ extension implementing Hadamard matrix construction via the Paley method, generating FP16 tensors filled with +1/-1 values for use in quantization-aware transformations.

Description

ext_hadamard.cpp provides two functions for constructing Hadamard matrices using Paley's quadratic residue method:

had_paley(h) constructs a Type-I Paley Hadamard matrix of order n (where n-1 is an odd prime). The algorithm:

  1. Sets the first row to all +1 values.
  2. For rows 1..n-1, sets column 0 to -1 (HALF_N).
  3. For diagonal elements (i==j), sets +1.
  4. For off-diagonal elements, computes the modular difference (i-j) mod p and tests if it is a quadratic residue using is_quadratic_residue. Residues map to +1, non-residues to -1.

had_paley2(h) constructs a Type-II Paley Hadamard matrix of order 2(p+1) using a block structure. The matrix is filled as a 2x2 block arrangement where each block is (n/2) x (n/2). Diagonal blocks use paired +1/-1 values (HALF_PN, HALF_NN), while off-diagonal blocks use quadratic residue tests on the prime p = n/2 - 1. The function writes pairs of FP16 values simultaneously using uint32_t pointers for efficiency.

Helper functions include:

  • pmod(a, b) -- Positive modulo ensuring non-negative results.
  • modular_pow(base, exp, mod) -- Efficient modular exponentiation via repeated squaring.
  • is_quadratic_residue(a, p) -- Tests whether a is a quadratic residue modulo p using Euler's criterion.

FP16 constants are defined as raw bit patterns: HALF_P (0x3C00 = +1.0), HALF_N (0xBC00 = -1.0), and combined pairs for Type-II construction.

Usage

These functions are called during the quantization process to generate Hadamard matrices used in rotation-based weight transformations. Hadamard matrices help distribute quantization error more uniformly across weight dimensions, improving the quality of low-bit quantized models.

Code Reference

Source Location

Signature

void had_paley(torch::Tensor h);

void had_paley2(torch::Tensor h);

// Helpers
inline int pmod(int a, int b);
inline int modular_pow(int base, int exp, int mod);
inline bool is_quadratic_residue(int a, int p);

Import

# Called via the compiled C++ extension (pybind11)
from exllamav2 import exllamav2_ext as ext_c
ext_c.had_paley(h_tensor)
ext_c.had_paley2(h_tensor)

I/O Contract

Function Parameter Type Direction Description
had_paley h torch.Tensor (FP16) in/out Square tensor of shape (n, n) where n-1 is an odd prime; filled in-place with +1/-1 values
had_paley2 h torch.Tensor (FP16) in/out Square tensor of shape (n, n) where n = 2(p+1) for odd prime p; filled in-place with +1/-1 block structure
Constraint Description
Dtype Input tensor must be kHalf (FP16)
Shape Tensor must be square (dim 0 == dim 1)
Contiguity Tensor must be contiguous in memory (had_paley only)

Usage Examples

import torch
from exllamav2 import exllamav2_ext as ext_c

# Create a 12x12 Hadamard matrix (11 is prime)
h = torch.zeros((12, 12), dtype=torch.float16, device="cpu")
ext_c.had_paley(h)
# h now contains +1.0 and -1.0 values

# Create a Type-II Paley Hadamard matrix of order 28
# p = 13 (prime), n = 2*(13+1) = 28
h2 = torch.zeros((28, 28), dtype=torch.float16, device="cpu")
ext_c.had_paley2(h2)

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment