Implementation:Ggml org Ggml Cpu powerpc quants

Metadata

Field	Value
Page Type	Implementation (Architecture-Specific SIMD)
Knowledge Sources	GGML
Domains	ML_Infrastructure, Tensor_Computing, SIMD_Optimization
Last Updated	2025-05-15 12:00 GMT

Overview

POWER9 VSX SIMD-optimized quantization, dequantization, and dot product routines for GGML quantized tensor formats on IBM PowerPC processors.

Description

arch/powerpc/quants.c implements PowerPC-specific SIMD acceleration for GGML quantization operations, targeting POWER9 and later processors with the VSX (Vector Scalar Extension) instruction set.

The implementation uses the PowerPC AltiVec/VSX intrinsics with vector-qualified types (vector float, vector signed int, and so on). The main intrinsics are:

vec_xl -- aligned/unaligned vector loads
vec_abs / vec_max -- absolute value and maximum operations
vec_round / vec_cts -- rounding and float-to-integer conversion
vec_pack -- truncating (modulo) packing from wider to narrower types; the saturating variant is vec_packs
vec_xst -- vector store

The quantization pattern (e.g., quantize_row_q8_0) processes one block of 32 floats at a time. It loads eight vector float registers of four floats each, takes their absolute values, and reduces them pairwise (a tree reduction) to the block's absolute maximum. From that it derives the scale d = amax / 127, multiplies the inputs by the inverse scale, rounds, and converts to vector signed int. Finally it chains vec_pack calls to narrow from int32 to int16 to int8 before storing.
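The steps in that pattern can be sketched in portable scalar C. This is only an illustration of the arithmetic, not the VSX code: the sketch_ names are hypothetical, and the block stores its scale as a plain float, whereas the real block_q8_0 uses a 16-bit half float.

```c
#include <stdint.h>

#define SKETCH_QK 32  /* values per block, matching QK8_0 */

/* Hypothetical block layout (assumption): float scale instead of half. */
typedef struct {
    float  d;               /* per-block scale */
    int8_t qs[SKETCH_QK];   /* quantized values */
} sketch_block_q8;

/* Quantize k floats; k must be a multiple of SKETCH_QK. */
static void sketch_quantize_row_q8(const float *x, sketch_block_q8 *y, int64_t k) {
    const int64_t nb = k / SKETCH_QK;
    for (int64_t i = 0; i < nb; i++) {
        const float *xb = x + i * SKETCH_QK;

        /* Step 1: absolute maximum of the block. The SIMD path reduces
         * eight 4-lane vectors pairwise instead of looping. */
        float amax = 0.0f;
        for (int j = 0; j < SKETCH_QK; j++) {
            const float v = xb[j] < 0.0f ? -xb[j] : xb[j];
            if (v > amax) amax = v;
        }

        /* Step 2: scale so that amax maps to 127; guard the all-zero block. */
        const float d  = amax / 127.0f;
        const float id = d != 0.0f ? 1.0f / d : 0.0f;
        y[i].d = d;

        /* Step 3: multiply, round half away from zero, and narrow.
         * Results lie in [-127, 127], so int32 -> int8 loses nothing. */
        for (int j = 0; j < SKETCH_QK; j++) {
            const float v = xb[j] * id;
            y[i].qs[j] = (int8_t) (int) (v + (v >= 0.0f ? 0.5f : -0.5f));
        }
    }
}
```

Because every rounded value already fits in 8 bits, a truncating narrow and a saturating narrow give the same result at the last step.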

Precomputed bit-expansion tables (table_b2b_0, table_b2b_1) support efficient unpacking of sub-byte quantized formats. All SIMD paths are guarded by #if defined(__POWER9_VECTOR__) and fall back to scalar reference implementations otherwise.
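As a rough illustration of what a bit-expansion table does, the sketch below builds a 256-entry table in which each bit of the index becomes one byte of a 64-bit word. It assumes that a set bit expands to the byte value 0x10 and that bit j lands in byte j; the sketch_ names are hypothetical, and the actual contents of table_b2b_0 and table_b2b_1 should be checked against the source.

```c
#include <stdint.h>

/* Expand the 8 bits of `bits` into the 8 bytes of a 64-bit word:
 * byte j holds 0x10 when bit j is set, otherwise 0x00 (assumed mapping). */
static uint64_t sketch_expand_bits(uint8_t bits) {
    uint64_t out = 0;
    for (int j = 0; j < 8; j++) {
        if (bits & (1u << j)) {
            out |= (uint64_t) 0x10 << (8 * j);
        }
    }
    return out;
}

/* Precompute all 256 expansions so unpacking costs one table load per
 * input byte instead of eight shift-and-mask steps. */
static void sketch_build_table(uint64_t table[256]) {
    for (int i = 0; i < 256; i++) {
        table[i] = sketch_expand_bits((uint8_t) i);
    }
}
```

The value 0x10 places the expanded bit where it can be OR-ed onto a 4-bit value to form a 5-bit one, which is the usual reason for this kind of table in 5-bit formats.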

Usage

This file is compiled as part of the GGML CPU backend when targeting PowerPC platforms with POWER9+ vector support. It enables efficient ML inference on IBM POWER server hardware.
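To check which path a given build would take, the same preprocessor guard can be probed directly. The sketch below is a hypothetical helper, not part of GGML; it assumes the compiler predefines __POWER9_VECTOR__ when POWER9 vector support is enabled (for example via -mcpu=power9).

```c
/* Report which code path this translation unit would select.
 * sketch_quants_path is a hypothetical name used only for illustration. */
static const char *sketch_quants_path(void) {
#if defined(__POWER9_VECTOR__)
    return "power9-vsx";       /* SIMD implementations are compiled in */
#else
    return "scalar-fallback";  /* reference implementations are used */
#endif
}
```

Printing the returned string from a small test program built with the project's flags is a quick way to confirm that the vector path is actually enabled.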

Code Reference

Source Location

GGML repo, file: src/ggml-cpu/arch/powerpc/quants.c (2305 lines).

Key Signatures

void quantize_row_q8_0(const float * GGML_RESTRICT x, void * GGML_RESTRICT vy, int64_t k);
void quantize_row_q8_1(const float * GGML_RESTRICT x, void * GGML_RESTRICT vy, int64_t k);

void ggml_vec_dot_q4_0_q8_0(int n, float * GGML_RESTRICT s, size_t bs,
    const void * GGML_RESTRICT vx, size_t bx,
    const void * GGML_RESTRICT vy, size_t by, int nrc);

Import

#include "ggml-quants.h"
#include "ggml-cpu.h"
#include "simd-mappings.h"

I/O Contract

Inputs (Quantization)

Parameter	Type	Description
`x`	`const float *`	Source array of floating-point values to be quantized.
`k`	`int64_t`	Number of elements to quantize. Must be a multiple of the block size (QK8_0 = 32 for Q8_0, QK8_1 = 32 for Q8_1).

Outputs (Quantization)

Output	Type	Description
`vy`	`void *`	Destination buffer for the quantized blocks. It must hold k / 32 blocks of type block_q8_0 or block_q8_1.
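Sizing the destination buffer follows from the block layout. The sketch below is a hypothetical helper that assumes a Q8_0 block is a 16-bit scale followed by 32 int8 values (34 bytes); real code should use sizeof(block_q8_0) rather than a hand-computed constant.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_QK8_0 32
/* Assumed layout: 2-byte half-float scale + 32 one-byte quants. */
#define SKETCH_BLOCK_Q8_0_BYTES (sizeof(uint16_t) + SKETCH_QK8_0)

/* Bytes needed to hold k quantized values, or 0 if k is negative or
 * not a multiple of the block size. */
static size_t sketch_q8_0_row_bytes(int64_t k) {
    if (k < 0 || k % SKETCH_QK8_0 != 0) {
        return 0;
    }
    return (size_t) (k / SKETCH_QK8_0) * SKETCH_BLOCK_Q8_0_BYTES;
}
```

For k = 256 this gives 8 blocks, which matches the output array length 256 / QK8_0 used in the usage example.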

Inputs (Dot Product)

Parameter	Type	Description
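`n`	`int`	Number of elements in each input vector. Must be a multiple of the block size (32 for Q4_0 and Q8_0).
`bs`	`size_t`	Stride between result rows in `s`. Only relevant when `nrc` is greater than 1.
`vx`	`const void *`	Pointer to quantized weight data.
`bx`	`size_t`	Stride between rows of `vx`. Only relevant when `nrc` is greater than 1.
`vy`	`const void *`	Pointer to quantized activation data.
`by`	`size_t`	Stride between rows of `vy`. Only relevant when `nrc` is greater than 1.
`nrc`	`int`	Number of rows to compute simultaneously. Pass 1 for a single dot product.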

Outputs (Dot Product)

Output	Type	Description
`s`	`float *`	Destination for the computed dot product result(s).
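The value written to `s` can be described with a scalar reference computation. The sketch below is illustrative only: the sketch_ types use float scales (the real block_q4_0 and block_q8_0 store half floats), and it assumes the Q4_0 nibble layout in which the low nibble of byte j is element j and the high nibble is element j + 16, each offset by 8.

```c
#include <stdint.h>

#define SKETCH_QK 32

/* Hypothetical layouts (assumption): float scales instead of half. */
typedef struct { float d; uint8_t qs[SKETCH_QK / 2]; } sketch_block_q4;
typedef struct { float d; int8_t  qs[SKETCH_QK];     } sketch_block_q8;

/* Dot product of n quantized values; n must be a multiple of SKETCH_QK. */
static float sketch_vec_dot_q4_q8(int n, const sketch_block_q4 *x,
                                  const sketch_block_q8 *y) {
    const int nb = n / SKETCH_QK;
    float sum = 0.0f;
    for (int ib = 0; ib < nb; ib++) {
        int32_t isum = 0;  /* integer accumulator for one block */
        for (int j = 0; j < SKETCH_QK / 2; j++) {
            /* Unpack two 4-bit weights and recentre them to [-8, 7]. */
            const int v0 = (x[ib].qs[j] & 0x0F) - 8;
            const int v1 = (x[ib].qs[j] >> 4) - 8;
            isum += v0 * y[ib].qs[j] + v1 * y[ib].qs[j + SKETCH_QK / 2];
        }
        /* Apply both block scales once per block. */
        sum += (float) isum * x[ib].d * y[ib].d;
    }
    return sum;
}
```

Keeping the inner accumulation in integers and applying the two scales once per block is what makes the SIMD versions efficient: the vector units only see int8 multiplies and integer adds inside the block.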

Usage Examples

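// Quantize a row of 256 floats into Q8_0 blocks (VSX path on POWER9+ builds)
float input[256] = {0};             // fill with real data before quantizing
block_q8_0 output[256 / QK8_0];     // one block per QK8_0 (32) input values

quantize_row_q8_0(input, output, 256);

// Compute a quantized dot product of one Q4_0 row with one Q8_0 row.
// Both arrays must already hold quantized data (filling them is omitted here).
block_q4_0 weight_blocks[256 / QK4_0];
block_q8_0 activation_blocks[256 / QK8_0];
float result;

// The stride arguments (bs, bx, by) only matter when nrc > 1.
ggml_vec_dot_q4_0_q8_0(256, &result, sizeof(result),
    weight_blocks, sizeof(block_q4_0),
    activation_blocks, sizeof(block_q8_0), 1);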

Related Pages

Principle:Ggml_org_Ggml_Architecture_Specific_SIMD_Quantization
Implementation:Ggml_org_Ggml_Cpu_arm_quants -- ARM NEON equivalent
Implementation:Ggml_org_Ggml_Cpu_x86_quants -- x86 SSE/AVX equivalent
Implementation:Ggml_org_Ggml_Cpu_loongarch_quants -- LoongArch LSX equivalent
Implementation:Ggml_org_Ggml_Cpu_riscv_quants -- RISC-V RVV equivalent
Implementation:Ggml_org_Ggml_Cpu_s390_quants -- s390x VXE equivalent
Implementation:Ggml_org_Ggml_Cpu_wasm_quants -- WebAssembly SIMD128 equivalent
