Implementation:NVIDIA TransformerEngine Debug Disable Quant GEMM

Field	Value
Sources	TransformerEngine
Domains	Deep_Learning, PyTorch, Debug, Quantization
Last Updated	2026-02-07 14:00 GMT

Overview

Debug feature that disables quantization for specific GEMM operations (fprop, dgrad, wgrad), forcing them to execute in high precision.

Description

DisableQuantizationGEMM is a debug feature that selectively disables quantized GEMM execution for the operations named in its config. When it is enabled for a GEMM, that operation runs in high precision instead of FP8/NVFP4. The class inherits from TEConfigAPIMapper, which handles config-based GEMM/tensor routing, and overrides the fp8_gemm_enabled API call so that it reports quantization as disabled (False) for the matched GEMMs.
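The behaviour described above can be sketched without TransformerEngine installed. The class and router below are hypothetical stand-ins written for illustration, not the library's code: `DisableQuantizationGEMMSketch`, `quantized_gemm_enabled`, and the assumption that an unmatched GEMM stays quantized are all illustrative. The return value follows the `(False, iteration + 1)` contract documented on this page.

```python
from typing import Dict, Iterable, Tuple

VALID_GEMMS = ("fprop", "dgrad", "wgrad")


class DisableQuantizationGEMMSketch:
    """Hypothetical stand-in for illustration; not the TransformerEngine class."""

    def fp8_gemm_enabled(
        self, config: Dict, layer_name: str, gemm: str, iteration: int
    ) -> Tuple[bool, int]:
        # Reaching this method means the config selected this GEMM,
        # so the answer is always "do not quantize".
        return False, iteration + 1


def quantized_gemm_enabled(
    selected_gemms: Iterable[str], layer_name: str, gemm: str, iteration: int
) -> bool:
    """Toy router (assumption): only GEMMs listed in the config reach the feature."""
    if gemm not in VALID_GEMMS:
        raise ValueError(f"unknown GEMM: {gemm!r}")
    selected = list(selected_gemms)
    if gemm in selected:
        enabled, _ = DisableQuantizationGEMMSketch().fp8_gemm_enabled(
            {"gemms": selected}, layer_name, gemm, iteration
        )
        return enabled
    # Assumed default when no debug feature matches: quantization stays on.
    return True


# Which GEMMs of layer "fc1" remain quantized when dgrad and wgrad are selected?
plan = {
    g: quantized_gemm_enabled(["dgrad", "wgrad"], "fc1", g, iteration=0)
    for g in VALID_GEMMS
}
print(plan)  # {'fprop': True, 'dgrad': False, 'wgrad': False}
```

In this sketch only the GEMMs listed in the config are forced to high precision; fprop is left untouched, which mirrors the selective behaviour the feature is meant to provide.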

Usage

Enable the feature in the YAML debug config and list the GEMMs (`fprop`, `dgrad`, `wgrad`) that should run in high precision. This is useful for isolating numerical issues in a specific GEMM operation during quantized training.

Code Reference

Source Location

Repository: NVIDIA/TransformerEngine
File: transformer_engine/debug/features/disable_quantization_gemm.py
Lines: 1-59

Signature

@Registry.register_feature(namespace="transformer_engine")
class DisableQuantizationGEMM(TEConfigAPIMapper):
    def fp8_gemm_enabled(self, config, layer_name, gemm, iteration) -> Tuple[bool, int]: ...

Import

from transformer_engine.debug.features.disable_quantization_gemm import DisableQuantizationGEMM

I/O Contract

Inputs

Name	Type	Required	Description
config	Dict	Yes	Must contain `gemms` list specifying which GEMMs to disable
layer_name	str	Yes	Name of the TE layer
gemm	str	Yes	One of `fprop`, `dgrad`, `wgrad`
iteration	int	Yes	Current training step

Outputs

Name	Type	Description
result	Tuple[bool, int]	Returns `(False, iteration + 1)` to disable quantized GEMM

Usage Examples

# YAML configuration:
# example_disable_quantization_gemm:
#   enabled: True
#   layers:
#     layer_types: [fc1]
#   transformer_engine:
#     DisableQuantizationGEMM:
#       enabled: True
#       gemms: [dgrad, wgrad]
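
When hunting for the GEMM responsible for a numerical issue, it can help to generate one config per GEMM and run each in turn. The sketch below renders YAML text in the same layout as the example above using only the standard library. The helper `render_config` and the section names it produces are hypothetical; how the resulting text is saved and handed to the debug tooling is left to the reader's setup.

```python
from typing import Dict, List

GEMMS = ("fprop", "dgrad", "wgrad")


def render_config(section: str, layer_types: List[str], gemms: List[str]) -> str:
    """Render a YAML debug config with the same layout as the example above."""
    for g in gemms:
        if g not in GEMMS:
            raise ValueError(f"unknown GEMM: {g!r}")
    lines = [
        f"{section}:",
        "  enabled: True",
        "  layers:",
        f"    layer_types: [{', '.join(layer_types)}]",
        "  transformer_engine:",
        "    DisableQuantizationGEMM:",
        "      enabled: True",
        f"      gemms: [{', '.join(gemms)}]",
    ]
    return "\n".join(lines) + "\n"


# One config per GEMM, so each run forces exactly one GEMM to high precision.
# The section names here are illustrative, not required by the library.
configs: Dict[str, str] = {
    g: render_config(f"disable_quantization_{g}", ["fc1"], [g]) for g in GEMMS
}
print(configs["wgrad"])
```

If training becomes stable only when one particular GEMM runs in high precision, that GEMM is a natural place to look for the quantization problem.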

Related Pages

Environment:NVIDIA_TransformerEngine_CUDA_Toolkit_Requirements
