Implementation:NVIDIA TransformerEngine Debug Disable Quant Layer

Field	Value
Sources	TransformerEngine
Domains	Deep_Learning, PyTorch, Debug, Quantization
Last Updated	2026-02-07 14:00 GMT

Overview

Debug feature that disables all quantized GEMMs in an entire layer, forcing full high-precision execution.

Description

DisableQuantizationLayer is a layer-level debug feature that disables every quantized GEMM in the selected layers. Unlike DisableQuantizationGEMM, which targets individual GEMMs, it affects all three GEMMs of a layer (fprop, dgrad, wgrad). It does not inherit from TEConfigAPIMapper because it needs no GEMM-level or tensor-level routing: the decision applies to the whole layer. Its parse_config_and_api therefore only reads the enabled field of the config.

Usage

Enable the feature in the YAML config and select the target layers by type or by regex pattern. All quantized operations in the matched layers then run in high precision.

Code Reference

Source Location

Repository: NVIDIA/TransformerEngine
File: transformer_engine/debug/features/disable_quantization_layer.py
Lines: 1-61

Signature

@Registry.register_feature(namespace="transformer_engine")
class DisableQuantizationLayer:
    def fp8_gemm_enabled(self, config, layer_name, gemm, iteration) -> Tuple[bool, int]: ...
    def parse_config_and_api(self, config, **_kwargs) -> Tuple[bool, None]: ...

Import

from transformer_engine.debug.features.disable_quantization_layer import DisableQuantizationLayer

I/O Contract

Inputs

Name	Type	Required	Description
config	Dict	Yes	Must contain `enabled: True`
layer_name	str	Yes	Name of the TE layer
gemm	str	Yes	One of `fprop`, `dgrad`, `wgrad`
iteration	int	Yes	Current training step

Outputs

Name	Type	Description
result	Tuple[bool, int]	Always `(False, iteration + 1)`: quantization is disabled regardless of which GEMM is queried
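
The contract in the tables above can be illustrated with a minimal stand-in class. This is a sketch only, not the TransformerEngine source: the class name `DisableQuantizationLayerSketch` and the layer name `decoder.1.fc1` are hypothetical, and the registry decorator from the signature is omitted so the snippet runs without TransformerEngine installed.

```python
from typing import Any, Dict, Tuple


class DisableQuantizationLayerSketch:
    """Hypothetical stand-in that mirrors the documented I/O contract."""

    def parse_config_and_api(
        self, config: Dict[str, Any], **_kwargs: Any
    ) -> Tuple[bool, None]:
        # Only the `enabled` field decides whether the feature runs;
        # there is no GEMM-level or tensor-level routing to resolve.
        return config["enabled"], None

    def fp8_gemm_enabled(
        self, config: Dict[str, Any], layer_name: str, gemm: str, iteration: int
    ) -> Tuple[bool, int]:
        # The same answer is returned for every GEMM of the layer.
        return False, iteration + 1


feature = DisableQuantizationLayerSketch()
config = {"enabled": True}

run_feature, _ = feature.parse_config_and_api(config)

# Query all three GEMMs of one (hypothetical) layer at iteration 10.
results = {
    gemm: feature.fp8_gemm_enabled(config, "decoder.1.fc1", gemm, iteration=10)
    for gemm in ("fprop", "dgrad", "wgrad")
}
print(run_feature, results)
```

Every entry in `results` is the same tuple, which is the difference from a GEMM-level feature: the `gemm` argument is accepted but does not influence the outcome.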

Usage Examples

# YAML configuration:
# example_disable_quantization_layer:
#   enabled: True
#   layers:
#     layer_types: [fc1]
#   transformer_engine:
#     DisableQuantizationLayer:
#       enabled: True
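
To use the configuration above, it has to exist as a file on disk. The following sketch writes it out with the standard library only. The file name `debug_config.yaml` and the helper `write_debug_config` are hypothetical, and the temporary directory is used only to keep the example self-contained.

```python
import tempfile
from pathlib import Path

# Same configuration as in the usage example, as real YAML text.
CONFIG_YAML = """\
example_disable_quantization_layer:
  enabled: True
  layers:
    layer_types: [fc1]
  transformer_engine:
    DisableQuantizationLayer:
      enabled: True
"""


def write_debug_config(directory: str) -> Path:
    """Write the debug config into `directory` and return its path."""
    path = Path(directory) / "debug_config.yaml"  # hypothetical file name
    path.write_text(CONFIG_YAML)
    return path


config_path = write_debug_config(tempfile.mkdtemp())
print(config_path)
```

The resulting path is what the debug tooling would be pointed at during initialization. That call is not shown here because its arguments depend on the installed version.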

Related Pages

Environment:NVIDIA_TransformerEngine_CUDA_Toolkit_Requirements
