Implementation:NVIDIA TransformerEngine Debug Disable Quant Layer
| Field | Value |
|---|---|
| Sources | TransformerEngine |
| Domains | Deep_Learning, PyTorch, Debug, Quantization |
| Last Updated | 2026-02-07 14:00 GMT |
Overview
Debug feature that disables all quantized GEMMs in an entire layer, forcing full high-precision execution.
Description
DisableQuantizationLayer is a layer-level debug feature that disables all quantized GEMM operations in the selected layers. Unlike DisableQuantizationGEMM, which targets specific GEMMs, it affects every GEMM in the layer (fprop, dgrad, and wgrad). It does not inherit from TEConfigAPIMapper, because it needs no GEMM-level or tensor-level routing: the feature applies to the whole layer. Its parse_config_and_api only reads the enabled field of the config.
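The behavior described above can be sketched with a standalone stand-in class. This is not the Transformer Engine source: `DisableQuantizationLayerSketch` is a hypothetical name, the registry decorator and any logging are omitted, the layer name is illustrative, and only the return values documented on this page are modeled.

```python
from typing import Any, Dict, Tuple


class DisableQuantizationLayerSketch:
    """Hypothetical stand-in that mirrors the documented return values only."""

    def parse_config_and_api(
        self, config: Dict[str, Any], **_kwargs: Any
    ) -> Tuple[bool, None]:
        # No GEMM-level or tensor-level routing: only the `enabled` flag matters.
        return bool(config["enabled"]), None

    def fp8_gemm_enabled(
        self, config: Dict[str, Any], layer_name: str, gemm: str, iteration: int
    ) -> Tuple[bool, int]:
        # Same answer for fprop, dgrad, and wgrad: quantization is off.
        return False, iteration + 1


feature = DisableQuantizationLayerSketch()
config = {"enabled": True}

# "decoder.1.fc1" is an illustrative layer name, not taken from a real model.
results = {
    gemm: feature.fp8_gemm_enabled(config, "decoder.1.fc1", gemm, iteration=10)
    for gemm in ("fprop", "dgrad", "wgrad")
}
```

Because the `gemm` argument is ignored, every GEMM in a matched layer receives the same result, which is what distinguishes this feature from the per-GEMM variant.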
Usage
Enable via YAML config, selecting layers by type or regex pattern. All quantized operations in the matched layers will run in high precision.
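How a layer ends up selected can be illustrated with a small matcher. This is a sketch under assumptions, not the actual selection code: `layer_is_selected` is a hypothetical helper, it assumes that a layer type matches when it occurs as a substring of the layer name, and it assumes that a regex must match the full layer name. The parameter name `name_pattern` and the layer names are illustrative only.

```python
import re
from typing import Optional, Sequence


def layer_is_selected(
    layer_name: str,
    layer_types: Optional[Sequence[str]] = None,
    name_pattern: Optional[str] = None,
) -> bool:
    """Hypothetical matcher: select a layer by type substring or by regex."""
    # Assumption: a type such as "fc1" matches if it appears in the layer name.
    if layer_types and any(t in layer_name for t in layer_types):
        return True
    # Assumption: the regex has to match the whole layer name.
    if name_pattern and re.fullmatch(name_pattern, layer_name):
        return True
    return False


# Illustrative layer names, not taken from a real model.
names = ["decoder.1.fc1", "decoder.1.fc2", "decoder.2.fc1"]
by_type = [n for n in names if layer_is_selected(n, layer_types=["fc1"])]
by_regex = [n for n in names if layer_is_selected(n, name_pattern=r"decoder\.1\..*")]
```

Selecting by type is convenient for turning off quantization in one kind of sublayer everywhere, while a regex narrows the change to specific blocks.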
Code Reference
Source Location
- Repository: NVIDIA/TransformerEngine
- File: transformer_engine/debug/features/disable_quantization_layer.py
- Lines: 1-61
Signature
@Registry.register_feature(namespace="transformer_engine")
class DisableQuantizationLayer:
    def fp8_gemm_enabled(self, config, layer_name, gemm, iteration) -> Tuple[bool, int]: ...
    def parse_config_and_api(self, config, **_kwargs) -> Tuple[bool, None]: ...
Import
from transformer_engine.debug.features.disable_quantization_layer import DisableQuantizationLayer
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| config | Dict | Yes | Must contain enabled: True |
| layer_name | str | Yes | Name of the TE layer |
| gemm | str | Yes | One of fprop, dgrad, wgrad |
| iteration | int | Yes | Current training step |
Outputs
| Name | Type | Description |
|---|---|---|
| result | Tuple[bool, int] | Returns (False, iteration + 1) to disable quantization for all GEMMs |
Usage Examples
# YAML configuration:
# example_disable_quantization_layer:
#   enabled: True
#   layers:
#     layer_types: [fc1]
#   transformer_engine:
#     DisableQuantizationLayer:
#       enabled: True
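To try the configuration, one option is to write it to a file and hand that file to the debug tooling. The sketch below uses only the standard library to write the YAML from the example; the `write_debug_config` helper and the file name are hypothetical. The commented-out call at the end assumes the `nvdlfw_inspect` package and its `debug_api.initialize` entry point as described in the Transformer Engine debug documentation, and the feature directory path is a placeholder to adapt.

```python
import tempfile
from pathlib import Path

CONFIG_YAML = """\
example_disable_quantization_layer:
  enabled: True
  layers:
    layer_types: [fc1]
  transformer_engine:
    DisableQuantizationLayer:
      enabled: True
"""


def write_debug_config(directory: str, name: str = "debug_config.yaml") -> Path:
    """Hypothetical helper: write the YAML config and return its path."""
    path = Path(directory) / name
    path.write_text(CONFIG_YAML, encoding="utf-8")
    return path


config_path = write_debug_config(tempfile.mkdtemp())

# Assumed usage, not executed here (requires nvdlfw_inspect and Transformer Engine):
# import nvdlfw_inspect.api as debug_api
# debug_api.initialize(
#     config_file=str(config_path),
#     feature_dirs=["/path/to/transformer_engine/debug/features"],
# )
```

Keeping the config in a separate file means the feature can be switched on or off by editing `enabled` without touching the training script.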