Implementation:NVIDIA TransformerEngine Common Header

From Leeroopedia
Revision as of 15:57, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/NVIDIA_TransformerEngine_Common_Header.md)


Field | Value
Sources | TransformerEngine
Domains | Deep_Learning, Optimization
Last Updated | 2026-02-07 14:00 GMT

Overview

Core internal header defining the fundamental data structures and type system for TransformerEngine's C++ layer, including SimpleTensor, Tensor, GroupedTensor, and dtype/scaling mode utilities.

Description

common.h is the most fundamental header in TransformerEngine's common library, included by virtually every other C++ source file. It defines:

  • SimpleTensor: Lightweight wrapper around a data pointer, shape vector, and dtype with numel(), has_data(), and buffer_size_bytes() methods.
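  • Tensor: Composes several SimpleTensor members (data and columnwise_data) with quantization-related fields (amax, scale, scale_inv, and their columnwise variants), the active scaling mode, and a flag recording whether the scales are stored in GEMM-swizzled layout. It holds SimpleTensor objects as members rather than inheriting from SimpleTensor.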
  • Scaling mode helpers: Inline functions is_tensor_scaling, is_block_scaling, is_delayed_tensor_scaling, is_mxfp8_scaling, is_nvfp4_scaling for checking scaling modes.
  • Type traits: Template metaprogramming via TypeInfo, TypeExtrema, is_fp8, is_fp4 for compile-time type information on CUDA numeric types (FP16, BF16, FP8 E4M3/E5M2, FP4).
  • Utility functions: product() for shape products, get_buffer_size_bytes() for memory calculations.
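The relationship between the shape product, numel(), and buffer_size_bytes() can be illustrated with a small self-contained sketch. Everything in the sketch namespace is a stand-in written for this page, not the contents of common.h: the reduced DType enum and the element_size_bytes helper are assumptions, and the sketch ignores sub-byte types such as FP4, which the real header has to account for.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Stand-in for the library's DType enum; only three members are modeled.
enum class DType { kByte, kFloat16, kFloat32 };

// Hypothetical helper: element size in bytes for the stand-in DType.
inline std::size_t element_size_bytes(DType dtype) {
  switch (dtype) {
    case DType::kByte:    return 1;
    case DType::kFloat16: return 2;
    case DType::kFloat32: return 4;
  }
  return 0;  // unreachable for valid enumerators
}

// Product of all dimensions; an empty shape yields 1 (a scalar).
inline std::size_t product(const std::vector<std::size_t> &shape) {
  std::size_t n = 1;
  for (std::size_t dim : shape) n *= dim;
  return n;
}

// Simplified analogue of SimpleTensor: pointer + shape + dtype.
struct SimpleTensor {
  void *dptr = nullptr;
  std::vector<std::size_t> shape;
  DType dtype = DType::kFloat32;

  std::size_t numel() const { return product(shape); }
  bool has_data() const { return dptr != nullptr; }
  std::size_t buffer_size_bytes() const {
    return numel() * element_size_bytes(dtype);
  }
};

// Small self-check: a 4 x 8 FP16 tensor has 32 elements and needs 64 bytes.
inline void self_check() {
  SimpleTensor t;
  t.shape = {4, 8};
  t.dtype = DType::kFloat16;
  assert(t.numel() == 32);
  assert(t.buffer_size_bytes() == 64);
}

}  // namespace sketch
```

The wrapper does not own the memory behind dptr in this sketch; it only describes a buffer that some other component allocated.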

Usage

Include this header in any C++ file that works with TransformerEngine tensor representations. It is the foundation header upon which all other TE common headers depend.

Code Reference

Source Location

Repository: NVIDIA/TransformerEngine
File: transformer_engine/common/common.h
Lines: 1-912

Signature

namespace transformer_engine {

struct SimpleTensor {
  void *dptr;
  std::vector<size_t> shape;
  DType dtype;
  size_t numel() const;
  bool has_data() const;
  size_t buffer_size_bytes() const;
};

struct Tensor {
  SimpleTensor data;
  SimpleTensor columnwise_data;
  SimpleTensor amax, columnwise_amax;
  SimpleTensor scale, scale_inv, columnwise_scale_inv;
  NVTEScalingMode scaling_mode;
  bool with_gemm_swizzled_scales = false;
};

inline bool is_tensor_scaling(const NVTEScalingMode &mode);
inline bool is_block_scaling(const NVTEScalingMode &mode);
inline bool is_mxfp8_scaling(const NVTEScalingMode &mode);

}  // namespace transformer_engine
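The scaling-mode helpers declared above are simple predicates over the mode value. The sketch below shows one plausible shape for such predicates and how a caller might dispatch on them. The ScalingMode enum, its enumerator names, and the describe function are hypothetical stand-ins, and the rule that every non-tensor mode counts as block scaling is an assumption of this sketch, not a statement about the real header.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Stand-in for NVTEScalingMode; enumerator names are hypothetical.
enum class ScalingMode { kDelayedTensor, kMxfp8Block1D, kBlock1D, kBlock2D };

// One scale factor for the whole tensor.
inline bool is_tensor_scaling(const ScalingMode &mode) {
  return mode == ScalingMode::kDelayedTensor;
}

// MXFP8: one scale factor per small block of elements.
inline bool is_mxfp8_scaling(const ScalingMode &mode) {
  return mode == ScalingMode::kMxfp8Block1D;
}

// Assumption of this sketch: anything that is not per-tensor is block scaling.
inline bool is_block_scaling(const ScalingMode &mode) {
  return !is_tensor_scaling(mode);
}

// Example dispatch: test the most specific predicate first.
inline const char *describe(const ScalingMode &mode) {
  if (is_mxfp8_scaling(mode)) return "mxfp8";
  if (is_block_scaling(mode)) return "block";
  return "tensor";
}

// Small self-check of the dispatch order.
inline void self_check() {
  assert(std::string(describe(ScalingMode::kMxfp8Block1D)) == "mxfp8");
  assert(std::string(describe(ScalingMode::kDelayedTensor)) == "tensor");
}

}  // namespace sketch
```

Checking the specific predicate before the general one matters here, because an MXFP8 mode also satisfies the broader block-scaling test.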

Import

#include "common/common.h"

I/O Contract

Inputs

Name | Type | Required | Description
N/A | N/A | N/A | This is a header file defining types and utilities

Outputs

Name | Type | Description
N/A | N/A | Provides fundamental type definitions used throughout the library

Usage Examples

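#include "common/common.h"

using namespace transformer_engine;

// data_ptr, batch_size, hidden_size, and tensor are supplied by the caller.
void example(void *data_ptr, size_t batch_size, size_t hidden_size,
             const Tensor &tensor) {
    // Wrap an existing buffer as a 2D FP16 tensor (pointer, shape, dtype).
    SimpleTensor t(data_ptr, {batch_size, hidden_size}, DType::kFloat16);

    // Branch on the tensor's scaling mode.
    if (is_mxfp8_scaling(tensor.scaling_mode)) {
        // Handle MXFP8 block scaling
    }
}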
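
The compile-time type traits mentioned in the Description follow a common C++ pattern: a primary template supplies a default, and specializations override it for individual numeric types. The sketch below illustrates that pattern only. The Fp8E4M3 and Fp8E5M2 structs are hypothetical tag types standing in for the CUDA FP8 types, clamp_to_range is an illustrative helper, and the layout of the real TypeExtrema and is_fp8 definitions may differ. The two maxima are the largest finite values of the E4M3 and E5M2 formats.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

namespace sketch {

// Hypothetical tag types standing in for the CUDA FP8 formats.
struct Fp8E4M3 { std::uint8_t bits; };
struct Fp8E5M2 { std::uint8_t bits; };

// Primary template: a type is not FP8 unless a specialization says so.
template <typename T> struct is_fp8 : std::false_type {};
template <> struct is_fp8<Fp8E4M3> : std::true_type {};
template <> struct is_fp8<Fp8E5M2> : std::true_type {};

// Largest finite magnitude representable by each type.
template <typename T> struct TypeExtrema {
  static constexpr float max = std::numeric_limits<T>::max();
};
template <> struct TypeExtrema<Fp8E4M3> {
  static constexpr float max = 448.0f;
};
template <> struct TypeExtrema<Fp8E5M2> {
  static constexpr float max = 57344.0f;
};

// Illustrative helper: clamp x into the representable range of T.
template <typename T>
inline float clamp_to_range(float x) {
  const float limit = TypeExtrema<T>::max;
  if (x > limit) return limit;
  if (x < -limit) return -limit;
  return x;
}

// The traits are usable in constant expressions.
static_assert(is_fp8<Fp8E4M3>::value, "E4M3 is an FP8 format");
static_assert(!is_fp8<float>::value, "float is not an FP8 format");

// Small runtime self-check of the clamping helper.
inline void self_check() {
  assert(clamp_to_range<Fp8E4M3>(1000.0f) == 448.0f);
  assert(clamp_to_range<float>(3.5f) == 3.5f);
}

}  // namespace sketch
```

Because the limits are compile-time constants, a templated kernel can select the correct saturation bound per output type without a runtime branch on dtype.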
