Implementation:NVIDIA TransformerEngine Ops Sequential
| Field | Value |
|---|---|
| Sources | TransformerEngine |
| Domains | Deep_Learning, PyTorch, Optimization |
| Last Updated | 2026-02-07 14:00 GMT |
Overview
Drop-in replacement for torch.nn.Sequential with support for automatic operation fusion across groups of FusibleOperation modules.
Description
Sequential is a container that groups consecutive FusibleOperation modules into OperationFuser groups. These groups can then apply forward and backward fusion patterns (e.g., ForwardLinearBiasActivation, BackwardLinearAdd) to reduce kernel launches and memory traffic. Non-fusible torch.nn.Module modules are treated as opaque blocks between fusible groups. The container supports standard Python sequence operations (__getitem__, __setitem__, __delitem__, append, extend, insert, pop).
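The grouping step described above can be illustrated with a small stand-alone sketch. It does not import TransformerEngine: `FusibleOp`, `OpaqueModule`, and `group_modules` are hypothetical stand-ins invented for this illustration, and the real container may build its groups differently. The sketch only shows the idea that consecutive fusible modules end up in one group while each non-fusible module stays on its own.

```python
from itertools import groupby


class FusibleOp:
    """Hypothetical stand-in for a FusibleOperation module."""

    def __init__(self, name):
        self.name = name


class OpaqueModule:
    """Hypothetical stand-in for a non-fusible torch.nn.Module."""

    def __init__(self, name):
        self.name = name


def group_modules(modules):
    """Split a module list into ("fusible", [names]) and ("opaque", [name]) entries.

    Each run of consecutive FusibleOp instances becomes a single "fusible"
    group; every OpaqueModule becomes its own "opaque" entry.
    """
    layout = []
    for is_fusible, run in groupby(modules, key=lambda m: isinstance(m, FusibleOp)):
        names = [m.name for m in run]
        if is_fusible:
            layout.append(("fusible", names))
        else:
            layout.extend(("opaque", [name]) for name in names)
    return layout


modules = [
    FusibleOp("norm"),
    FusibleOp("linear"),
    OpaqueModule("dropout"),
    FusibleOp("bias"),
    FusibleOp("gelu"),
]
layout = group_modules(modules)
for kind, names in layout:
    print(kind, names)
```

In this toy layout, a fusion pattern could only match inside a "fusible" entry, which is why placing an opaque module between two fusible operations prevents them from being fused together.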
Usage
Use as a drop-in replacement for torch.nn.Sequential when building models with FusibleOperation modules that benefit from automatic fusion.
Code Reference
Source Location
- Repository: NVIDIA/TransformerEngine
- File: transformer_engine/pytorch/ops/sequential.py
- Lines: 1-198
Signature
class Sequential(torch.nn.Module):
    def __init__(self, *args: FusibleOperation | torch.nn.Module) -> None: ...
    def forward(self, input: torch.Tensor, *extra_inputs: torch.Tensor) -> torch.Tensor | tuple: ...
    def append(self, module) -> Sequential: ...
    def extend(self, modules) -> Sequential: ...
    def insert(self, idx, module) -> Sequential: ...
    def pop(self, idx) -> torch.nn.Module: ...
Import
from transformer_engine.pytorch.ops.sequential import Sequential
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| input | torch.Tensor | Yes | Primary input tensor |
| extra_inputs | torch.Tensor | No | Additional tensors consumed by operations with num_extra_inputs > 0 |
Outputs
| Name | Type | Description |
|---|---|---|
| output | torch.Tensor or tuple | Output tensor, or tuple of (output, *extra_outputs) if any operation produces extra outputs |
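The contract in the two tables above can be mimicked with a toy pipeline. This is a sketch under stated assumptions, not the library's implementation: `ToyOp` and `run_pipeline` are hypothetical names, and the rule that extra inputs are handed out in operation order is an assumption made for the illustration.

```python
class ToyOp:
    """Hypothetical op wrapping fn(x, extra_inputs) -> (y, extra_outputs)."""

    def __init__(self, fn, num_extra_inputs=0, num_extra_outputs=0):
        self.fn = fn
        self.num_extra_inputs = num_extra_inputs
        self.num_extra_outputs = num_extra_outputs


def run_pipeline(ops, x, *extra_inputs):
    """Apply ops in order, distributing extra inputs and collecting extra outputs."""
    needed = sum(op.num_extra_inputs for op in ops)
    if len(extra_inputs) != needed:
        raise ValueError(f"expected {needed} extra inputs, got {len(extra_inputs)}")
    remaining = list(extra_inputs)
    extra_outputs = []
    for op in ops:
        # Assumption: each op takes its extra inputs from the front of the list.
        taken = remaining[: op.num_extra_inputs]
        remaining = remaining[op.num_extra_inputs :]
        x, outs = op.fn(x, taken)
        extra_outputs.extend(outs)
    # A bare value if no op produced extra outputs, otherwise a tuple.
    return (x, *extra_outputs) if extra_outputs else x


make_extra_output = ToyOp(lambda x, extras: (x, [x]), num_extra_outputs=1)
double = ToyOp(lambda x, extras: (2 * x, []))
add_extra_input = ToyOp(lambda x, extras: (x + extras[0], []), num_extra_inputs=1)

with_extras = run_pipeline([make_extra_output, double, add_extra_input], 3.0, 10.0)
plain = run_pipeline([double], 3.0)
print(with_extras, plain)
```

Plain numbers stand in for tensors here. The point is the shape of the call: one positional extra input per unit of `num_extra_inputs`, and a tuple return only when some operation contributes an extra output.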
Usage Examples
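import torch
from transformer_engine.pytorch.ops.sequential import Sequential
from transformer_engine.pytorch.ops.basic import RMSNorm, MakeExtraOutput, AddExtraInput
from transformer_engine.pytorch.ops.linear import Linear

model = Sequential(
    MakeExtraOutput(),  # Emits its input as an extra output (residual branch)
    RMSNorm(4096),
    Linear(4096, 4096),
    AddExtraInput(),  # Adds an extra input tensor (residual add)
)

# Per the I/O contract above, AddExtraInput consumes one extra input and
# MakeExtraOutput makes forward return a tuple. Shapes are illustrative.
input_tensor = torch.randn(32, 4096, device="cuda")
residual = torch.randn(32, 4096, device="cuda")
output, branch = model(input_tensor, residual)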