Implementation:NVIDIA DALI EfficientNet Common Blocks
| Knowledge Sources | |
|---|---|
| Domains | Vision, Training |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Provides reusable PyTorch neural network building blocks for the EfficientNet model architecture, including convolutions, squeeze-and-excitation, stochastic depth, and exponential moving average.
Description
This module defines the foundational building blocks used by the EfficientNet model implementation. The central LayerBuilder class acts as a factory for convolution layers (1x1, 3x3, 5x5, 7x7, and depthwise separable), batch normalization layers with configurable momentum and epsilon, and activation functions (ReLU, SiLU, and an ONNX-compatible SiLU). Builder-wide settings are stored in a nested Config dataclass, and every convolution is initialized with Kaiming Normal weights.
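The factory idea can be sketched as follows. This is a minimal illustration rather than the module's own code: the helper name `make_conv`, the padding of `(kernel_size - 1) // 2`, the omitted bias, and the fixed SiLU activation are assumptions. Only the Kaiming Normal initialization, the optional BN and activation, and the Conv2d-or-Sequential return value are taken from this page.

```python
from collections import OrderedDict

import torch
import torch.nn as nn


def make_conv(kernel_size, in_planes, out_planes, groups=1, stride=1,
              bn=False, act=False, conv_init="fan_in"):
    """Hypothetical stand-in for a LayerBuilder.conv-style factory."""
    conv = nn.Conv2d(
        in_planes, out_planes, kernel_size=kernel_size,
        groups=groups, stride=stride,
        padding=(kernel_size - 1) // 2,  # assumption: preserve H and W at stride 1
        bias=False,                      # assumption: no bias term
    )
    # Kaiming Normal initialization; the fan mode mirrors Config.conv_init.
    nn.init.kaiming_normal_(conv.weight, mode=conv_init, nonlinearity="relu")

    layers = [("conv", conv)]
    if bn:
        layers.append(("bn", nn.BatchNorm2d(out_planes)))
    if act:
        layers.append(("act", nn.SiLU()))

    # A bare Conv2d when nothing is appended, otherwise a Sequential.
    if len(layers) == 1:
        return conv
    return nn.Sequential(OrderedDict(layers))


plain = make_conv(3, 8, 16)
fused = make_conv(3, 8, 16, stride=2, bn=True, act=True)
depthwise = make_conv(5, 16, 16, groups=16)  # one filter group per channel

x = torch.randn(2, 8, 32, 32)
plain_out = plain(x)
fused_out = fused(x)
depthwise_out = depthwise(plain_out)
```

Setting `groups` equal to the number of input channels is what turns the generic factory into a depthwise convolution, so a single code path can serve every kernel size.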
The module includes two implementations of the Squeeze-and-Excitation (SE) attention mechanism: SqueezeAndExcitation, the standard implementation built from linear layers, and SqueezeAndExcitationTRT, which uses 1x1 convolutions with adaptive average pooling for TensorRT compatibility. Both have "Sequential" variants (SequentialSqueezeAndExcitation and SequentialSqueezeAndExcitationTRT) that multiply the attention weights back onto the input feature map and optionally support quantization via pytorch_quantization.
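The linear-layer form of squeeze-and-excitation can be sketched as below. The class name `TinySE`, the channel counts, and the ReLU inside the bottleneck are assumptions made for illustration; the sketch shows only the general mechanism, not the module's exact code.

```python
import torch
import torch.nn as nn


class TinySE(nn.Module):
    """Illustrative squeeze-and-excitation block built from linear layers."""

    def __init__(self, in_channels, squeeze_channels):
        super().__init__()
        self.squeeze = nn.Linear(in_channels, squeeze_channels)
        self.expand = nn.Linear(squeeze_channels, in_channels)
        self.activation = nn.ReLU()
        self.sigmoid = nn.Sigmoid()

    def attention(self, x):
        # Squeeze: global average over H and W gives one value per channel.
        out = x.mean(dim=(2, 3))
        # Excite: bottleneck MLP followed by a sigmoid gate.
        out = self.sigmoid(self.expand(self.activation(self.squeeze(out))))
        # Restore singleton spatial dims so the gate broadcasts over H and W.
        return out[:, :, None, None]

    def forward(self, x):
        # "Sequential"-style behaviour: scale the input by the channel gate.
        return x * self.attention(x)


se = TinySE(in_channels=32, squeeze_channels=8)
x = torch.randn(4, 32, 14, 14)
gate = se.attention(x)  # per-channel weights, shape [N, C, 1, 1]
y = se(x)               # reweighted feature map, same shape as x
```

A TensorRT-oriented variant along the lines described above would swap the mean for `nn.AdaptiveAvgPool2d(1)` and the linear layers for 1x1 `nn.Conv2d` layers, which keeps the tensor four-dimensional throughout.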
Additional components include StochasticDepthResidual for regularization via random layer dropping during training, EMA (Exponential Moving Average) for maintaining smoothed model weights, ONNXSiLU as an ONNX-exportable alternative to torch.nn.SiLU, LambdaLayer for wrapping arbitrary functions as modules, and Flatten for spatial dimension removal.
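Two of these components are small enough to sketch directly. The code below is an illustration under stated assumptions, not the module's implementation: `SimpleStochasticDepth` and `onnx_friendly_silu` are hypothetical names, and the stochastic-depth rule shown is the common formulation (one Bernoulli draw per forward call, with the kept branch rescaled by `1 / survival_prob`).

```python
import torch
import torch.nn as nn


class SimpleStochasticDepth(nn.Module):
    """Illustrative stochastic-depth residual add (hypothetical helper)."""

    def __init__(self, survival_prob):
        super().__init__()
        self.survival_prob = survival_prob

    def forward(self, residual, x):
        if not self.training:
            # Inference: the branch is always kept.
            return residual + x
        # Training: keep the branch with probability survival_prob and
        # rescale it so its expected contribution matches inference.
        keep = torch.bernoulli(torch.tensor(self.survival_prob))
        return residual + x * keep / self.survival_prob


def onnx_friendly_silu(x):
    # SiLU written with primitive ops only: x * sigmoid(x).
    return x * torch.sigmoid(x)


torch.manual_seed(0)
layer = SimpleStochasticDepth(survival_prob=0.8)
residual = torch.ones(2, 4, 8, 8)
branch = torch.full((2, 4, 8, 8), 2.0)

layer.eval()
eval_out = layer(residual, branch)   # always residual + branch

layer.train()
train_out = layer(residual, branch)  # residual alone, or residual + branch / 0.8

silu_out = onnx_friendly_silu(torch.linspace(-3, 3, 7))
```

Rescaling the surviving branch during training means no correction is needed at inference time, which is why the evaluation path is a plain addition.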
Usage
Use this module as the building block library when constructing EfficientNet models for the PyTorch training example with DALI data loading. The blocks are imported by the efficientnet.py module and composed into the full network architecture.
Code Reference
Source Location
- Repository: NVIDIA_DALI
- File: docs/examples/use_cases/pytorch/efficientnet/image_classification/models/common.py
- Lines: 1-302
Signature
class LayerBuilder(object):
    @dataclass
    class Config:
        activation: str = "relu"
        conv_init: str = "fan_in"
        bn_momentum: Optional[float] = None
        bn_epsilon: Optional[float] = None

    def __init__(self, config: "LayerBuilder.Config"): ...
    def conv(self, kernel_size, in_planes, out_planes, groups=1, stride=1,
             bn=False, zero_init_bn=False, act=False): ...
    def conv1x1(self, in_planes, out_planes, stride=1, groups=1, bn=False, act=False): ...
    def conv3x3(self, in_planes, out_planes, stride=1, groups=1, bn=False, act=False): ...
    def convDepSep(self, kernel_size, in_planes, out_planes, stride=1, bn=False, act=False): ...
    def batchnorm(self, planes, zero_init=False): ...
    def activation(self): ...

class SqueezeAndExcitation(nn.Module): ...
class SqueezeAndExcitationTRT(nn.Module): ...
class SequentialSqueezeAndExcitation(SqueezeAndExcitation): ...
class SequentialSqueezeAndExcitationTRT(SqueezeAndExcitationTRT): ...
class StochasticDepthResidual(nn.Module): ...
class EMA: ...
class ONNXSiLU(nn.Module): ...
class LambdaLayer(nn.Module): ...
class Flatten(nn.Module): ...
Import
from .common import (
    SequentialSqueezeAndExcitation,
    SequentialSqueezeAndExcitationTRT,
    LayerBuilder,
    StochasticDepthResidual,
    Flatten,
)
I/O Contract
Inputs (LayerBuilder.conv)
| Name | Type | Required | Description |
|---|---|---|---|
| kernel_size | int | Yes | Size of the convolutional kernel. |
| in_planes | int | Yes | Number of input channels. |
| out_planes | int | Yes | Number of output channels. |
| groups | int | No | Number of convolution groups. Default: 1. |
| stride | int | No | Convolution stride. Default: 1. |
| bn | bool | No | Whether to append batch normalization. Default: False. |
| zero_init_bn | bool | No | Whether to zero-initialize BN gamma. Default: False. |
| act | bool | No | Whether to append activation. Default: False. |
Outputs (LayerBuilder.conv)
| Name | Type | Description |
|---|---|---|
| layer | nn.Module | A Conv2d module, or a Sequential module with conv, optional BN, and optional activation. |
Inputs (SequentialSqueezeAndExcitation)
| Name | Type | Required | Description |
|---|---|---|---|
| x | torch.Tensor | Yes | Input feature map of shape [N, C, H, W]. |
Outputs (SequentialSqueezeAndExcitation)
| Name | Type | Description |
|---|---|---|
| out | torch.Tensor | Channel-reweighted feature map of same shape as input. |
Usage Examples
Building convolution layers
from image_classification.models.common import LayerBuilder
config = LayerBuilder.Config(activation="silu", conv_init="fan_in",
                             bn_momentum=0.01, bn_epsilon=1e-3)
builder = LayerBuilder(config)
# 3x3 conv with BN and activation
conv_bn_act = builder.conv3x3(64, 128, stride=2, bn=True, act=True)
# Depthwise separable conv
dw_conv = builder.convDepSep(5, 128, 128, stride=1, bn=True, act=True)
# 1x1 pointwise conv with BN only
pw_conv = builder.conv1x1(128, 256, bn=True)
Using EMA for model weight smoothing
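import copy
from image_classification.models.common import EMA

# `model`, `train_loader`, and `train_step` are placeholders assumed to be defined elsewhere.
model_ema = copy.deepcopy(model)
ema = EMA(mu=0.999, module_ema=model_ema)
for step, batch in enumerate(train_loader):
    loss = train_step(model, batch)
    ema(model, step=step)  # update the EMA copy after each training step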
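What an EMA update of this kind does to the weights can be sketched with a self-contained toy. `SimpleEMA` is a hypothetical class that applies the textbook rule `ema = mu * ema + (1 - mu) * current`. How the real EMA class uses its `step` argument is not described on this page, so the sketch leaves it out.

```python
import copy

import torch
import torch.nn as nn


class SimpleEMA:
    """Illustrative EMA tracker using the textbook update rule."""

    def __init__(self, mu, module_ema):
        self.mu = mu
        self.module_ema = module_ema

    @torch.no_grad()
    def __call__(self, module):
        ema_state = self.module_ema.state_dict()
        for name, value in module.state_dict().items():
            if not value.dtype.is_floating_point:
                continue  # skip integer buffers such as BN step counters
            # ema <- mu * ema + (1 - mu) * current, updated in place
            ema_state[name].mul_(self.mu).add_(value, alpha=1.0 - self.mu)


model = nn.Linear(2, 1, bias=False)
model_ema = copy.deepcopy(model)
ema = SimpleEMA(mu=0.9, module_ema=model_ema)

with torch.no_grad():
    model.weight.fill_(1.0)      # pretend an optimizer step moved the weights
    model_ema.weight.fill_(0.0)  # start the average from zero for clarity

ema(model)  # EMA weights become 0.9 * 0.0 + 0.1 * 1.0 = 0.1
ema(model)  # then 0.9 * 0.1 + 0.1 * 1.0 = 0.19
```

Because the tensors returned by `state_dict()` share storage with the module's parameters, the in-place update modifies `model_ema` directly while leaving the training model untouched.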