Implementation:NVIDIA DALI EfficientNet Common Blocks

Knowledge Sources	NVIDIA_DALI
Domains	Vision, Training
Last Updated	2026-02-08 16:00 GMT

Overview

Provides reusable PyTorch neural network building blocks for the EfficientNet model architecture, including convolutions, squeeze-and-excitation, stochastic depth, and exponential moving average.

Description

This module defines the foundational building blocks used by the EfficientNet model implementation. The central LayerBuilder class provides a factory pattern for constructing convolution layers (1x1, 3x3, 5x5, 7x7, and depthwise separable), batch normalization layers with configurable momentum and epsilon, and activation functions (ReLU, SiLU, ONNX-compatible SiLU). It uses a dataclass Config to store builder-wide settings and applies Kaiming Normal weight initialization to all convolutions.
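As a rough illustration of the factory pattern described above, the sketch below assembles a convolution with optional batch normalization and activation. The `make_conv` function, its default values, the padding rule, and the choice to drop the convolution bias when BN follows are all assumptions made for this example; it is not the module's own code.

```python
import torch
from torch import nn


def make_conv(kernel_size, in_planes, out_planes, groups=1, stride=1,
              bn=False, act=False, bn_momentum=0.1, bn_eps=1e-5):
    """Hypothetical stand-in for a builder method; not the module's actual code."""
    conv = nn.Conv2d(
        in_planes, out_planes, kernel_size=kernel_size, stride=stride,
        padding=(kernel_size - 1) // 2,  # assumption: "same"-style padding for odd kernels
        groups=groups,
        bias=not bn,  # assumption: the bias is redundant when BN follows
    )
    # Kaiming Normal initialization, which the description states is applied to all convolutions.
    nn.init.kaiming_normal_(conv.weight, mode="fan_in", nonlinearity="relu")
    layers = [conv]
    if bn:
        layers.append(nn.BatchNorm2d(out_planes, momentum=bn_momentum, eps=bn_eps))
    if act:
        layers.append(nn.SiLU())
    # Return the bare convolution when nothing is appended, otherwise a Sequential.
    return conv if len(layers) == 1 else nn.Sequential(*layers)


# 3x3 conv with stride 2, followed by BN and SiLU.
block = make_conv(3, 64, 128, stride=2, bn=True, act=True)
out = block(torch.randn(2, 64, 32, 32))
print(out.shape)  # torch.Size([2, 128, 16, 16])

# A depthwise convolution is the same call with groups equal to the channel count.
depthwise = make_conv(5, 128, 128, groups=128, bn=True, act=True)
```

Returning either a bare `nn.Conv2d` or an `nn.Sequential` mirrors the output contract documented for `LayerBuilder.conv`.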

The module includes two implementations of the Squeeze-and-Excitation (SE) attention mechanism: SqueezeAndExcitation, the standard implementation built on linear layers, and SqueezeAndExcitationTRT, which uses 1x1 convolutions with adaptive average pooling for TensorRT compatibility. Both have "Sequential" variants (SequentialSqueezeAndExcitation and SequentialSqueezeAndExcitationTRT) that multiply the input feature map by the computed attention weights and optionally support quantization via pytorch_quantization.
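The two SE formulations compute the same function, which the following sketch tries to make concrete. The class names `SELinear` and `SEConv`, the ReLU activation, and the channel sizes are assumptions chosen for illustration; the module's own classes may be structured differently.

```python
import torch
from torch import nn


class SELinear(nn.Module):
    """Sketch: squeeze with a spatial mean, excite with two linear layers."""

    def __init__(self, channels, squeeze):
        super().__init__()
        self.fc1 = nn.Linear(channels, squeeze)
        self.fc2 = nn.Linear(squeeze, channels)
        self.act = nn.ReLU()  # assumption: the real activation is configurable

    def forward(self, x):
        w = x.mean(dim=(2, 3))  # [N, C]
        w = torch.sigmoid(self.fc2(self.act(self.fc1(w))))
        return x * w[:, :, None, None]  # broadcast over H and W


class SEConv(nn.Module):
    """Sketch: the same computation with adaptive pooling and 1x1 convolutions."""

    def __init__(self, channels, squeeze):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv1 = nn.Conv2d(channels, squeeze, kernel_size=1)
        self.conv2 = nn.Conv2d(squeeze, channels, kernel_size=1)
        self.act = nn.ReLU()

    def forward(self, x):
        w = self.pool(x)  # [N, C, 1, 1]
        w = torch.sigmoid(self.conv2(self.act(self.conv1(w))))
        return x * w


torch.manual_seed(0)
lin, conv = SELinear(16, 4), SEConv(16, 4)
# Copy the linear weights into the 1x1 convolutions to show that the two forms agree.
with torch.no_grad():
    conv.conv1.weight.copy_(lin.fc1.weight[:, :, None, None])
    conv.conv1.bias.copy_(lin.fc1.bias)
    conv.conv2.weight.copy_(lin.fc2.weight[:, :, None, None])
    conv.conv2.bias.copy_(lin.fc2.bias)

x = torch.randn(2, 16, 8, 8)
same = torch.allclose(lin(x), conv(x), atol=1e-5)
print(same)
```

The convolutional form keeps every intermediate tensor four-dimensional, which is one plausible reason it suits export to inference runtimes.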

Additional components include StochasticDepthResidual for regularization via random layer dropping during training, EMA (Exponential Moving Average) for maintaining smoothed model weights, ONNXSiLU as an ONNX-exportable alternative to torch.nn.SiLU, LambdaLayer for wrapping arbitrary functions as modules, and Flatten for spatial dimension removal.
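To make two of these components more concrete, here is a minimal sketch of stochastic depth on a residual branch and of SiLU written out as an explicit product. `StochasticDepthSketch` and `silu_explicit` are hypothetical names; the per-sample mask and the rescaling by the survival probability are one common formulation and are assumed here rather than taken from the module.

```python
import torch
from torch import nn


class StochasticDepthSketch(nn.Module):
    """Illustrative only; the module's StochasticDepthResidual may differ in detail."""

    def __init__(self, survival_prob):
        super().__init__()
        if not 0.0 < survival_prob <= 1.0:
            raise ValueError("survival_prob must be in (0, 1]")
        self.survival_prob = survival_prob

    def forward(self, residual, x):
        # At evaluation time the branch is always kept.
        if not self.training or self.survival_prob == 1.0:
            return residual + x
        # One Bernoulli draw per sample, broadcast over C, H, and W.
        keep = torch.rand(x.shape[0], 1, 1, 1, device=x.device) < self.survival_prob
        # Rescale kept branches so the expected output matches evaluation mode.
        return residual + x * keep.to(x.dtype) / self.survival_prob


def silu_explicit(x):
    """SiLU written as x * sigmoid(x), using only elementary operations."""
    return x * torch.sigmoid(x)


torch.manual_seed(0)
sd = StochasticDepthSketch(survival_prob=0.5)
residual, branch = torch.randn(4, 3, 2, 2), torch.randn(4, 3, 2, 2)

sd.train()
train_out = sd(residual, branch)  # each sample is either residual or residual + 2 * branch
sd.eval()
eval_out = sd(residual, branch)   # always residual + branch
```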

Usage

Use this module as the building block library when constructing EfficientNet models for the PyTorch training example with DALI data loading. The blocks are imported by the efficientnet.py module and composed into the full network architecture.

Code Reference

Source Location

Repository: NVIDIA_DALI
File: docs/examples/use_cases/pytorch/efficientnet/image_classification/models/common.py
Lines: 1-302

Signature

class LayerBuilder(object):
    @dataclass
    class Config:
        activation: str = "relu"
        conv_init: str = "fan_in"
        bn_momentum: Optional[float] = None
        bn_epsilon: Optional[float] = None

    def __init__(self, config: "LayerBuilder.Config"): ...
    def conv(self, kernel_size, in_planes, out_planes, groups=1, stride=1,
             bn=False, zero_init_bn=False, act=False): ...
    def conv1x1(self, in_planes, out_planes, stride=1, groups=1, bn=False, act=False): ...
    def conv3x3(self, in_planes, out_planes, stride=1, groups=1, bn=False, act=False): ...
    def convDepSep(self, kernel_size, in_planes, out_planes, stride=1, bn=False, act=False): ...
    def batchnorm(self, planes, zero_init=False): ...
    def activation(self): ...

class SqueezeAndExcitation(nn.Module): ...
class SqueezeAndExcitationTRT(nn.Module): ...
class SequentialSqueezeAndExcitation(SqueezeAndExcitation): ...
class SequentialSqueezeAndExcitationTRT(SqueezeAndExcitationTRT): ...
class StochasticDepthResidual(nn.Module): ...
class EMA: ...
class ONNXSiLU(nn.Module): ...
class LambdaLayer(nn.Module): ...
class Flatten(nn.Module): ...

Import

from .common import (
    SequentialSqueezeAndExcitation,
    SequentialSqueezeAndExcitationTRT,
    LayerBuilder,
    StochasticDepthResidual,
    Flatten,
)

I/O Contract

Inputs (LayerBuilder.conv)

Name	Type	Required	Description
kernel_size	int	Yes	Size of the convolutional kernel.
in_planes	int	Yes	Number of input channels.
out_planes	int	Yes	Number of output channels.
groups	int	No	Number of convolution groups. Default: 1.
stride	int	No	Convolution stride. Default: 1.
bn	bool	No	Whether to append batch normalization. Default: False.
zero_init_bn	bool	No	Whether to zero-initialize BN gamma. Default: False.
act	bool	No	Whether to append activation. Default: False.

Outputs (LayerBuilder.conv)

Name	Type	Description
layer	nn.Module	A Conv2d module, or a Sequential module with conv, optional BN, and optional activation.

Inputs (SequentialSqueezeAndExcitation)

Name	Type	Required	Description
x	torch.Tensor	Yes	Input feature map of shape [N, C, H, W].

Outputs (SequentialSqueezeAndExcitation)

Name	Type	Description
out	torch.Tensor	Channel-reweighted feature map of same shape as input.

Usage Examples

Building convolution layers

from image_classification.models.common import LayerBuilder

config = LayerBuilder.Config(activation="silu", conv_init="fan_in",
                              bn_momentum=0.01, bn_epsilon=1e-3)
builder = LayerBuilder(config)

# 3x3 conv with BN and activation
conv_bn_act = builder.conv3x3(64, 128, stride=2, bn=True, act=True)

# Depthwise separable conv
dw_conv = builder.convDepSep(5, 128, 128, stride=1, bn=True, act=True)

# 1x1 pointwise conv with BN only
pw_conv = builder.conv1x1(128, 256, bn=True)

Using EMA for model weight smoothing

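import copy
from image_classification.models.common import EMA

# `model`, `train_loader`, and `train_step` are assumed to be defined by the training script.
model_ema = copy.deepcopy(model)  # separate copy that holds the smoothed weights
ema = EMA(mu=0.999, module_ema=model_ema)

for step, batch in enumerate(train_loader):
    # train_step is a placeholder assumed to run forward, backward, and the optimizer step.
    loss = train_step(model, batch)
    ema(model, step=step)  # fold the updated weights into model_ema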
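The arithmetic behind such an update can be sketched with plain floats. This assumes the standard exponential moving average rule, e <- mu * e + (1 - mu) * x, applied to each entry; `ema_update` is a hypothetical helper, and whether the real EMA class also adjusts mu based on `step` is not covered here.

```python
def ema_update(ema_state, state, mu):
    """Blend current values into the running average in place: e <- mu * e + (1 - mu) * x."""
    for name, value in state.items():
        ema_state[name] = mu * ema_state[name] + (1.0 - mu) * value
    return ema_state


# Start the average at 0 and repeatedly observe the value 1.
ema_state = {"w": 0.0}
for step in range(3):
    ema_update(ema_state, {"w": 1.0}, mu=0.9)

print(ema_state["w"])  # approximately 0.271, i.e. 1 - 0.9**3
```

A value of mu close to 1, such as the 0.999 used above, makes the smoothed weights follow the trained weights slowly.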

Related Pages

Environment:NVIDIA_DALI_CUDA_GPU_Environment
