Principle: VainF Torch-Pruning BN Scale Regularization
Overview
Network Slimming via L1 regularization on batch normalization scaling factors to induce channel-level sparsity.
Description
Batch normalization layers have a learnable scaling factor for each channel. By adding L1 regularization on these factors during training, channels whose scaling factors shrink toward zero can be identified as unimportant and removed. This approach is elegant because BN layers already exist in most modern CNN architectures, so the method requires no additional parameters or architectural changes.
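To make the idea concrete, here is a minimal, framework-agnostic sketch of the pruning criterion: each channel's learned BN scale serves as an importance score, and channels whose magnitude falls below a threshold are candidates for removal. The threshold value and function name are illustrative assumptions, not part of the library's API.

```python
def select_prunable_channels(gammas, threshold=0.01):
    """Return indices of channels whose BN scale magnitude is below threshold."""
    return [i for i, g in enumerate(gammas) if abs(g) < threshold]

# Under L1 regularization, scales of unimportant channels end up at or near zero:
gammas = [0.92, 0.0003, 0.45, -0.0001, 0.78]
print(select_prunable_channels(gammas))  # [1, 3]
```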
Key aspects of BN Scale Regularization:
- Leverages existing BN parameters: No new learnable parameters are needed; the method repurposes the BatchNorm scaling factor as an importance indicator.
- L1 sparsity penalty: The sign-based gradient update `grad += reg * sign(weight)` drives small scaling factors toward exactly zero.
- Extended to group level: In Torch-Pruning, BN scales are aggregated across dependency groups to ensure structural consistency when pruning coupled layers.
- Group lasso variant: An optional group lasso mode replaces the L1 penalty with an L2-based group penalty, regularizing BN weights proportionally to the inverse of the group L2 norm: `grad += reg * (1 / ||group||_2) * weight`.
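The two gradient modifications above can be sketched in plain Python, with lists standing in for the BN weight and gradient tensors. `reg` corresponds to the regularization strength; the function names are illustrative assumptions, not the library's API.

```python
import math

def l1_reg_grad(weights, grads, reg):
    """grad += reg * sign(weight): the standard Network Slimming update."""
    sign = lambda w: (w > 0) - (w < 0)  # sign(0) == 0, so exact zeros are left alone
    return [g + reg * sign(w) for w, g in zip(weights, grads)]

def group_lasso_grad(weights, grads, reg):
    """grad += reg * (1 / ||group||_2) * weight: the group lasso variant."""
    norm = math.sqrt(sum(w * w for w in weights))  # L2 norm of the whole group
    return [g + reg * w / norm for w, g in zip(weights, grads)]

weights = [0.5, -0.2, 0.0]
grads = [0.1, 0.1, 0.1]
print(l1_reg_grad(weights, grads, reg=0.01))       # ≈ [0.11, 0.09, 0.1]
print(group_lasso_grad(weights, grads, reg=0.01))  # push proportional to w / ||w||_2
```

Note that the L1 update adds a constant-magnitude push regardless of the weight's size, which is what drives small scales all the way to zero, while the group lasso push scales with the weight relative to its group's norm.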
Usage
Use for CNNs with BatchNorm layers when you want a simple, well-studied regularization approach. This is the classic Network Slimming method.
Recommended scenarios:
- Architectures that already contain `BatchNorm1d`, `BatchNorm2d`, or `BatchNorm3d` layers.
- When a lightweight, easy-to-implement sparsity-inducing technique is desired.
- When group-level consistency is needed, enable the `group_lasso` variant to regularize across dependency groups.
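As a toy illustration of why this lightweight technique induces sparsity, the simulation below applies only the penalty update `gamma <- gamma - lr * reg * sign(gamma)` to two scales, with no task gradient (an assumed, simplified setup; in real training the task loss counteracts the penalty for important channels). The small scale collapses to zero while the large one barely moves, which is what makes post-training thresholding effective.

```python
def train_step(gamma, lr, reg):
    """One pure-penalty update: gamma <- gamma - lr * reg * sign(gamma)."""
    sign = (gamma > 0) - (gamma < 0)
    return gamma - lr * reg * sign

g_small, g_large = 0.05, 1.0
for _ in range(100):
    # Clamp at zero once the penalty overshoots (illustrative; a real run
    # relies on the task gradient to balance the penalty instead).
    g_small = max(0.0, train_step(g_small, lr=0.1, reg=0.01))
    g_large = train_step(g_large, lr=0.1, reg=0.01)

print(g_small)            # 0.0: collapsed, safe to prune
print(round(g_large, 3))  # 0.9: barely affected
```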
Theoretical Basis
The total loss with BN scale regularization is:

    L = L_task(x, W) + λ · Σ_{γ ∈ Γ} |γ|

where Γ is the set of BN scaling factors and λ (the `reg` coefficient) controls the sparsity strength.

The gradient modification for standard L1 mode:

    ∂L/∂γ = ∂L_task/∂γ + λ · sign(γ)

Channels whose resulting γ values are small are pruned.

For the group lasso variant, the gradient modification becomes:

    ∂L/∂γ = ∂L_task/∂γ + λ · γ / ||γ_group||_2

This encourages entire groups to shrink together, producing a cleaner structural sparsity pattern that respects layer dependencies.
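A small numeric check of the inverse-norm behavior (toy values, assumed `reg` strength): the per-weight push `reg * w / ||group||_2` implies a relative shrinkage of `reg / ||group||_2` per step, so a group that is already small shrinks much faster relative to its size than an important, large-norm group.

```python
import math

def group_penalty(weights, reg=0.01):
    """Per-weight group lasso push: reg * w / ||group||_2."""
    norm = math.sqrt(sum(w * w for w in weights))
    return [reg * w / norm for w in weights]

small_group = [0.01, 0.02]  # group already near zero
large_group = [1.0, 2.0]    # important group

# Relative shrinkage per step is reg / ||group||_2:
rel_small = group_penalty(small_group)[0] / small_group[0]
rel_large = group_penalty(large_group)[0] / large_group[0]
print(rel_small)  # strong relative push on the small group
print(rel_large)  # weak relative push on the large group
```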