Implementation: VainF Torch-Pruning GroupMagnitudeImportance
Overview
Concrete tool for magnitude-based group importance estimation provided by Torch-Pruning.
Description
GroupMagnitudeImportance computes per-channel importance scores using Lp norms of the weight tensors associated with each channel. It is the primary magnitude-based importance estimator in the Torch-Pruning library, designed to work with the framework's DependencyGraph and Group abstractions for fully automated structural pruning.
The class supports multiple variants of magnitude importance:
- Standard L1/L2 norms: Controlled by the `p` parameter. Setting `p=1` yields L1-norm importance; `p=2` (the default) yields L2-norm importance.
- Batch normalization scaling factor extraction: By restricting `target_types` to only `_BatchNorm` layers, the estimator extracts BN gamma values as importance scores, implementing the network slimming approach.
- LAMP normalization: Setting `normalizer='lamp'` enables Layer-Adaptive Magnitude-based Pruning, which uses a cumulative-sum normalization scheme that adapts per-layer sparsity ratios automatically.
- Group reduction strategies: The `group_reduction` parameter controls how importance scores from multiple coupled layers within a dependency group are aggregated into a single per-channel score. Supported strategies include `"mean"`, `"sum"`, `"max"`, `"prod"`, `"first"`, and `"gate"`.
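The per-layer quantities behind these variants can be sketched in plain PyTorch (illustrative only; the class itself adds grouping, reduction, and normalization on top of these local scores):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv = nn.Conv2d(3, 8, 3)
bn = nn.BatchNorm2d(8)

W = conv.weight.detach()  # (8, 3, 3, 3): one slice per output channel
flat = W.flatten(1)       # (8, 27)

# p=1 and p=2: sum of |w|^p per output channel (the p-th root is monotone
# and can be omitted without changing the channel ranking)
l1 = flat.abs().sum(dim=1)
l2 = flat.abs().pow(2).sum(dim=1)

# Network-slimming style: BN gamma magnitudes as importance
bn_scale = bn.weight.detach().abs()

print(l1.shape, l2.shape, bn_scale.shape)  # each: torch.Size([8])
```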
The class operates on Group objects produced by the DependencyGraph. Each Group represents a set of coupled pruning operations that must be executed together to maintain architectural consistency. When called, GroupMagnitudeImportance iterates over every dependency in the group, computes a local importance score for each parameterized layer (Conv, Linear, BatchNorm, LayerNorm), and then reduces and normalizes the scores to produce a single 1-D importance tensor.
Internally, the computation proceeds as follows:
- For each layer in the group, the weight slice corresponding to the pruning indices is extracted and flattened.
- The element-wise absolute value is raised to the power `p`, and the result is summed across the non-channel dimensions to yield a local per-channel importance score.
- All local scores are collected and aggregated via the chosen group reduction strategy (e.g., scatter-add for `"mean"`/`"sum"`).
- The aggregated scores are normalized according to the chosen normalization scheme.
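These steps can be approximated in plain PyTorch for a pair of coupled conv layers. This is a simplified sketch under the `p=2` default, not the library's actual code, which also covers the special cases described below:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv1 = nn.Conv2d(3, 8, 3)   # root layer: its output channels are pruned
conv2 = nn.Conv2d(8, 16, 3)  # coupled layer: its input channels must follow

idxs = [0, 1, 2, 3]  # pruning indices under consideration

# Steps 1-2: slice out the candidate channels, flatten, sum |w|^p (p=2)
w_out = conv1.weight.detach()[idxs].flatten(1)       # (4, 3*3*3)
imp_out = w_out.abs().pow(2).sum(dim=1)              # local score, shape (4,)

w_in = conv2.weight.detach()[:, idxs].transpose(0, 1).flatten(1)  # (4, 16*3*3)
imp_in = w_in.abs().pow(2).sum(dim=1)                # local score, shape (4,)

# Step 3: aggregate coupled scores, here with the "mean" reduction
group_imp = torch.stack([imp_out, imp_in]).mean(dim=0)

# Step 4: apply the default "mean" normalizer
group_imp = group_imp / group_imp.mean()
print(group_imp)  # 1-D tensor of length 4
```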
The class also handles special cases such as transposed convolutions, group convolutions (where importance must be repeated across groups), and layers without affine parameters (which are silently skipped).
Usage
Use GroupMagnitudeImportance when you need a gradient-free importance criterion for structural pruning. It is the recommended default importance estimator for most pruning tasks in Torch-Pruning because:
- It does not require a forward or backward pass through the model.
- It is fast to compute, even for large models.
- It produces reasonable pruning decisions for moderate pruning ratios.
Typical workflow:
- Build a `DependencyGraph` from the model and example inputs.
- Obtain a `Group` by specifying a layer and pruning function.
- Instantiate `GroupMagnitudeImportance` with the desired configuration.
- Call the instance with the group to obtain per-channel importance scores.
- Use the scores to select which channels to prune.
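The final selection step amounts to sorting the scores and keeping the smallest; a minimal sketch with stand-in scores:

```python
# Stand-in scores for illustration; in practice these come from imp(group).
scores = [0.9, 0.1, 0.5, 0.3]
n_prune = 2

# Channels with the smallest importance are the pruning candidates.
prune_idxs = sorted(range(len(scores)), key=lambda i: scores[i])[:n_prune]
print(prune_idxs)  # -> [1, 3]
```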
Code Reference
Source
File: torch_pruning/pruner/importance.py, Lines 58-269
Class Signature
```python
class GroupMagnitudeImportance(Importance):
    def __init__(self,
                 p: int = 2,
                 group_reduction: str = "mean",
                 normalizer: str = 'mean',
                 bias: bool = False,
                 target_types: list = [nn.modules.conv._ConvNd, nn.Linear,
                                       nn.modules.batchnorm._BatchNorm, nn.LayerNorm]):
```
Import
```python
from torch_pruning.pruner.importance import GroupMagnitudeImportance
```
or equivalently:
```python
import torch_pruning as tp
tp.importance.GroupMagnitudeImportance
```
I/O Contract
Constructor Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `p` | `int` | No | `2` | Norm degree for importance calculation. Use `1` for L1-norm, `2` for L2-norm, etc. |
| `group_reduction` | `str` | No | `"mean"` | Reduction strategy for aggregating importance across coupled layers in a group. Options: `"mean"`, `"sum"`, `"max"`, `"prod"`, `"first"`, `"gate"`. |
| `normalizer` | `str` | No | `"mean"` | Normalization scheme applied after group reduction. Options: `"mean"`, `"sum"`, `"standarization"`, `"max"`, `"gaussian"`, `"lamp"`. |
| `bias` | `bool` | No | `False` | Whether to include bias parameters in importance computation. |
| `target_types` | `list` | No | `[_ConvNd, Linear, _BatchNorm, LayerNorm]` | Layer types to consider when computing importance. Layers not matching any type in this list are skipped. |
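For illustration, the `"mean"` and `"lamp"` normalizers might be re-implemented roughly as follows. This is a hypothetical sketch based on the descriptions above, not the library source:

```python
import torch

def normalize(scores, mode):
    # Hypothetical re-implementation of two normalizer options, for
    # illustration only; names mirror the constructor's `normalizer` values.
    if mode == "mean":
        return scores / scores.mean()
    if mode == "lamp":
        # LAMP-style: each score divided by the cumulative sum of all
        # scores greater than or equal to it
        order = torch.argsort(scores, descending=True)
        s = scores[order]
        s = s / torch.cumsum(s, dim=0)
        out = torch.empty_like(s)
        out[order] = s
        return out
    raise ValueError(mode)

scores = torch.tensor([4.0, 1.0, 3.0, 2.0])
print(normalize(scores, "mean"))  # mean of the result is 1.0
print(normalize(scores, "lamp"))  # the largest score maps to 1.0
```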
__call__ Input
| Parameter | Type | Required | Description |
|---|---|---|---|
| `group` | `Group` | Yes | A dependency group obtained from the `DependencyGraph`, representing a set of coupled pruning operations. |
Output
Returns: A 1-D `torch.Tensor` of per-channel importance scores. The length of the tensor equals the number of channels in the root pruning operation of the group. Returns `None` if no parameterized layers are found in the group.
Usage Examples
Example 1: Basic Usage with DependencyGraph
```python
import torch
import torch.nn as nn
import torch_pruning as tp

# Build a simple model
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1),
    nn.BatchNorm2d(64),
    nn.ReLU(),
    nn.Conv2d(64, 128, 3, padding=1),
)

# Build the dependency graph
DG = tp.DependencyGraph().build_dependency(
    model, example_inputs=torch.randn(1, 3, 224, 224)
)

# Get a pruning group for output channels of the first conv layer
group = DG.get_pruning_group(
    model[0], tp.prune_conv_out_channels, idxs=[0, 1, 2, 3]
)

# Compute importance using default L2-norm
imp = tp.importance.GroupMagnitudeImportance(p=2)
scores = imp(group)

# scores is a 1-D tensor of length 4, one score per channel
print(scores)
```
Example 2: L1-Norm Variant
```python
import torch_pruning as tp

# L1-norm importance with no normalization, using only the first
# layer in the group
imp = tp.importance.GroupMagnitudeImportance(
    p=1,
    normalizer=None,
    group_reduction="first"
)
scores = imp(group)
```
Example 3: BN Scaling Factor Variant
```python
import torch.nn as nn
import torch_pruning as tp

# Use only BatchNorm scaling factors as importance
imp = tp.importance.GroupMagnitudeImportance(
    p=1,
    normalizer=None,
    group_reduction="mean",
    target_types=[nn.modules.batchnorm._BatchNorm]
)
scores = imp(group)
```
This variant is equivalent to the BNScaleImportance class, which implements the Network Slimming approach from Liu et al., 2017.