Heuristic: VainF Torch-Pruning Over-Pruning Prevention
| Knowledge Sources | |
|---|---|
| Domains | Deep_Learning, Model_Compression, Debugging |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
Built-in safeguards prevent layers from being pruned below a minimum channel threshold or to a single channel, which would collapse the network.
Description
Structural pruning can remove so many channels from a layer that the network becomes degenerate or crashes outright. Torch-Pruning implements two safety guards in BasePruner:
- max_pruning_ratio guard: Before pruning a group, the pruner checks whether the target layer's current channel count has already dropped below initial_channels * (1 - max_pruning_ratio). If so, the group is skipped entirely.
- Single-channel guard: The pruner never prunes a layer down to 1 channel (layer_out_ch == 1), regardless of the importance scores.
Additionally, when iterative_steps is used, the pruner caps execution at the specified step count and emits a warning if the user tries to prune beyond that.
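A minimal sketch of how these guards are configured in practice, assuming a recent Torch-Pruning release; the class names tp.pruner.MagnitudePruner and tp.importance.MagnitudeImportance and the keyword spellings follow the project's examples and may differ across versions:
```python
import torch
import torch.nn as nn
import torch_pruning as tp  # https://github.com/VainF/Torch-Pruning

# Toy model with a deliberately narrow middle layer (4 channels).
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 4, 3, padding=1), nn.ReLU(),
    nn.Conv2d(4, 16, 3, padding=1),
)
example_inputs = torch.randn(1, 3, 32, 32)

pruner = tp.pruner.MagnitudePruner(
    model,
    example_inputs,
    importance=tp.importance.MagnitudeImportance(p=2),
    pruning_ratio=0.5,       # target fraction of channels to remove
    max_pruning_ratio=0.75,  # per-layer ceiling enforced by the guard above
)
pruner.step()

# Layers already below init_channels * (1 - max_pruning_ratio), or already
# down to a single channel, are skipped rather than pruned further.
print([m.out_channels for m in model.modules() if isinstance(m, nn.Conv2d)])
```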
Usage
Use this heuristic to understand why some layers are skipped during pruning. If you observe that the pruner is not reaching your target pruning ratio, the over-pruning guards may be the cause. You can (see the sketch after this list):
- Increase max_pruning_ratio (default is 1.0, meaning no per-layer cap)
- Use global_pruning=True to redistribute pruning across layers
- Use isomorphic pruning to avoid over-pruning specific structural blocks
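A rough way to check whether the guards are limiting compression is to compare parameter counts before and after a pruning step, then loosen the cap or enable global pruning. The sketch below assumes a recent release; tp.utils.count_ops_and_params and the keyword names follow the project's README and may differ in older versions:
```python
import torch
import torch.nn as nn
import torch_pruning as tp

model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1),
)
example_inputs = torch.randn(1, 3, 32, 32)
base_macs, base_params = tp.utils.count_ops_and_params(model, example_inputs)

pruner = tp.pruner.MagnitudePruner(
    model,
    example_inputs,
    importance=tp.importance.MagnitudeImportance(p=2),
    pruning_ratio=0.7,
    max_pruning_ratio=0.9,  # loosened per-layer cap
    global_pruning=True,    # rank channels across layers, not per layer
)
pruner.step()

macs, params = tp.utils.count_ops_and_params(model, example_inputs)
# If the achieved reduction falls far short of the requested pruning_ratio,
# the over-pruning guards (or ignored layers) are likely the reason.
print(f"params: {base_params} -> {params}")
```
Isomorphic pruning is likewise configured on the pruner in recent releases; check the project's examples for the exact argument in your installed version.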
The Insight (Rule of Thumb)
- Action: Configure max_pruning_ratio to control the maximum fraction of channels any single layer can lose. Set iterative_steps to spread pruning over multiple rounds (see the sketch after this list).
- Value: Default max_pruning_ratio=1.0 (no per-layer limit); the single-channel guard is always active. iterative_steps=1 by default.
- Trade-off: Setting max_pruning_ratio too low prevents aggressive pruning of unimportant layers; setting it too high risks degenerate layers with very few channels.
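A sketch of spreading the same target over several rounds via iterative_steps, assuming the same Torch-Pruning entry points as above; the finetune call is a placeholder, not part of the library:
```python
import torch
import torch.nn as nn
import torch_pruning as tp

model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1),
)
example_inputs = torch.randn(1, 3, 32, 32)

pruner = tp.pruner.MagnitudePruner(
    model,
    example_inputs,
    importance=tp.importance.MagnitudeImportance(p=2),
    pruning_ratio=0.5,  # final target
    iterative_steps=5,  # reach it gradually: 0.1, 0.2, ..., 0.5
)

for _ in range(5):
    pruner.step()       # prune up to this round's cumulative ratio
    # finetune(model)   # placeholder: recover accuracy between rounds
# A sixth call to pruner.step() would only trigger the overflow warning
# shown under Code Evidence below and perform no pruning.
```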
Reasoning
A layer with 0 channels cannot process data, and a layer with 1 channel produces a single-channel feature map that often breaks downstream operations such as batch normalization or grouped convolutions. The single-channel guard avoids this unconditionally. The max_pruning_ratio guard provides a configurable ceiling that can be tightened for safety or loosened for aggressive compression.
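A concrete illustration of the grouped-convolution failure mode, in plain PyTorch and independent of Torch-Pruning:
```python
import torch.nn as nn

# A layer pruned down to a single channel cannot feed a grouped convolution
# whose group count no longer divides its channel count: construction fails.
try:
    nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, groups=4)
except ValueError as err:
    print(err)  # in_channels must be divisible by groups
```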
Isomorphic pruning (ECCV 2024) provides an additional safeguard by grouping structurally identical layers and pruning them uniformly, preventing the global ranking mode from over-pruning specific layers while under-pruning others.
Code Evidence
Over-pruning check from torch_pruning/pruner/algorithms/base_pruner.py:357-374:
```python
if self.DG.is_out_channel_pruning_fn(pruning_fn):
    layer_out_ch = self.DG.get_out_channels(module)
    if layer_out_ch is None:
        continue
    if layer_out_ch < self.layer_init_out_ch[module] * (
        1 - self.max_pruning_ratio
    ) or layer_out_ch == 1:
        return False
elif self.DG.is_in_channel_pruning_fn(pruning_fn):
    layer_in_ch = self.DG.get_in_channels(module)
    if layer_in_ch is None:
        continue
    if layer_in_ch < self.layer_init_in_ch[module] * (
        1 - self.max_pruning_ratio
    ) or layer_in_ch == 1:
        return False
```
Iterative step overflow warning from torch_pruning/pruner/algorithms/base_pruner.py:437-440:
```python
if self.current_step > self.iterative_steps:
    warnings.warn(
        "Pruning exceed the maximum iterative steps, no pruning will be performed.")
    return
```
Linear iterative schedule from torch_pruning/pruner/algorithms/base_pruner.py:149-157:
```python
# The pruner will prune the model iteratively for several steps to achieve the target pruning ratio
# E.g., if iterative_steps=5, pruning_ratio=0.5, the pruning ratio of each step will be [0.1, 0.2, 0.3, 0.4, 0.5]
self.iterative_steps = iterative_steps
self.iterative_pruning_ratio_scheduler = iterative_pruning_ratio_scheduler
self.current_step = 0
self.per_step_pruning_ratio = self.iterative_pruning_ratio_scheduler(
    self.pruning_ratio, self.iterative_steps
)
```
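For illustration only, the schedule described in that comment can be reproduced with a small stand-alone function (a toy sketch, not the library's own scheduler):
```python
def linear_schedule(pruning_ratio: float, steps: int) -> list[float]:
    # Each round targets an equal increment of the final pruning ratio.
    return [pruning_ratio * (i + 1) / steps for i in range(steps)]

print(linear_schedule(0.5, 5))  # ~[0.1, 0.2, 0.3, 0.4, 0.5]
```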