Principle:VainF Torch Pruning LLM Config Update

From Leeroopedia


Metadata

Field: Value
Domains: NLP, Model_Compression, Pruning
Last Updated: 2026-02-08 00:00 GMT

Overview

Post-pruning synchronization of HuggingFace model configuration attributes with the physically modified weight dimensions.

Description

After structural pruning of an LLM, the weight tensors have been physically resized (channels removed), but the model's configuration object (model.config) still records the original dimensions. If the model is saved as-is, reloading it fails: the layers are rebuilt from the stale configuration, and the saved tensors no longer match the shapes it describes.

The configuration update pattern iterates through all modules to discover the new dimensions from the actual weight shapes and updates model.config accordingly:

  • hidden_size -- the embedding and output projection dimension
  • num_attention_heads -- the number of query attention heads
  • num_key_value_heads -- the number of key/value heads (for Grouped Query Attention)
  • intermediate_size -- the MLP hidden dimension

This step is mandatory between pruning and saving for HuggingFace-compatible models.

Usage

Required after pruning any HuggingFace LLM (Llama, Phi, Qwen, etc.) and before calling model.save_pretrained(). Without this step, the saved model cannot be reloaded.
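As a rough illustration of where this step sits in the workflow, the sketch below compares a configuration against sizes measured from the pruned weights and refuses to proceed while any field is stale. The helper name stale_config_fields, the measured dictionary, and the example numbers are hypothetical and chosen for illustration; a SimpleNamespace stands in for model.config so the snippet runs without any model loaded.

```python
from types import SimpleNamespace

# The four fields this page says must track the pruned weights.
CHECKED_FIELDS = (
    "hidden_size",
    "num_attention_heads",
    "num_key_value_heads",
    "intermediate_size",
)


def stale_config_fields(config, measured):
    """Return {field: (config_value, measured_value)} for every mismatch.

    `measured` holds sizes read off the pruned weight shapes (hypothetical
    input for this sketch); fields missing from it are skipped.
    """
    stale = {}
    for field in CHECKED_FIELDS:
        if field not in measured:
            continue
        current = getattr(config, field, None)
        if current != measured[field]:
            stale[field] = (current, measured[field])
    return stale


# Stand-in for model.config, still holding the pre-pruning values.
config = SimpleNamespace(
    hidden_size=4096,
    num_attention_heads=32,
    num_key_value_heads=8,
    intermediate_size=11008,
)
# Illustrative post-pruning sizes (made up for this example).
measured = {
    "hidden_size": 3584,
    "num_attention_heads": 24,
    "num_key_value_heads": 8,
    "intermediate_size": 8256,
}

stale_before = stale_config_fields(config, measured)

# Synchronize the stale fields, then confirm nothing is left to fix.
for field, (_, new_value) in stale_before.items():
    setattr(config, field, new_value)
assert not stale_config_fields(config, measured)
# Only at this point would model.save_pretrained(...) be called.
```

Running a check like this immediately before saving surfaces a forgotten update at save time rather than at the next load.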

Theoretical Basis

After pruning, the new configuration values are derived directly from the physical weight shapes:

  • hidden_size: model.lm_head.in_features gives the new hidden_size.
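  • num_attention_heads: For attention modules, num_attention_heads = q_proj.out_features // head_dim. The numerator is the attention module's own query width, which after pruning need not equal the model-level hidden_size above. The head_dim is preserved when prune_num_heads=True.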
  • intermediate_size:
    • For separate gate/up projections: intermediate_size = gate_proj.out_features
    • For fused gate_up projections: intermediate_size = gate_up_proj.out_features // 2
  • num_key_value_heads (GQA): Updated separately from num_attention_heads, derived from k_proj.out_features // head_dim.
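The derivations above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the library's implementation: the Linear dataclass is a stand-in exposing the same in_features and out_features attributes as torch.nn.Linear, the function names derive_config_values and update_config are hypothetical, the attribute names (q_proj, k_proj, gate_proj, gate_up_proj) follow the Llama-style layout named on this page, and head_dim is assumed to be unchanged by pruning.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class Linear:
    """Stand-in for torch.nn.Linear; only the two size attributes are used."""
    in_features: int
    out_features: int


def derive_config_values(lm_head, attn, mlp, head_dim):
    """Read post-pruning sizes off one attention block and one MLP block.

    Assumes head_dim was preserved by pruning (whole heads were removed).
    """
    values = {
        "hidden_size": lm_head.in_features,
        "num_attention_heads": attn.q_proj.out_features // head_dim,
        "num_key_value_heads": attn.k_proj.out_features // head_dim,
    }
    if hasattr(mlp, "gate_proj"):
        # Separate gate/up projections.
        values["intermediate_size"] = mlp.gate_proj.out_features
    elif hasattr(mlp, "gate_up_proj"):
        # Fused projection: gate and up halves share one output dimension.
        values["intermediate_size"] = mlp.gate_up_proj.out_features // 2
    return values


def update_config(config, values):
    """Write the derived values onto a config-like object in place."""
    for key, value in values.items():
        setattr(config, key, value)
    return config


# Illustrative pruned shapes (made up for this example), head_dim = 128.
lm_head = Linear(in_features=3584, out_features=32000)
attn = SimpleNamespace(
    q_proj=Linear(3584, 3072),  # 3072 // 128 = 24 query heads
    k_proj=Linear(3584, 768),   # 768 // 128 = 6 key/value heads
)
mlp = SimpleNamespace(gate_proj=Linear(3584, 8256))

# Stand-in for model.config with the original, pre-pruning values.
config = SimpleNamespace(
    hidden_size=4096,
    num_attention_heads=32,
    num_key_value_heads=8,
    intermediate_size=11008,
)

values = derive_config_values(lm_head, attn, mlp, head_dim=128)
update_config(config, values)
```

With a real model, the same attribute reads would be applied to the modules found while iterating over the network. The sketch inspects a single attention and MLP block, which presumes every layer was pruned to the same sizes, since the configuration holds one value per field.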
