
Implementation:Mit han lab Llm awq Auto clip block

From Leeroopedia

Overview

A concrete tool from the llm-awq library for finding optimal weight clipping values for the linear layers within a transformer block.

Source

File: awq/quantize/auto_clip.py, Lines: 67-83

Signature

@torch.no_grad()
def auto_clip_block(module, w_bit, q_config, input_feat):

Import

from awq.quantize.auto_clip import auto_clip_block

I/O

Inputs:

  • module (nn.Module) - transformer block
  • w_bit (int) - bit width for quantization
  • q_config (dict) - quantization configuration
  • input_feat (dict) - cached activations for each layer

Output:

  • list of (name: str, max_val: torch.Tensor) tuples
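The I/O contract above can be sketched as follows. Because the real search requires a loaded transformer block and cached calibration activations, a stand-in with the same signature and return shape is used here; the layer names, the activation values, and the `q_config` keys (`zero_point`, `q_group_size`) are illustrative assumptions, not taken from this page.

```python
def auto_clip_block_stub(module, w_bit, q_config, input_feat):
    """Stand-in mirroring auto_clip_block's return type:
    a list of (layer_name, max_val) pairs, one per searched layer."""
    return [(name, max(acts)) for name, acts in input_feat.items()]

q_config = {"zero_point": True, "q_group_size": 128}  # assumed keys

# input_feat maps each linear layer's name to its cached activations.
input_feat = {
    "mlp.gate_proj": [0.4, 1.7, 0.9],
    "self_attn.v_proj": [0.8, 0.6],
}

clip_list = auto_clip_block_stub(None, w_bit=4,
                                 q_config=q_config, input_feat=input_feat)
for name, max_val in clip_list:
    print(name, max_val)
```

In the real function, each `max_val` is a `torch.Tensor` of per-group clipping thresholds rather than a scalar.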

Notes

Internally calls auto_clip_layer() for each linear layer, skipping layers whose names contain q_, k_, query, key, or Wqkv.
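The skip rule above can be sketched as a substring filter over layer names; the helper name and the example layer list are illustrative assumptions, while the keyword list comes from this page.

```python
# Layers whose names contain any of these substrings are excluded
# from the clipping search (per the note above).
SKIP_KEYWORDS = ["q_", "k_", "query", "key", "Wqkv"]

def should_skip(name):
    """Return True if this linear layer is excluded from auto-clip."""
    return any(kw in name for kw in SKIP_KEYWORDS)

# Hypothetical layer names from one transformer block.
layers = ["self_attn.q_proj", "self_attn.k_proj", "self_attn.v_proj",
          "mlp.gate_proj", "mlp.down_proj"]

searched = [n for n in layers if not should_skip(n)]
# q_proj and k_proj are skipped; v_proj and the MLP layers are searched.
```

Query and key projections are typically exempted because clipping them distorts the attention logits, whereas value and MLP projections tolerate clipping well.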

Related Pages

Knowledge Sources

Domains

  • Quantization
  • Optimization
