Implementation:Mit_han_lab_Llm_awq_Auto_clip_block
Overview
A concrete tool from the llm-awq library for finding optimal weight clipping values within a transformer block.
Source
File: awq/quantize/auto_clip.py, Lines: 67-83
Signature
@torch.no_grad()
def auto_clip_block(module, w_bit, q_config, input_feat):
Import
from awq.quantize.auto_clip import auto_clip_block
I/O
Inputs:
- module (nn.Module) - transformer block
- w_bit (int) - bit width for quantization
- q_config (dict) - quantization configuration
- input_feat (dict) - cached activations for each layer
Output:
- list of (name: str, max_val: torch.Tensor) tuples
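The return value is a plain list of name/tensor pairs. A minimal sketch of consuming such a list, with floats standing in for the torch.Tensor values (the names and numbers below are illustrative, not real outputs):

```python
# Hedged sketch: auto_clip_block returns (name, max_val) pairs; here
# plain floats stand in for torch.Tensors. In llm-awq these values are
# later applied back to the corresponding linear layers' weights.
clip_list = [
    ("self_attn.v_proj", 4.2),  # illustrative values
    ("mlp.fc1", 7.9),
]
for name, max_val in clip_list:
    print(f"{name}: clip weights to [-{max_val}, {max_val}]")
```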
Notes
Internally calls auto_clip_layer() for each linear layer, skipping layers whose names contain q_, k_, query, key, or Wqkv.
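The skip rule above can be sketched as a simple substring filter. The substring list is taken from the note; the helper name `should_clip` is ours, not from the library:

```python
# Hedged sketch of the skip rule: layers whose names contain any of
# these substrings (query/key projections) are excluded from clipping.
SKIP_KEYS = ("q_", "k_", "query", "key", "Wqkv")

def should_clip(layer_name):
    """Return True if a linear layer with this name should be clipped."""
    return not any(key in layer_name for key in SKIP_KEYS)

names = ["self_attn.q_proj", "self_attn.k_proj",
         "self_attn.v_proj", "mlp.gate_proj"]
print([n for n in names if should_clip(n)])
# → ['self_attn.v_proj', 'mlp.gate_proj']
```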
Related Pages
- Principle:Mit_han_lab_Llm_awq_Weight_Clipping_Optimization
- Environment:Mit_han_lab_Llm_awq_Python_Runtime_Environment
- Heuristic:Mit_han_lab_Llm_awq_AWQ_Grid_Search_Tuning
- Heuristic:Mit_han_lab_Llm_awq_GPU_Memory_Management_Patterns
- Heuristic:Mit_han_lab_Llm_awq_Skip_QK_Projection_Clipping
Knowledge Sources
- Repo|llm-awq|https://github.com/mit-han-lab/llm-awq
Domains
- Quantization
- Optimization