Implementation:Mit han lab Llm awq Awq config export
Overview
A concrete tool, provided by the llm-awq library, that converts an AWQ-quantized checkpoint into a HuggingFace Hub-compatible layout and uploads it to a Hub repository.
Source
examples/convert_to_hf.py, Lines 44-69
Doc Type
This is a Wrapper Doc: it documents how the repository uses external HuggingFace APIs (AwqConfig, AutoConfig, HfApi) rather than an API defined by llm-awq itself.
Key APIs Used
# Excerpt: `args`, `tok` (the tokenizer), `api`, and the path variables
# are defined earlier in the script and are not shown here.

# Create quantization config
quantization_config = AwqConfig(
    bits=args.w_bit,
    group_size=args.q_group_size,
    zero_point=not args.no_zero_point,
    backend="llm-awq",
    version="gemv",
)

# Load the original model config and attach the quantization metadata
config = AutoConfig.from_pretrained(original_model_path)
config.quantization_config = quantization_config

# Push config.json and the tokenizer files to the Hub
config.push_to_hub(quantized_model_hub_path)
tok.push_to_hub(quantized_model_hub_path)

# Upload the quantized weights
api.upload_file(
    path_or_fileobj=quantized_model_path,
    path_in_repo="pytorch_model.bin",
    repo_id=quantized_model_hub_path,
    repo_type="model",
)
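The sequence above can be sketched as a single function that receives the config, tokenizer, and Hub client as arguments, so the call order can be dry-run with stand-in objects before anything touches the network. The function name `publish_quantized_model`, the `_Recorder` stand-in, and the example path and repository ID are hypothetical and not part of llm-awq; in real use the arguments would be the `AutoConfig` object, the tokenizer the script calls `tok`, and an `HfApi` instance.

```python
def publish_quantized_model(config, tokenizer, api, quantization_config,
                            weights_path, repo_id):
    """Attach quantization metadata to the config, then publish everything."""
    # Patch the original model config so loaders can detect AWQ weights.
    config.quantization_config = quantization_config
    # Publish config.json and the tokenizer files to the target repository.
    config.push_to_hub(repo_id)
    tokenizer.push_to_hub(repo_id)
    # Upload the quantized checkpoint under the weights filename used above.
    api.upload_file(
        path_or_fileobj=weights_path,
        path_in_repo="pytorch_model.bin",
        repo_id=repo_id,
        repo_type="model",
    )


class _Recorder:
    """Hypothetical stand-in that logs calls instead of contacting the Hub."""

    def __init__(self, name, log):
        self._name = name
        self._log = log

    def push_to_hub(self, repo_id):
        self._log.append((self._name, "push_to_hub", repo_id))

    def upload_file(self, **kwargs):
        self._log.append((self._name, "upload_file", kwargs))


def dry_run():
    """Run the publish sequence against recorders and return what happened."""
    log = []
    config = _Recorder("config", log)
    publish_quantized_model(
        config=config,
        tokenizer=_Recorder("tokenizer", log),
        api=_Recorder("api", log),
        quantization_config={"bits": 4, "group_size": 128},  # placeholder dict
        weights_path="awq_model.pt",   # hypothetical local checkpoint path
        repo_id="user/model-awq",      # hypothetical repository ID
    )
    return config, log
```

Passing the collaborators in, rather than constructing them inside the function, is a design choice made for this sketch only: it keeps the ordering (patch config, push config, push tokenizer, upload weights) checkable without credentials or network access.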
Import
from transformers import AwqConfig, AutoConfig
from huggingface_hub import HfApi
I/O
Inputs:
- original model path (str) - path to the original unquantized model
- quantized model path (str) - path to the AWQ-quantized checkpoint file
- hub repo path (str) - target HuggingFace Hub repository ID
- w_bit (int) - quantization bit width (e.g., 4)
- q_group_size (int) - quantization group size (e.g., 128)
- no_zero_point (bool) - when set, disables zero-point quantization (passed to AwqConfig as zero_point=not no_zero_point)
Outputs:
- HuggingFace Hub repository containing:
  - config.json - model configuration with quantization metadata
  - Tokenizer files (tokenizer.json, tokenizer_config.json, etc.)
  - pytorch_model.bin - the quantized model weights
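As a rough illustration of how the inputs listed above map onto the `AwqConfig` arguments, the sketch below parses the options with `argparse` and builds the keyword dictionary without importing `transformers`. The attribute names `w_bit`, `q_group_size`, and `no_zero_point` come from the script excerpt; the path flag names, the default values (taken from the "e.g." values above), and the helper names `build_parser` and `awq_config_kwargs` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the export inputs (flag names partly assumed)."""
    parser = argparse.ArgumentParser(description="AWQ export options (sketch)")
    # Path flag names are assumed; the excerpt only shows variable names.
    parser.add_argument("--model_path", type=str, required=True)
    parser.add_argument("--quantized_model_path", type=str, required=True)
    parser.add_argument("--quantized_model_hub_path", type=str, required=True)
    # Defaults below are assumptions based on the example values in the list.
    parser.add_argument("--w_bit", type=int, default=4)
    parser.add_argument("--q_group_size", type=int, default=128)
    parser.add_argument("--no_zero_point", action="store_true")
    return parser


def awq_config_kwargs(args):
    """Translate parsed options into the keyword arguments shown for AwqConfig."""
    return {
        "bits": args.w_bit,
        "group_size": args.q_group_size,
        # The CLI flag is negative, so it is inverted for the config.
        "zero_point": not args.no_zero_point,
        "backend": "llm-awq",
        "version": "gemv",
    }
```

With `transformers` installed, the resulting dictionary could be passed as `AwqConfig(**awq_config_kwargs(args))`. Note the inversion: omitting `--no_zero_point` yields `zero_point=True`.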
Related Pages
- Principle:Mit_han_lab_Llm_awq_AWQ_HuggingFace_Export
- Environment:Mit_han_lab_Llm_awq_Python_Runtime_Environment
Knowledge Sources
- Repo|llm-awq|https://github.com/mit-han-lab/llm-awq
- Doc|HuggingFace AWQ|https://huggingface.co/docs/transformers/quantization/awq
Domains
- Deployment
- Model_Distribution