Implementation:Mlc ai Mlc llm Param Mapping
Overview
The Param Mapping module (python/mlc_llm/loader/mapping.py) defines the core data structures used to map parameter names between external model formats (such as HuggingFace PyTorch or safetensors) and MLC LLM's internal model definitions. It provides two primary dataclasses: ExternMapping for converting source parameters to MLC parameters, and QuantizeMapping for handling parameter name and value transformations during quantization.
Location
- File: python/mlc_llm/loader/mapping.py
- Lines: 102
- Module: mlc_llm.loader.mapping
Key Components
Type Alias: MapFuncVariadic
A union type that defines the allowable signatures for mapping functions. These are callables that accept zero to four np.ndarray arguments and return a single np.ndarray. This type captures the variadic nature of parameter combination functions (e.g., a function that merges Q, K, V projections takes three arrays).
MapFuncVariadic = Union[
    Callable[[], np.ndarray],
    Callable[[np.ndarray], np.ndarray],
    Callable[[np.ndarray, np.ndarray], np.ndarray],
    Callable[[np.ndarray, np.ndarray, np.ndarray], np.ndarray],
    Callable[[np.ndarray, np.ndarray, np.ndarray, np.ndarray], np.ndarray],
]
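To illustrate the arities this alias admits, the sketch below defines a one-argument and a three-argument mapping function. The names identity_cast and fuse_qkv, along with the array shapes, are hypothetical and chosen only for this example; they are not taken from the module.

```python
import numpy as np


def identity_cast(x: np.ndarray) -> np.ndarray:
    """One-argument mapping: pass the array through, casting to float16."""
    return x.astype(np.float16)


def fuse_qkv(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Three-argument mapping: concatenate Q, K, V along axis 0."""
    return np.concatenate([q, k, v], axis=0)


# Toy shapes: 4 query rows, 2 key rows, 2 value rows, hidden size 8.
q = np.ones((4, 8), dtype=np.float32)
k = np.ones((2, 8), dtype=np.float32)
v = np.ones((2, 8), dtype=np.float32)

fused = fuse_qkv(q, k, v)
casted = identity_cast(q)
print(fused.shape)   # (8, 8)
print(casted.dtype)  # float16
```

Both functions return a single np.ndarray regardless of how many arrays they take, which is the property the union type expresses.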
ExternMapping
A dataclass that encapsulates the mapping from MLC LLM parameter names to their source parameter names in an external format such as HuggingFace PyTorch.
Fields:
| Field | Type | Description |
|---|---|---|
| param_map | Dict[str, List[str]] | Maps each MLC parameter name to a list of source parameter names. For example, the fused qkv_proj.weight maps to separate q_proj.weight, k_proj.weight, and v_proj.weight. |
| map_func | Dict[str, MapFuncVariadic] | Maps each MLC parameter name to a callable that combines the source parameters into the MLC parameter. For instance, concatenating Q, K, V weight matrices along axis 0. |
| unused_params | Set[str] | A set of source parameter names that are present in the external weights but are not used by the MLC LLM model (e.g., rotary_emb.inv_freq). |
Methods:
def add_mapping(
    self,
    map_from: str,
    map_to: List[str],
    func: MapFuncVariadic,
) -> None:
    """Add a mapping from MLC parameters to source parameters as well as a mapping function."""
    self.param_map[map_from] = map_to
    self.map_func[map_from] = func

def add_unused(self, name: str):
    """Add a parameter name in the source parameters to the set of unused parameters."""
    self.unused_params.add(name)
add_mapping registers a new parameter correspondence: map_from is the MLC-side name, map_to is the list of source-side names, and func is the transformation function. add_unused records a source parameter name that should be ignored during loading.
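A minimal sketch of how these two methods might be used follows. To stay runnable without MLC LLM installed, it defines ExternMappingSketch, a hypothetical stand-in that mirrors only the fields and methods described in this section. The resolve helper and the fake source dictionary are likewise assumptions made for illustration, not part of the real loader.

```python
import dataclasses
from typing import Callable, Dict, List, Set

import numpy as np


@dataclasses.dataclass
class ExternMappingSketch:
    """Hypothetical stand-in mirroring the fields described above."""

    param_map: Dict[str, List[str]] = dataclasses.field(default_factory=dict)
    map_func: Dict[str, Callable[..., np.ndarray]] = dataclasses.field(default_factory=dict)
    unused_params: Set[str] = dataclasses.field(default_factory=set)

    def add_mapping(self, map_from: str, map_to: List[str], func) -> None:
        self.param_map[map_from] = map_to
        self.map_func[map_from] = func

    def add_unused(self, name: str) -> None:
        self.unused_params.add(name)


attn = "model.layers.0.self_attn"
mapping = ExternMappingSketch()

# MLC-side fused name -> three source-side names, plus the combining function.
mapping.add_mapping(
    f"{attn}.qkv_proj.weight",
    [f"{attn}.q_proj.weight", f"{attn}.k_proj.weight", f"{attn}.v_proj.weight"],
    lambda q, k, v: np.concatenate([q, k, v], axis=0),
)
# Source parameter that exists in the checkpoint but is not loaded.
mapping.add_unused(f"{attn}.rotary_emb.inv_freq")

# Fake source weights standing in for a loaded checkpoint (assumed shapes).
source = {
    name: np.zeros((4, 8), dtype=np.float32)
    for name in mapping.param_map[f"{attn}.qkv_proj.weight"]
}


def resolve(mlc_name: str) -> np.ndarray:
    """Look up the source arrays for an MLC name and apply its mapping function."""
    inputs = [source[name] for name in mapping.param_map[mlc_name]]
    return mapping.map_func[mlc_name](*inputs)


fused = resolve(f"{attn}.qkv_proj.weight")
print(fused.shape)  # (12, 8)
```

Keeping param_map and map_func keyed by the same MLC-side name means a consumer can gather the source arrays and apply the function in a single lookup.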
QuantizeMapping
A dataclass that encapsulates the mapping from a pre-quantization MLC parameter to its post-quantization names and values. This is used when quantization transforms a single weight into multiple artifacts (e.g., a quantized weight plus a scale tensor).
Fields:
| Field | Type | Description |
|---|---|---|
| param_map | Dict[str, List[str]] | Maps a parameter name to its post-quantization destination names (e.g., qkv_proj.weight to qkv_proj.weight_quantized and qkv_proj.weight_scale). |
| map_func | Dict[str, Callable[[Tensor], List[Tensor]]] | Maps a parameter name to a function that splits the MLC parameter into the destination quantized parameters. |
The docstring documents two distinct use cases:
- Case A (On-the-fly quantization): Both ExternMapping and QuantizeMapping are used together. Raw fp16/bf16/fp32 weights from HuggingFace are quantized as they are loaded into RAM.
- Case B (Pre-quantized weights): A pass over nn.Module converts parameters from non-quantized to quantized form first, and then only ExternMapping is used to map the already-quantized parameters.
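The one-to-many shape of this mapping can be sketched as follows. This is an illustration under several assumptions: QuantizeMappingSketch is a hypothetical stand-in for the real dataclass, np.ndarray replaces tvm.runtime.Tensor so the example runs without TVM, and toy_int8_quantize is a simple per-tensor int8 scheme invented for the example, not MLC LLM's quantization. Passing unlisted parameters through unchanged is also an assumption of this sketch.

```python
import dataclasses
from typing import Callable, Dict, List

import numpy as np


@dataclasses.dataclass
class QuantizeMappingSketch:
    """Hypothetical stand-in; uses np.ndarray in place of TVM tensors."""

    param_map: Dict[str, List[str]]
    map_func: Dict[str, Callable[[np.ndarray], List[np.ndarray]]]


def toy_int8_quantize(weight: np.ndarray) -> List[np.ndarray]:
    """Toy per-tensor symmetric int8 quantization (illustrative only)."""
    max_abs = max(float(np.abs(weight).max()), 1e-8)  # avoid dividing by zero
    scale = np.float32(max_abs / 127.0)
    quantized = np.clip(np.round(weight / scale), -127, 127).astype(np.int8)
    return [quantized, np.array([scale], dtype=np.float32)]


quantize_map = QuantizeMappingSketch(
    param_map={
        "qkv_proj.weight": ["qkv_proj.weight_quantized", "qkv_proj.weight_scale"],
    },
    map_func={"qkv_proj.weight": toy_int8_quantize},
)

params = {
    "qkv_proj.weight": np.array([[1.0, -2.0], [0.5, 4.0]], dtype=np.float32),
    "norm.weight": np.ones(2, dtype=np.float32),
}

quantized_params = {}
for name, value in params.items():
    if name in quantize_map.param_map:
        # One input tensor becomes several named outputs.
        outputs = quantize_map.map_func[name](value)
        for new_name, new_value in zip(quantize_map.param_map[name], outputs):
            quantized_params[new_name] = new_value
    else:
        # Assumption of this sketch: unlisted parameters pass through as-is.
        quantized_params[name] = value

print(sorted(quantized_params))
```

The order of names in param_map must match the order of tensors returned by the mapping function, since the two lists are paired positionally.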
Exports
__all__ = ["ExternMapping", "QuantizeMapping"]
The module publicly exports both ExternMapping and QuantizeMapping, which are consumed by all model-specific loader modules across the codebase.
Usage Pattern
Every model-specific loader (e.g., Llama, Mistral, Gemma3) follows a consistent pattern:
- Instantiate the model and export its named parameters via TVM.
- Create an ExternMapping instance.
- Iterate over layers, calling add_mapping for parameters that require fusion (e.g., separate Q/K/V projections fused into a single QKV projection).
- Call add_unused for source parameters not needed by the MLC model.
- Fall through remaining parameters with identity mappings.
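The steps above can be sketched roughly as below. This is not a real loader: build_toy_mapping and ExternMappingSketch are hypothetical, and the first step (instantiating the model and exporting its parameters via TVM) is replaced by a plain dictionary from MLC parameter name to dtype string so that the example runs on NumPy alone.

```python
import functools
from typing import Dict

import numpy as np


class ExternMappingSketch:
    """Hypothetical simplified stand-in for the real dataclass."""

    def __init__(self) -> None:
        self.param_map = {}
        self.map_func = {}
        self.unused_params = set()

    def add_mapping(self, map_from, map_to, func) -> None:
        self.param_map[map_from] = map_to
        self.map_func[map_from] = func

    def add_unused(self, name: str) -> None:
        self.unused_params.add(name)


def build_toy_mapping(named_dtypes: Dict[str, str], num_layers: int) -> ExternMappingSketch:
    """named_dtypes stands in for the model's exported named parameters."""
    mapping = ExternMappingSketch()
    for i in range(num_layers):
        attn = f"model.layers.{i}.self_attn"
        mlc_name = f"{attn}.qkv_proj.weight"
        # Fusion: three source weights are concatenated into one MLC weight.
        mapping.add_mapping(
            mlc_name,
            [f"{attn}.q_proj.weight", f"{attn}.k_proj.weight", f"{attn}.v_proj.weight"],
            functools.partial(
                lambda q, k, v, dtype: np.concatenate([q, k, v], axis=0).astype(dtype),
                dtype=named_dtypes[mlc_name],
            ),
        )
        # Source-only parameter that the MLC model does not need.
        mapping.add_unused(f"{attn}.rotary_emb.inv_freq")
    # Identity fall-through: same name on both sides, only a dtype cast.
    for mlc_name, dtype in named_dtypes.items():
        if mlc_name not in mapping.param_map:
            mapping.add_mapping(
                mlc_name,
                [mlc_name],
                functools.partial(lambda x, dtype: x.astype(dtype), dtype=dtype),
            )
    return mapping


named_dtypes = {
    "model.layers.0.self_attn.qkv_proj.weight": "float16",
    "model.layers.0.self_attn.o_proj.weight": "float16",
    "model.norm.weight": "float16",
}
mapping = build_toy_mapping(named_dtypes, num_layers=1)
norm = mapping.map_func["model.norm.weight"](np.ones(4, dtype=np.float32))
print(len(mapping.param_map), norm.dtype)  # 3 float16
```

Binding dtype with functools.partial, rather than reading a loop variable inside the lambda, avoids Python's late-binding closure pitfall, where every mapping function would otherwise see the value from the final iteration.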
Dependencies
- dataclasses -- standard library for the dataclass decorator
- numpy -- used as the array type in mapping functions
- tvm.runtime.Tensor -- used in quantize mapping function signatures