Implementation:Mlc ai Mlc llm Param Mapping

Overview

The Param Mapping module (python/mlc_llm/loader/mapping.py) defines the core data structures used to map parameter names between external model formats (such as HuggingFace PyTorch or safetensors) and MLC LLM's internal model definitions. It provides two primary dataclasses: ExternMapping for converting source parameters to MLC parameters, and QuantizeMapping for handling parameter name and value transformations during quantization.

Location

File: python/mlc_llm/loader/mapping.py
Lines: 102
Module: mlc_llm.loader.mapping

Key Components

Type Alias: MapFuncVariadic

A union type that defines the allowable signatures for mapping functions. These are callables that accept zero to four np.ndarray arguments and return a single np.ndarray. This type captures the variadic nature of parameter combination functions (e.g., a function that merges Q, K, V projections takes three arrays).

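from typing import Callable, Union

import numpy as np

# One alternative per supported arity: zero to four input arrays, one output array.
MapFuncVariadic = Union[
    Callable[[], np.ndarray],
    Callable[[np.ndarray], np.ndarray],
    Callable[[np.ndarray, np.ndarray], np.ndarray],
    Callable[[np.ndarray, np.ndarray, np.ndarray], np.ndarray],
    Callable[[np.ndarray, np.ndarray, np.ndarray, np.ndarray], np.ndarray],
]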
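
As an illustration, the sketch below defines two functions whose signatures fit this alias: a one-argument pass-through and a three-argument concatenation. The names identity and fuse_qkv and the array shapes are hypothetical, chosen for the example rather than taken from mapping.py.

```python
import numpy as np


def identity(x: np.ndarray) -> np.ndarray:
    """One-argument mapping: return the source array unchanged."""
    return x


def fuse_qkv(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Three-argument mapping: stack Q, K, V weights along axis 0."""
    return np.concatenate([q, k, v], axis=0)


# Hypothetical shapes: 8 query rows, 4 key rows, 4 value rows, hidden size 16.
q = np.ones((8, 16), dtype="float16")
k = np.zeros((4, 16), dtype="float16")
v = np.full((4, 16), 2, dtype="float16")

fused = fuse_qkv(q, k, v)
print(fused.shape)  # (16, 16)
```

Both functions return a single np.ndarray, which is the only requirement the alias places on the result.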

ExternMapping

A dataclass that encapsulates the mapping from MLC LLM parameter names to their source parameter names in an external format such as HuggingFace PyTorch.

Fields:

Field	Type	Description
`param_map`	`Dict[str, List[str]]`	Maps each MLC parameter name to a list of source parameter names. For example, the fused `qkv_proj.weight` maps to separate `q_proj.weight`, `k_proj.weight`, and `v_proj.weight`.
`map_func`	`Dict[str, MapFuncVariadic]`	Maps each MLC parameter name to a callable that combines the source parameters into the MLC parameter. For instance, concatenating Q, K, V weight matrices along axis 0.
`unused_params`	`Set[str]`	A set of source parameter names that are present in the external weights but are not used by the MLC LLM model (e.g., `rotary_emb.inv_freq`).

Methods:

def add_mapping(
    self,
    map_from: str,
    map_to: List[str],
    func: MapFuncVariadic,
) -> None:
    """Add a mapping from MLC parameters to source parameters as well as a mapping function."""
    self.param_map[map_from] = map_to
    self.map_func[map_from] = func

def add_unused(self, name: str):
    """Add a parameter name in the source parameters to the set of unused parameters."""
    self.unused_params.add(name)

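add_mapping registers a new parameter correspondence: map_from is the MLC-side name, map_to is the list of source-side names, and func is the function that combines the source arrays into the MLC parameter. Because both fields are plain dictionaries keyed by map_from, registering the same name again overwrites the earlier entry.
add_unused records a source parameter name that should be ignored during loading.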
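
The following is a minimal sketch of how these fields and methods fit together. To keep it runnable without mlc_llm installed, it uses a local stand-in class, ExternMappingSketch, that mirrors the fields and methods described above; the class name, the parameter names, and the array shapes are assumptions made for the example.

```python
import dataclasses
from typing import Callable, Dict, List, Set

import numpy as np


@dataclasses.dataclass
class ExternMappingSketch:
    """Local stand-in mirroring the three fields and two methods described above."""

    param_map: Dict[str, List[str]] = dataclasses.field(default_factory=dict)
    map_func: Dict[str, Callable[..., np.ndarray]] = dataclasses.field(default_factory=dict)
    unused_params: Set[str] = dataclasses.field(default_factory=set)

    def add_mapping(self, map_from: str, map_to: List[str], func: Callable[..., np.ndarray]) -> None:
        self.param_map[map_from] = map_to
        self.map_func[map_from] = func

    def add_unused(self, name: str) -> None:
        self.unused_params.add(name)


mapping = ExternMappingSketch()
mlc_name = "model.layers.0.self_attn.qkv_proj.weight"  # hypothetical MLC-side name
mapping.add_mapping(
    mlc_name,
    [
        "model.layers.0.self_attn.q_proj.weight",
        "model.layers.0.self_attn.k_proj.weight",
        "model.layers.0.self_attn.v_proj.weight",
    ],
    lambda q, k, v: np.concatenate([q, k, v], axis=0),
)
mapping.add_unused("model.layers.0.self_attn.rotary_emb.inv_freq")

# Resolve one MLC parameter: look up its sources, then apply its function.
source_weights = {name: np.ones((2, 4), dtype="float32") for name in mapping.param_map[mlc_name]}
args = [source_weights[name] for name in mapping.param_map[mlc_name]]
fused = mapping.map_func[mlc_name](*args)
print(fused.shape)  # (6, 4)
```

The order of names in map_to matters, since the sources are passed to func positionally in that order.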

QuantizeMapping

A dataclass that encapsulates the mapping from a pre-quantization MLC parameter to its post-quantization names and values. This is used when quantization transforms a single weight into multiple artifacts (e.g., a quantized weight plus a scale tensor).

Fields:

Field	Type	Description
`param_map`	`Dict[str, List[str]]`	Maps a parameter name to its post-quantization destination names (e.g., `qkv_proj.weight` to `qkv_proj.weight_quantized` and `qkv_proj.weight_scale`).
`map_func`	`Dict[str, Callable[[Tensor], List[Tensor]]]`	Maps a parameter name to a function that splits the MLC parameter into the destination quantized parameters.
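
To make the one-to-many shape of map_func concrete, here is a hedged sketch that splits a weight into a quantized tensor and a scale tensor. It uses NumPy arrays in place of TVM tensors and a simple per-row symmetric int8 scheme; both are assumptions for illustration and are not the quantization schemes MLC LLM implements. The function name quantize_rowwise_int8 is hypothetical.

```python
from typing import Callable, Dict, List

import numpy as np


def quantize_rowwise_int8(weight: np.ndarray) -> List[np.ndarray]:
    """Split one float weight into [int8 weight, per-row float32 scale]."""
    max_abs = np.abs(weight).max(axis=1, keepdims=True)
    # Rows that are entirely zero get a scale of 1 to avoid dividing by zero.
    scale = np.where(max_abs == 0, 1.0, max_abs / 127.0).astype("float32")
    quantized = np.round(weight / scale).astype("int8")
    return [quantized, scale]


# The two dictionaries share keys, just like the fields of QuantizeMapping.
param_map: Dict[str, List[str]] = {
    "qkv_proj.weight": ["qkv_proj.weight_quantized", "qkv_proj.weight_scale"],
}
map_func: Dict[str, Callable[[np.ndarray], List[np.ndarray]]] = {
    "qkv_proj.weight": quantize_rowwise_int8,
}

weight = np.array([[1.0, -2.0, 0.5], [0.0, 0.0, 0.0]], dtype="float32")
outputs = map_func["qkv_proj.weight"](weight)

# Pair each destination name with the tensor produced at the same position.
named = dict(zip(param_map["qkv_proj.weight"], outputs))
for name, tensor in named.items():
    print(name, tensor.dtype, tensor.shape)
```

The list returned by the function must line up position by position with the destination names in param_map.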

The docstring documents two distinct use cases:

Case A (On-the-fly quantization): Both ExternMapping and QuantizeMapping are used together. Raw fp16/bf16/fp32 weights from HuggingFace are quantized as they are loaded into RAM.
Case B (Pre-quantized weights): A pass over nn.Module converts parameters from non-quantized to quantized form first, and then only ExternMapping is used to map the already-quantized parameters.
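
Case A can be sketched as a loop that first assembles each MLC parameter from its sources and then, if the parameter has a quantize entry, splits it into its destination tensors. The sketch below uses plain dictionaries and NumPy in place of the real classes and TVM tensors, and it assumes that parameters without a quantize entry are kept unchanged. The toy "quantizer" only casts to float16 and attaches a dummy scale; all names are hypothetical.

```python
from typing import Dict

import numpy as np

# Stand-ins for ExternMapping.param_map / map_func (hypothetical names).
extern_param_map = {
    "qkv_proj.weight": ["q_proj.weight", "k_proj.weight", "v_proj.weight"],
    "norm.weight": ["norm.weight"],
}
extern_map_func = {
    "qkv_proj.weight": lambda q, k, v: np.concatenate([q, k, v], axis=0),
    "norm.weight": lambda x: x,
}


def toy_quantize(weight: np.ndarray):
    """Placeholder quantizer: float16 copy plus a dummy scale of 1."""
    return [weight.astype("float16"), np.ones((1,), dtype="float32")]


# Stand-ins for QuantizeMapping.param_map / map_func.
quant_param_map = {
    "qkv_proj.weight": ["qkv_proj.weight_quantized", "qkv_proj.weight_scale"],
}
quant_map_func = {"qkv_proj.weight": toy_quantize}


def load_case_a(source: Dict[str, np.ndarray]) -> Dict[str, np.ndarray]:
    """Assemble each MLC parameter, quantizing it on the fly when mapped."""
    result: Dict[str, np.ndarray] = {}
    for mlc_name, source_names in extern_param_map.items():
        param = extern_map_func[mlc_name](*[source[n] for n in source_names])
        if mlc_name in quant_param_map:
            pieces = quant_map_func[mlc_name](param)
            result.update(zip(quant_param_map[mlc_name], pieces))
        else:
            # Assumption: parameters without a quantize entry pass through as-is.
            result[mlc_name] = param
    return result


source = {
    "q_proj.weight": np.ones((2, 4), dtype="float32"),
    "k_proj.weight": np.ones((1, 4), dtype="float32"),
    "v_proj.weight": np.ones((1, 4), dtype="float32"),
    "norm.weight": np.ones((4,), dtype="float32"),
}
loaded = load_case_a(source)
print(sorted(loaded))
```

In Case B the second dictionary pair would be absent from the loop, since the source weights already arrive in quantized form.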

Exports

__all__ = ["ExternMapping", "QuantizeMapping"]

The module publicly exports both ExternMapping and QuantizeMapping, which are consumed by the model-specific loader modules across the codebase.

Usage Pattern

Model-specific loaders (e.g., Llama, Mistral, Gemma3) follow a consistent pattern:

1. Instantiate the model and export its named parameters via TVM.
2. Create an ExternMapping instance.
3. Iterate over layers, calling add_mapping for parameters that require fusion (e.g., separate Q/K/V projections fused into a single QKV projection).
4. Call add_unused for source parameters not needed by the MLC model.
5. Give every remaining MLC parameter an identity mapping to the source parameter of the same name.
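
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the code of any particular loader: MappingSketch is a local stand-in for ExternMapping, the named_parameters dictionary (MLC name to dtype string) stands in for the parameters exported via TVM, and the dtype cast inside each mapping function is an assumption about what a loader might do.

```python
import functools
from typing import Dict

import numpy as np


class MappingSketch:
    """Local stand-in for ExternMapping (same fields and methods)."""

    def __init__(self) -> None:
        self.param_map = {}
        self.map_func = {}
        self.unused_params = set()

    def add_mapping(self, map_from, map_to, func) -> None:
        self.param_map[map_from] = map_to
        self.map_func[map_from] = func

    def add_unused(self, name: str) -> None:
        self.unused_params.add(name)


def build_mapping(named_parameters: Dict[str, str], num_layers: int) -> MappingSketch:
    mapping = MappingSketch()
    for i in range(num_layers):
        attn = f"model.layers.{i}.self_attn"
        mlc_name = f"{attn}.qkv_proj.weight"
        # Fuse the separate Q/K/V source weights into one MLC parameter.
        mapping.add_mapping(
            mlc_name,
            [f"{attn}.q_proj.weight", f"{attn}.k_proj.weight", f"{attn}.v_proj.weight"],
            functools.partial(
                lambda q, k, v, dtype: np.concatenate([q, k, v], axis=0).astype(dtype),
                dtype=named_parameters[mlc_name],
            ),
        )
        # Source-only tensor that the MLC model does not consume.
        mapping.add_unused(f"{attn}.rotary_emb.inv_freq")
    # Every parameter not handled above maps to the same-named source parameter.
    for mlc_name, dtype in named_parameters.items():
        if mlc_name not in mapping.param_map:
            mapping.add_mapping(
                mlc_name,
                [mlc_name],
                functools.partial(lambda x, dtype: x.astype(dtype), dtype=dtype),
            )
    return mapping


named_parameters = {
    "model.layers.0.self_attn.qkv_proj.weight": "float16",
    "model.layers.1.self_attn.qkv_proj.weight": "float16",
    "model.norm.weight": "float16",
}
mapping = build_mapping(named_parameters, num_layers=2)
print(len(mapping.param_map), len(mapping.unused_params))  # 3 2
```

Binding dtype with functools.partial rather than capturing the loop variable in a closure avoids the late-binding pitfall, where every lambda would otherwise see the value from the last iteration.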

Dependencies

dataclasses -- standard library for the dataclass decorator
typing -- standard library for the Callable, Dict, List, Set, and Union annotations
numpy -- used as the array type in mapping functions
tvm.runtime.Tensor -- used in quantize mapping function signatures
