Implementation:Mlc ai Mlc llm Param Mapping

From Leeroopedia


Overview

The Param Mapping module (python/mlc_llm/loader/mapping.py) defines the core data structures used to map parameter names between external model formats (such as HuggingFace PyTorch or safetensors) and MLC LLM's internal model definitions. It provides two primary dataclasses: ExternMapping for converting source parameters to MLC parameters, and QuantizeMapping for handling parameter name and value transformations during quantization.

Location

  • File: python/mlc_llm/loader/mapping.py
  • Lines: 102
  • Module: mlc_llm.loader.mapping

Key Components

Type Alias: MapFuncVariadic

A union type that defines the allowable signatures for mapping functions. These are callables that accept zero to four np.ndarray arguments and return a single np.ndarray. This type captures the variadic nature of parameter combination functions (e.g., a function that merges Q, K, V projections takes three arrays).

MapFuncVariadic = Union[
    Callable[[], np.ndarray],
    Callable[[np.ndarray], np.ndarray],
    Callable[[np.ndarray, np.ndarray], np.ndarray],
    Callable[[np.ndarray, np.ndarray, np.ndarray], np.ndarray],
    Callable[[np.ndarray, np.ndarray, np.ndarray, np.ndarray], np.ndarray],
]
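
As an illustration of the arities this alias admits, the sketch below defines a one-argument and a three-argument mapping function. The function names (cast_to_fp16, fuse_qkv) and the array shapes are hypothetical and are not taken from the module.

```python
import numpy as np


def cast_to_fp16(x: np.ndarray) -> np.ndarray:
    """One-argument case: pass a single source array through, changing only its dtype."""
    return x.astype("float16")


def fuse_qkv(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Three-argument case: stack Q, K, V projection weights along axis 0."""
    return np.concatenate([q, k, v], axis=0)


# Hypothetical shapes: 4 query rows, 2 key rows, 2 value rows, hidden size 8.
q = np.ones((4, 8), dtype="float32")
k = np.zeros((2, 8), dtype="float32")
v = np.full((2, 8), 2.0, dtype="float32")

fused = fuse_qkv(q, k, v)  # rows 0-3 come from q, 4-5 from k, 6-7 from v
half = cast_to_fp16(q)
```

Both functions fit MapFuncVariadic because each takes between zero and four arrays and returns exactly one.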

ExternMapping

A dataclass that encapsulates the mapping from MLC LLM parameter names to their source parameter names in an external format such as HuggingFace PyTorch.

Fields:

Field | Type | Description
param_map | Dict[str, List[str]] | Maps each MLC parameter name to a list of source parameter names. For example, the fused qkv_proj.weight maps to separate q_proj.weight, k_proj.weight, and v_proj.weight.
map_func | Dict[str, MapFuncVariadic] | Maps each MLC parameter name to a callable that combines the source parameters into the MLC parameter. For instance, concatenating Q, K, V weight matrices along axis 0.
unused_params | Set[str] | A set of source parameter names that are present in the external weights but are not used by the MLC LLM model (e.g., rotary_emb.inv_freq).

Methods:

def add_mapping(
    self,
    map_from: str,
    map_to: List[str],
    func: MapFuncVariadic,
) -> None:
    """Add a mapping from MLC parameters to source parameters as well as a mapping function."""
    self.param_map[map_from] = map_to
    self.map_func[map_from] = func

def add_unused(self, name: str):
    """Add a parameter name in the source parameters to the set of unused parameters."""
    self.unused_params.add(name)

  • add_mapping registers a new parameter correspondence: map_from is the MLC-side name, map_to is the list of source-side names, and func is the function that combines the source arrays into the MLC parameter.
  • add_unused records a source parameter name that should be ignored during loading.
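
To make the interplay of the fields and methods concrete, here is a minimal sketch. ExternMappingSketch is a stand-in that mirrors the fields and methods described above so the example runs without MLC LLM installed; the source dictionary and the load_param helper are hypothetical and only illustrate how a loader might consume the mapping.

```python
import dataclasses
from typing import Callable, Dict, List, Set

import numpy as np


@dataclasses.dataclass
class ExternMappingSketch:
    """Stand-in with the three fields and two methods described above."""

    param_map: Dict[str, List[str]] = dataclasses.field(default_factory=dict)
    map_func: Dict[str, Callable[..., np.ndarray]] = dataclasses.field(default_factory=dict)
    unused_params: Set[str] = dataclasses.field(default_factory=set)

    def add_mapping(self, map_from: str, map_to: List[str], func) -> None:
        self.param_map[map_from] = map_to
        self.map_func[map_from] = func

    def add_unused(self, name: str) -> None:
        self.unused_params.add(name)


mapping = ExternMappingSketch()
mapping.add_mapping(
    "qkv_proj.weight",  # MLC-side (fused) name
    ["q_proj.weight", "k_proj.weight", "v_proj.weight"],  # source-side names
    lambda q, k, v: np.concatenate([q, k, v], axis=0),
)
mapping.add_unused("rotary_emb.inv_freq")

# Hypothetical source weights, as a loader might read them from a checkpoint.
source = {
    "q_proj.weight": np.ones((4, 8), dtype="float32"),
    "k_proj.weight": np.ones((2, 8), dtype="float32"),
    "v_proj.weight": np.ones((2, 8), dtype="float32"),
    "rotary_emb.inv_freq": np.arange(4, dtype="float32"),
}


def load_param(mlc_name: str) -> np.ndarray:
    """Hypothetical helper: gather the source arrays and apply the mapping function."""
    inputs = [source[name] for name in mapping.param_map[mlc_name]]
    return mapping.map_func[mlc_name](*inputs)


qkv = load_param("qkv_proj.weight")
```

The order of names in map_to matters here, because the arrays are passed to the mapping function positionally.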

QuantizeMapping

A dataclass that encapsulates the mapping from a pre-quantization MLC parameter to its post-quantization names and values. This is used when quantization transforms a single weight into multiple artifacts (e.g., a quantized weight plus a scale tensor).

Fields:

Field | Type | Description
param_map | Dict[str, List[str]] | Maps a parameter name to its post-quantization destination names (e.g., qkv_proj.weight to qkv_proj.weight_quantized and qkv_proj.weight_scale).
map_func | Dict[str, Callable[[Tensor], List[Tensor]]] | Maps a parameter name to a function that splits the MLC parameter into the destination quantized parameters.

The class docstring describes two distinct use cases:

  • Case A (On-the-fly quantization): Both ExternMapping and QuantizeMapping are used together. Raw fp16/bf16/fp32 weights from HuggingFace are quantized as they are loaded into RAM.
  • Case B (Pre-quantized weights): A pass over nn.Module converts parameters from non-quantized to quantized form first, and then only ExternMapping is used to map the already-quantized parameters.
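
The one-to-many shape of a quantize mapping can be sketched as follows. This is an illustration only: NumPy arrays stand in for tvm.runtime.Tensor, plain dictionaries stand in for the QuantizeMapping fields, and quantize_int8_per_row is a hypothetical, deliberately simple per-row int8 scheme rather than any quantization method that MLC LLM actually ships.

```python
from typing import List

import numpy as np


def quantize_int8_per_row(weight: np.ndarray) -> List[np.ndarray]:
    """Split one float weight into [int8 weight, per-row float32 scale]."""
    max_abs = np.abs(weight).max(axis=1, keepdims=True)
    # Use a scale of 1.0 for all-zero rows to avoid dividing by zero.
    scale = np.where(max_abs == 0, 1.0, max_abs / 127.0).astype("float32")
    quantized = np.clip(np.round(weight / scale), -127, 127).astype("int8")
    return [quantized, scale]


# Plain dicts standing in for QuantizeMapping.param_map and QuantizeMapping.map_func.
param_map = {
    "qkv_proj.weight": ["qkv_proj.weight_quantized", "qkv_proj.weight_scale"],
}
map_func = {"qkv_proj.weight": quantize_int8_per_row}

weight = np.array([[1.0, -2.0, 0.5], [0.0, 0.0, 0.0]], dtype="float32")

# Pair each destination name with the corresponding output of the split function.
name = "qkv_proj.weight"
quantized_params = dict(zip(param_map[name], map_func[name](weight)))
```

The list returned by the function lines up positionally with the destination names in param_map, which is why one entry in each dictionary is enough to describe the split.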

Exports

__all__ = ["ExternMapping", "QuantizeMapping"]

The module publicly exports both ExternMapping and QuantizeMapping, which are consumed by all model-specific loader modules across the codebase.

Usage Pattern

Every model-specific loader (e.g., Llama, Mistral, Gemma3) follows a consistent pattern:

  1. Instantiate the model and export its named parameters via TVM.
  2. Create an ExternMapping instance.
  3. Iterate over layers, calling add_mapping for parameters that require fusion (e.g., separate Q/K/V projections fused into a single QKV projection).
  4. Call add_unused for source parameters not needed by the MLC model.
  5. Map each remaining parameter one-to-one onto its source parameter with an identity mapping.
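
The steps above can be sketched roughly as follows. This is not code from any actual loader: ExternMappingSketch is a stand-in for ExternMapping, build_mapping is a hypothetical function, the parameter names imitate a Llama-style layout, and step 1 (instantiating the model and exporting its parameters via TVM) is replaced by a hard-coded list of names.

```python
import dataclasses
from typing import Callable, Dict, List, Set

import numpy as np


@dataclasses.dataclass
class ExternMappingSketch:
    """Stand-in for ExternMapping so the sketch runs without MLC LLM."""

    param_map: Dict[str, List[str]] = dataclasses.field(default_factory=dict)
    map_func: Dict[str, Callable[..., np.ndarray]] = dataclasses.field(default_factory=dict)
    unused_params: Set[str] = dataclasses.field(default_factory=set)

    def add_mapping(self, map_from: str, map_to: List[str], func) -> None:
        self.param_map[map_from] = map_to
        self.map_func[map_from] = func

    def add_unused(self, name: str) -> None:
        self.unused_params.add(name)


def build_mapping(mlc_param_names: List[str], num_layers: int) -> ExternMappingSketch:
    mapping = ExternMappingSketch()  # step 2
    for i in range(num_layers):
        attn = f"model.layers.{i}.self_attn"
        # Step 3: fuse separate Q/K/V source weights into one QKV parameter.
        mapping.add_mapping(
            f"{attn}.qkv_proj.weight",
            [f"{attn}.q_proj.weight", f"{attn}.k_proj.weight", f"{attn}.v_proj.weight"],
            lambda q, k, v: np.concatenate([q, k, v], axis=0),
        )
        # Step 4: mark a source parameter the MLC model does not need.
        mapping.add_unused(f"{attn}.rotary_emb.inv_freq")
    # Step 5: everything not mapped yet gets a one-to-one identity mapping.
    for name in mlc_param_names:
        if name not in mapping.param_map:
            mapping.add_mapping(name, [name], lambda x: x)
    return mapping


# Stand-in for step 1: names that would normally come from the exported model.
mlc_names = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.qkv_proj.weight",
    "model.layers.0.self_attn.o_proj.weight",
    "model.layers.1.self_attn.qkv_proj.weight",
    "model.layers.1.self_attn.o_proj.weight",
]
mapping = build_mapping(mlc_names, num_layers=2)
```

Handling the fused parameters first and the identity mappings last keeps the special cases explicit, while every other parameter is covered by a single loop.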

Dependencies

  • dataclasses -- standard library for the dataclass decorator
  • numpy -- used as the array type in mapping functions
  • tvm.runtime.Tensor -- used in quantize mapping function signatures
