
Implementation:Mlc ai Mlc llm Jit

From Leeroopedia


Knowledge Sources

  • Domains: Deep_Learning, Model_Serving, Compiler_Optimization
  • Last Updated: 2026-02-09 00:00 GMT

Overview

A concrete tool provided by MLC-LLM for just-in-time compilation of model libraries.

Description

The jit function compiles an MLC-LLM model into a platform-specific shared library on demand at runtime. It reads the model's mlc-chat-config.json to extract the model type and quantization scheme, computes a deterministic MD5 hash over the full compilation configuration (model config, overrides, optimization flags, target device), and checks whether a cached compiled artifact already exists under MLC_LLM_HOME/model_lib/. On a cache hit, it returns the cached library path immediately. On a cache miss, it invokes mlc_llm compile as a subprocess, producing a shared object (.so, .dll, or .dylib) or a .tar archive for mobile targets, then atomically moves the result into the cache directory for future reuse.
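The cache-key scheme described above can be sketched in a few lines. The helper names `cache_key` and `cached_lib_path` are illustrative, not the actual internals of `jit.py`, and the real implementation hashes additional fields; the point is that the key is a deterministic MD5 over the JSON-serialized configuration, so identical configurations always map to the same cached artifact:

```python
import hashlib
import json
from pathlib import Path


def cache_key(model_config: dict, overrides: dict, opt: str, device: str) -> str:
    """Deterministic MD5 over the full compilation configuration (illustrative)."""
    payload = json.dumps(
        {"model_config": model_config, "overrides": overrides,
         "opt": opt, "device": device},
        sort_keys=True,  # key order must not affect the hash
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def cached_lib_path(mlc_llm_home: Path, key: str, suffix: str = ".so") -> Path:
    """Where a cache hit would be looked up under MLC_LLM_HOME/model_lib/."""
    return mlc_llm_home / "model_lib" / f"{key}{suffix}"
```

Because `sort_keys=True` canonicalizes the JSON, any change to an override, the optimization flag, or the target device produces a different key, which is what forces a recompile on configuration changes.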

The function respects the MLC_JIT_POLICY environment variable, which can be set to ON (default, compile and cache), OFF (disable JIT entirely), REDO (always recompile ignoring cache), or READONLY (only use cached artifacts, fail if not found). For mobile targets (iPhone, Android), the function also manages a system_lib_prefix to uniquely namespace the compiled system library.
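The four policy values reduce to a simple decision on whether to reuse the cache, recompile, or fail. The helper below is a sketch of that decision table, not the library's actual API; the function name and return strings are invented for illustration:

```python
def jit_decision(policy: str, cache_hit: bool) -> str:
    """Map MLC_JIT_POLICY and cache state to an action (illustrative)."""
    if policy == "OFF":
        # JIT disabled entirely: a pre-compiled model_lib must be supplied.
        raise RuntimeError("JIT is disabled; provide a pre-compiled model library")
    if policy == "REDO":
        return "compile"      # always recompile, ignoring any cached artifact
    if cache_hit:
        return "use_cache"    # ON and READONLY both reuse a cached library
    if policy == "READONLY":
        # Cache miss with compilation disallowed is an error.
        raise RuntimeError("no cached model library found")
    return "compile"          # ON (default): compile, then cache the result
```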

Usage

Use this function when no pre-compiled model library is available and you want MLC-LLM to transparently compile the model on first use. This is the default behavior when model_lib is not explicitly provided to the engine constructor. It is also useful during development to quickly iterate on model configuration changes without a separate build step.
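The engine-side default can be illustrated with a small stub. `resolve_model_lib` is a hypothetical helper, not part of MLC-LLM; it stands in for the engine constructor's internal dispatch between an explicitly supplied library and a JIT fallback:

```python
from typing import Callable, Optional


def resolve_model_lib(
    model_lib: Optional[str],
    compile_fn: Callable[[], str],
) -> str:
    """Use an explicit library path if given, else fall back to JIT compilation."""
    if model_lib is not None:
        return model_lib   # pre-compiled library supplied: no JIT needed
    return compile_fn()    # no model_lib given: trigger compilation on first use


# Stand-in for a call that would return jit(...).model_lib_path:
fake_jit = lambda: "/cache/model_lib/abc123.so"
```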

Code Reference

Source Location

  • Repository: MLC-LLM
  • File: python/mlc_llm/interface/jit.py (Lines 50-181)

Signature

def jit(
    model_path: Path,
    overrides: Dict[str, Any],
    device: Union[Device, str],
    system_lib_prefix: Optional[str] = None,
    *,
    skip_log_jit_policy: bool = False,
) -> JITResult:
    """Just-in-time compile a MLC-Chat model."""

Return Type

@dataclasses.dataclass
class JITResult:
    """The jit compilation result class."""
    model_lib_path: str
    system_lib_prefix: Optional[str] = None

Import

from mlc_llm.interface.jit import jit, JITResult

I/O Contract

Inputs

  • model_path (pathlib.Path, required): Path to the model directory containing mlc-chat-config.json.
  • overrides (Dict[str, Any], required): Dictionary of model configuration overrides (e.g., context_window_size, prefill_chunk_size, tensor_parallel_shards, opt). The opt key, if present, specifies optimization flags (defaults to "O2").
  • device (Union[tvm.runtime.Device, str], required): Target device for compilation, such as "cuda", "metal", "iphone", or "android".
  • system_lib_prefix (Optional[str], optional): Prefix for the system library name. Auto-generated for mobile targets if not provided. Defaults to None.
  • skip_log_jit_policy (bool, optional, keyword-only): If True, suppresses logging of the current JIT policy. Defaults to False.

Outputs

  • result (JITResult): A dataclass containing model_lib_path (str, the path to the compiled shared library) and system_lib_prefix (Optional[str], the system library prefix used for mobile targets).

Usage Examples

Basic Usage

from pathlib import Path
from mlc_llm.interface.jit import jit

# Compile a model for CUDA with default optimization
result = jit(
    model_path=Path("dist/models/Llama-2-7b-chat-hf-q4f16_1"),
    overrides={},
    device="cuda",
)
print(f"Compiled library at: {result.model_lib_path}")

With Configuration Overrides

from pathlib import Path
from mlc_llm.interface.jit import jit

# Compile with custom context window size and tensor parallelism
result = jit(
    model_path=Path("dist/models/Llama-2-7b-chat-hf-q4f16_1"),
    overrides={
        "context_window_size": 4096,
        "tensor_parallel_shards": 2,
        "opt": "O2",
    },
    device="cuda",
)
print(f"Compiled library at: {result.model_lib_path}")

Controlling JIT Policy via Environment Variable

import os
from pathlib import Path
from mlc_llm.interface.jit import jit

# Only use cached libraries, never compile
os.environ["MLC_JIT_POLICY"] = "READONLY"

try:
    result = jit(
        model_path=Path("dist/models/Llama-2-7b-chat-hf-q4f16_1"),
        overrides={},
        device="cuda",
    )
except RuntimeError as e:
    print(f"No cached library found: {e}")

Related Pages

Implements Principle

Environment and Heuristic Links
