Implementation:Mlc ai Mlc llm Mlc package config

Knowledge Sources	MLC-LLM
Domains	Deep_Learning, Mobile_Deployment
Last Updated	2026-02-09 00:00 GMT

Overview

mlc-package-config.json is the configuration file that MLC-LLM uses to declare which models are compiled and packaged for a mobile deployment.

Description

The mlc-package-config.json file is a JSON configuration file placed in the root of an MLC-LLM mobile application project (e.g., ios/MLCChat/ or android/MLCChat/). It is consumed by the mlc_llm package command to determine which models to compile, how to configure them, and whether to bundle their weights into the application package.

The file supports two device targets: iphone for iOS builds and android for Android builds. Each entry in the model_list array specifies a model source (typically a Hugging Face path prefixed with HF://), a unique identifier, an estimated VRAM budget in bytes, optional compilation overrides, and a weight bundling flag.

The packaging pipeline reads this file, downloads model artifacts as needed, JIT-compiles model libraries if not already cached, bundles weights for models with bundle_weight: true, and produces the final mlc-app-config.json consumed by the runtime engine.
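
The per-model decisions described above can be sketched in a few lines of Python. This is only an illustration of how the fields drive the pipeline, not the MLC-LLM implementation; the function name `plan_packaging` and the printed wording are hypothetical.

```python
import json


def plan_packaging(config: dict) -> list:
    """Summarize, per model entry, what the packaging step has to do.

    Hypothetical helper: it only inspects the config and performs no
    download, compilation, or copying.
    """
    plan = []
    for entry in config["model_list"]:
        plan.append(
            {
                "model_id": entry["model_id"],
                "source": entry["model"],
                # Optional fields fall back to the defaults documented on this page.
                "overrides": entry.get("overrides", {}),
                "bundle_weight": entry.get("bundle_weight", False),
            }
        )
    return plan


example = json.loads(
    """
    {
        "device": "iphone",
        "model_list": [
            {
                "model": "HF://mlc-ai/Llama-3.2-3B-Instruct-q4f16_1-MLC",
                "model_id": "Llama-3.2-3B-Instruct-q4f16_1-MLC",
                "estimated_vram_bytes": 3000000000,
                "overrides": {"prefill_chunk_size": 128},
                "bundle_weight": true
            }
        ]
    }
    """
)

for step in plan_packaging(example):
    weights = "bundle weights" if step["bundle_weight"] else "weights not bundled"
    print(f'{step["model_id"]}: compile for {example["device"]}, {weights}')
```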

Usage

Use this configuration file when:

Setting up a new MLC-LLM mobile application project with a specific set of models
Adding or removing models from an existing mobile deployment
Adjusting compilation parameters like prefill_chunk_size or context_window_size for memory-constrained devices
Enabling or disabling weight bundling for individual models
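
Each of the changes listed above is a small edit to the JSON structure. A minimal sketch in Python, assuming the config has already been loaded into a dict; the helper names `set_override` and `set_bundle_weight` are hypothetical and not part of MLC-LLM.

```python
import copy


def _find_entry(config: dict, model_id: str) -> dict:
    """Return the model_list entry with the given model_id, or raise KeyError."""
    for entry in config["model_list"]:
        if entry["model_id"] == model_id:
            return entry
    raise KeyError(f"no model with model_id {model_id!r}")


def set_override(config: dict, model_id: str, key: str, value: int) -> dict:
    """Return a copy of config with one compilation override set for one model."""
    updated = copy.deepcopy(config)
    entry = _find_entry(updated, model_id)
    entry.setdefault("overrides", {})[key] = value
    return updated


def set_bundle_weight(config: dict, model_id: str, enabled: bool) -> dict:
    """Return a copy of config with bundle_weight switched on or off for one model."""
    updated = copy.deepcopy(config)
    _find_entry(updated, model_id)["bundle_weight"] = enabled
    return updated


config = {
    "device": "android",
    "model_list": [
        {
            "model": "HF://mlc-ai/gemma-2-2b-it-q4f16_1-MLC",
            "model_id": "gemma-2-2b-it-q4f16_1-MLC",
            "estimated_vram_bytes": 3000000000,
        }
    ],
}

# Lower the prefill chunk size, then bundle the weights into the app package.
smaller = set_override(config, "gemma-2-2b-it-q4f16_1-MLC", "prefill_chunk_size", 128)
bundled = set_bundle_weight(smaller, "gemma-2-2b-it-q4f16_1-MLC", True)
```

Both helpers return a modified copy and leave the input untouched, so the original config can still be written back unchanged if an edit is abandoned.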

Code Reference

Source Location

Repository: MLC-LLM
File (iOS): ios/MLCChat/mlc-package-config.json (Lines 1-49)
File (Android): android/MLCChat/mlc-package-config.json (Lines 1-49)

Signature

{
    "device": "<iphone|android>",
    "model_list": [
        {
            "model": "<string: HF path or local path>",
            "model_id": "<string: unique identifier>",
            "estimated_vram_bytes": "<integer: bytes>",
            "overrides": {
                "prefill_chunk_size": "<integer>",
                "context_window_size": "<integer>",
                "sliding_window_size": "<integer>"
            },
            "bundle_weight": "<boolean: default false>"
        }
    ]
}

I/O Contract

Inputs

Name	Type	Required	Description
`device`	string	Yes	Target platform: `"iphone"` or `"android"`
`model_list`	array	Yes	Array of model entry objects
`model_list[].model`	string	Yes	Model source path, typically `HF://mlc-ai/<model-name>` for Hugging Face hosted models
`model_list[].model_id`	string	Yes	Unique identifier for the model, used as the runtime reference and weight directory name
`model_list[].estimated_vram_bytes`	integer	Yes	Estimated peak VRAM consumption in bytes for memory management
`model_list[].overrides`	object	No	Dictionary of compilation parameter overrides (e.g., `prefill_chunk_size`, `context_window_size`)
`model_list[].bundle_weight`	boolean	No	If true, model weights are bundled into the app package (default: false)
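
The table above can be turned into a quick pre-flight check before running the package command. The sketch below mirrors only the types and required flags listed on this page; it is not the validation that `mlc_llm package` itself performs, and `validate_package_config` is a hypothetical name.

```python
ALLOWED_DEVICES = ("iphone", "android")


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_package_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means no problem found."""
    errors = []
    if config.get("device") not in ALLOWED_DEVICES:
        errors.append("device must be 'iphone' or 'android'")

    model_list = config.get("model_list")
    if not isinstance(model_list, list):
        errors.append("model_list must be an array")
        return errors

    seen_ids = set()
    for index, entry in enumerate(model_list):
        where = f"model_list[{index}]"
        if not isinstance(entry, dict):
            errors.append(f"{where} must be an object")
            continue
        # Required fields.
        for key in ("model", "model_id"):
            if not isinstance(entry.get(key), str):
                errors.append(f"{where}.{key} must be a string")
        if not _is_int(entry.get("estimated_vram_bytes")):
            errors.append(f"{where}.estimated_vram_bytes must be an integer")
        # Optional fields are checked only when present.
        if "overrides" in entry and not isinstance(entry["overrides"], dict):
            errors.append(f"{where}.overrides must be an object")
        if "bundle_weight" in entry and not isinstance(entry["bundle_weight"], bool):
            errors.append(f"{where}.bundle_weight must be a boolean")
        # model_id doubles as the weight directory name, so it must be unique.
        model_id = entry.get("model_id")
        if isinstance(model_id, str):
            if model_id in seen_ids:
                errors.append(f"{where}.model_id {model_id!r} is not unique")
            seen_ids.add(model_id)
    return errors
```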

Outputs

Name	Type	Description
`mlc-app-config.json`	JSON file	Generated runtime configuration consumed by the mobile app engine, listing model IDs, library names, and model URLs or paths
Compiled model libraries	Static/shared libraries	Platform-specific compiled model inference libraries
Bundled weight directories	Directory	Model weight files copied into the output bundle (only for models with `bundle_weight: true`)

Usage Examples

iOS Configuration with Bundled Weights

{
    "device": "iphone",
    "model_list": [
        {
            "model": "HF://mlc-ai/Llama-3.2-3B-Instruct-q4f16_1-MLC",
            "model_id": "Llama-3.2-3B-Instruct-q4f16_1-MLC",
            "estimated_vram_bytes": 3000000000,
            "overrides": {
                "prefill_chunk_size": 128,
                "context_window_size": 2048
            },
            "bundle_weight": true
        }
    ]
}

Android Configuration with Multiple Models

{
    "device": "android",
    "model_list": [
        {
            "model": "HF://mlc-ai/Phi-3.5-mini-instruct-q4f16_0-MLC",
            "estimated_vram_bytes": 4250586449,
            "model_id": "Phi-3.5-mini-instruct-q4f16_0-MLC",
            "overrides": {
                "prefill_chunk_size": 128
            }
        },
        {
            "model": "HF://mlc-ai/gemma-2-2b-it-q4f16_1-MLC",
            "model_id": "gemma-2-2b-it-q4f16_1-MLC",
            "estimated_vram_bytes": 3000000000
        },
        {
            "model": "HF://mlc-ai/Mistral-7B-Instruct-v0.3-q4f16_1-MLC",
            "estimated_vram_bytes": 4115131883,
            "model_id": "Mistral-7B-Instruct-v0.3-q4f16_1-MLC",
            "overrides": {
                "sliding_window_size": 768,
                "prefill_chunk_size": 256
            }
        }
    ]
}
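
When a config lists several models, it can help to see which of them fit a given device. The sketch below compares each `estimated_vram_bytes` value from the example above against a memory budget. The budget figure and the helper name `models_within_budget` are assumptions made for illustration; the page does not say how the runtime applies this field.

```python
def models_within_budget(config: dict, budget_bytes: int) -> list:
    """Return the model_ids whose estimated VRAM fits within budget_bytes."""
    return [
        entry["model_id"]
        for entry in config["model_list"]
        if entry["estimated_vram_bytes"] <= budget_bytes
    ]


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30


android_config = {
    "device": "android",
    "model_list": [
        {"model_id": "Phi-3.5-mini-instruct-q4f16_0-MLC", "estimated_vram_bytes": 4250586449},
        {"model_id": "gemma-2-2b-it-q4f16_1-MLC", "estimated_vram_bytes": 3000000000},
        {"model_id": "Mistral-7B-Instruct-v0.3-q4f16_1-MLC", "estimated_vram_bytes": 4115131883},
    ],
}

# Hypothetical budget of 4.2 GB, chosen only to show the filter at work.
fits = models_within_budget(android_config, 4_200_000_000)
for entry in android_config["model_list"]:
    status = "fits" if entry["model_id"] in fits else "exceeds budget"
    print(f'{entry["model_id"]}: {to_gib(entry["estimated_vram_bytes"]):.2f} GiB, {status}')
```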

Running the Package Command

# Package for iOS using the config file
mlc_llm package \
    --package-config ios/MLCChat/mlc-package-config.json \
    --mlc-llm-source-dir /path/to/mlc-llm \
    --output dist/

# Package for Android using the config file
mlc_llm package \
    --package-config android/MLCChat/mlc-package-config.json \
    --mlc-llm-source-dir /path/to/mlc-llm \
    --output dist/
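
For build scripts, the same invocation can be assembled in Python. The sketch below uses only the flags shown in the commands above; the function name `build_package_command` is hypothetical, and the command is printed rather than executed.

```python
import shlex


def build_package_command(package_config: str, mlc_llm_source_dir: str, output_dir: str) -> list:
    """Build the argv list for the package command shown above."""
    return [
        "mlc_llm",
        "package",
        "--package-config",
        package_config,
        "--mlc-llm-source-dir",
        mlc_llm_source_dir,
        "--output",
        output_dir,
    ]


cmd = build_package_command(
    "ios/MLCChat/mlc-package-config.json",
    "/path/to/mlc-llm",
    "dist/",
)
# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))
# To run it for real, pass the list to subprocess.run(cmd, check=True).
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.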

Related Pages

Implements Principle

Principle:Mlc_ai_Mlc_llm_Model_Packaging_Configuration
