Implementation:Mlc ai Mlc llm Mlc package config

From Leeroopedia


Knowledge Sources
Domains: Deep_Learning, Mobile_Deployment
Last Updated: 2026-02-09 00:00 GMT

Overview

A concrete configuration file, provided by MLC-LLM, that declaratively selects which models to package for mobile deployment and how to package them.

Description

The mlc-package-config.json file is a JSON configuration file placed in the root of an MLC-LLM mobile application project (e.g., ios/MLCChat/ or android/MLCChat/). It is consumed by the mlc_llm package command to determine which models to compile, how to configure them, and whether to bundle their weights into the application package.

The file supports two device targets: iphone for iOS builds and android for Android builds. Each entry in the model_list array specifies a model source (typically a Hugging Face path prefixed with HF://), a unique identifier, an estimated VRAM budget in bytes, optional compilation overrides, and a weight bundling flag.

The packaging pipeline reads this file, downloads model artifacts as needed, JIT-compiles model libraries if not already cached, bundles weights for models with bundle_weight: true, and produces the final mlc-app-config.json consumed by the runtime engine.
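The role of the bundle_weight flag in that pipeline can be illustrated with a small Python sketch. It is not the packaging command's own code: the helper name split_by_bundling is hypothetical, and it only shows how the flag (which defaults to false) partitions the model list.

```python
from __future__ import annotations

import json


def split_by_bundling(config: dict) -> tuple[list[str], list[str]]:
    """Return (bundled_ids, unbundled_ids) for a parsed mlc-package-config.json."""
    bundled, unbundled = [], []
    for entry in config.get("model_list", []):
        # bundle_weight is optional; a missing flag is treated as false.
        if entry.get("bundle_weight", False):
            bundled.append(entry["model_id"])
        else:
            unbundled.append(entry["model_id"])
    return bundled, unbundled


example = json.loads("""
{
    "device": "iphone",
    "model_list": [
        {
            "model": "HF://mlc-ai/Llama-3.2-3B-Instruct-q4f16_1-MLC",
            "model_id": "Llama-3.2-3B-Instruct-q4f16_1-MLC",
            "estimated_vram_bytes": 3000000000,
            "bundle_weight": true
        },
        {
            "model": "HF://mlc-ai/gemma-2-2b-it-q4f16_1-MLC",
            "model_id": "gemma-2-2b-it-q4f16_1-MLC",
            "estimated_vram_bytes": 3000000000
        }
    ]
}
""")

bundled, unbundled = split_by_bundling(example)
print("weights bundled into the app:", bundled)
print("weights not bundled:", unbundled)
```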

Usage

Use this configuration file when:

  • Setting up a new MLC-LLM mobile application project with a specific set of models
  • Adding or removing models from an existing mobile deployment
  • Adjusting compilation parameters like prefill_chunk_size or context_window_size for memory-constrained devices
  • Enabling or disabling weight bundling for individual models
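
The edits listed above can also be scripted rather than made by hand. The following is a minimal sketch that operates on the parsed JSON as a plain dictionary; the helper names update_model and remove_model are hypothetical and are not part of MLC-LLM.

```python
import copy
import json


def update_model(config, model_id, *, overrides=None, bundle_weight=None):
    """Return a copy of config with one model entry adjusted.

    overrides: dict merged into the entry's "overrides" object.
    bundle_weight: if not None, sets the entry's "bundle_weight" flag.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    for entry in updated["model_list"]:
        if entry["model_id"] == model_id:
            if overrides:
                entry.setdefault("overrides", {}).update(overrides)
            if bundle_weight is not None:
                entry["bundle_weight"] = bundle_weight
            return updated
    raise KeyError(f"no model with model_id {model_id!r}")


def remove_model(config, model_id):
    """Return a copy of config without the entry whose model_id matches."""
    updated = copy.deepcopy(config)
    updated["model_list"] = [
        e for e in updated["model_list"] if e["model_id"] != model_id
    ]
    return updated


config = {
    "device": "android",
    "model_list": [
        {
            "model": "HF://mlc-ai/gemma-2-2b-it-q4f16_1-MLC",
            "model_id": "gemma-2-2b-it-q4f16_1-MLC",
            "estimated_vram_bytes": 3000000000,
        },
        {
            "model": "HF://mlc-ai/Phi-3.5-mini-instruct-q4f16_0-MLC",
            "model_id": "Phi-3.5-mini-instruct-q4f16_0-MLC",
            "estimated_vram_bytes": 4250586449,
            "overrides": {"prefill_chunk_size": 128},
        },
    ],
}

# Shrink the context window and bundle weights for one model, then drop another.
tuned = update_model(
    config,
    "gemma-2-2b-it-q4f16_1-MLC",
    overrides={"context_window_size": 2048},
    bundle_weight=True,
)
trimmed = remove_model(tuned, "Phi-3.5-mini-instruct-q4f16_0-MLC")
print(json.dumps(trimmed, indent=4))
```

Both helpers return a modified copy, so the original dictionary can still be written back unchanged if the edit turns out to be unwanted.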

Code Reference

Source Location

  • Repository: MLC-LLM
  • File (iOS): ios/MLCChat/mlc-package-config.json (Lines 1-49)
  • File (Android): android/MLCChat/mlc-package-config.json (Lines 1-49)

Signature

{
    "device": "<iphone|android>",
    "model_list": [
        {
            "model": "<string: HF path or local path>",
            "model_id": "<string: unique identifier>",
            "estimated_vram_bytes": "<integer: bytes>",
            "overrides": {
                "prefill_chunk_size": "<integer>",
                "context_window_size": "<integer>",
                "sliding_window_size": "<integer>"
            },
            "bundle_weight": "<boolean: default false>"
        }
    ]
}

I/O Contract

Inputs

Name | Type | Required | Description
device | string | Yes | Target platform: "iphone" or "android"
model_list | array | Yes | Array of model entry objects
model_list[].model | string | Yes | Model source path, typically HF://mlc-ai/<model-name> for Hugging Face hosted models
model_list[].model_id | string | Yes | Unique identifier for the model, used as the runtime reference and weight directory name
model_list[].estimated_vram_bytes | integer | Yes | Estimated peak VRAM consumption in bytes for memory management
model_list[].overrides | object | No | Dictionary of compilation parameter overrides (e.g., prefill_chunk_size, context_window_size)
model_list[].bundle_weight | boolean | No | If true, model weights are bundled into the app package (default: false)
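
The input contract above can be checked before invoking the packaging command. The sketch below is an illustrative validator based only on the fields and types listed in this table; the function name validate_package_config is hypothetical, and the checks that mlc_llm itself performs may differ.

```python
SUPPORTED_DEVICES = {"iphone", "android"}  # the two targets listed on this page


def validate_package_config(config):
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    if config.get("device") not in SUPPORTED_DEVICES:
        errors.append('device must be "iphone" or "android"')

    model_list = config.get("model_list")
    if not isinstance(model_list, list):
        errors.append("model_list must be an array")
        return errors

    seen_ids = set()
    for i, entry in enumerate(model_list):
        prefix = f"model_list[{i}]"
        if not isinstance(entry, dict):
            errors.append(f"{prefix} must be an object")
            continue
        for key in ("model", "model_id"):
            if not isinstance(entry.get(key), str):
                errors.append(f"{prefix}.{key} must be a string")
        vram = entry.get("estimated_vram_bytes")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(vram, int) or isinstance(vram, bool):
            errors.append(f"{prefix}.estimated_vram_bytes must be an integer")
        if "overrides" in entry and not isinstance(entry["overrides"], dict):
            errors.append(f"{prefix}.overrides must be an object")
        if "bundle_weight" in entry and not isinstance(entry["bundle_weight"], bool):
            errors.append(f"{prefix}.bundle_weight must be a boolean")
        model_id = entry.get("model_id")
        if isinstance(model_id, str):
            if model_id in seen_ids:
                errors.append(f"{prefix}.model_id {model_id!r} is not unique")
            seen_ids.add(model_id)
    return errors


valid = {
    "device": "iphone",
    "model_list": [
        {
            "model": "HF://mlc-ai/Llama-3.2-3B-Instruct-q4f16_1-MLC",
            "model_id": "Llama-3.2-3B-Instruct-q4f16_1-MLC",
            "estimated_vram_bytes": 3000000000,
            "overrides": {"prefill_chunk_size": 128},
            "bundle_weight": True,
        }
    ],
}

# Deliberately broken: unknown device, string-typed fields, duplicate model_id.
invalid = {
    "device": "ipad",
    "model_list": [
        {
            "model": "HF://mlc-ai/gemma-2-2b-it-q4f16_1-MLC",
            "model_id": "gemma-2-2b-it-q4f16_1-MLC",
            "estimated_vram_bytes": "3000000000",
            "bundle_weight": "yes",
        },
        {
            "model": "HF://mlc-ai/gemma-2-2b-it-q4f16_1-MLC",
            "model_id": "gemma-2-2b-it-q4f16_1-MLC",
            "estimated_vram_bytes": 3000000000,
        },
    ],
}

print(validate_package_config(valid))
for problem in validate_package_config(invalid):
    print(problem)
```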

Outputs

Name | Type | Description
mlc-app-config.json | JSON file | Generated runtime configuration consumed by the mobile app engine, listing model IDs, library names, and model URLs or paths
Compiled model libraries | Static/shared libraries | Platform-specific compiled model inference libraries
Bundled weight directories | Directory | Model weight files copied into the output bundle (only for models with bundle_weight: true)

Usage Examples

iOS Configuration with Bundled Weights

{
    "device": "iphone",
    "model_list": [
        {
            "model": "HF://mlc-ai/Llama-3.2-3B-Instruct-q4f16_1-MLC",
            "model_id": "Llama-3.2-3B-Instruct-q4f16_1-MLC",
            "estimated_vram_bytes": 3000000000,
            "overrides": {
                "prefill_chunk_size": 128,
                "context_window_size": 2048
            },
            "bundle_weight": true
        }
    ]
}

Android Configuration with Multiple Models

{
    "device": "android",
    "model_list": [
        {
            "model": "HF://mlc-ai/Phi-3.5-mini-instruct-q4f16_0-MLC",
            "estimated_vram_bytes": 4250586449,
            "model_id": "Phi-3.5-mini-instruct-q4f16_0-MLC",
            "overrides": {
                "prefill_chunk_size": 128
            }
        },
        {
            "model": "HF://mlc-ai/gemma-2-2b-it-q4f16_1-MLC",
            "model_id": "gemma-2-2b-it-q4f16_1-MLC",
            "estimated_vram_bytes": 3000000000
        },
        {
            "model": "HF://mlc-ai/Mistral-7B-Instruct-v0.3-q4f16_1-MLC",
            "estimated_vram_bytes": 4115131883,
            "model_id": "Mistral-7B-Instruct-v0.3-q4f16_1-MLC",
            "overrides": {
                "sliding_window_size": 768,
                "prefill_chunk_size": 256
            }
        }
    ]
}
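
When a configuration lists several models, it can help to compare each estimated_vram_bytes value against the memory available on the target device. The sketch below uses the three estimates from the Android example; the helper name models_over_budget and the budget figures are hypothetical and chosen only for illustration.

```python
def models_over_budget(config, budget_bytes):
    """Return the model_ids whose estimated VRAM exceeds budget_bytes."""
    return [
        entry["model_id"]
        for entry in config["model_list"]
        if entry["estimated_vram_bytes"] > budget_bytes
    ]


# Only the fields needed for the check are repeated here.
android_config = {
    "device": "android",
    "model_list": [
        {"model_id": "Phi-3.5-mini-instruct-q4f16_0-MLC",
         "estimated_vram_bytes": 4250586449},
        {"model_id": "gemma-2-2b-it-q4f16_1-MLC",
         "estimated_vram_bytes": 3000000000},
        {"model_id": "Mistral-7B-Instruct-v0.3-q4f16_1-MLC",
         "estimated_vram_bytes": 4115131883},
    ],
}

for entry in android_config["model_list"]:
    gib = entry["estimated_vram_bytes"] / 2**30
    print(f'{entry["model_id"]}: {gib:.2f} GiB')

# Hypothetical budgets: 4 GiB and a tighter 4.2e9 bytes.
print(models_over_budget(android_config, 4 * 2**30))
print(models_over_budget(android_config, 4_200_000_000))
```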

Running the Package Command

# Package for iOS using the config file
mlc_llm package \
    --package-config ios/MLCChat/mlc-package-config.json \
    --mlc-llm-source-dir /path/to/mlc-llm \
    --output dist/

# Package for Android using the config file
mlc_llm package \
    --package-config android/MLCChat/mlc-package-config.json \
    --mlc-llm-source-dir /path/to/mlc-llm \
    --output dist/
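
For build scripts, the same invocations can be assembled from Python. This is a minimal sketch that uses only the three flags shown above; the helper names build_package_command and run_package are hypothetical, and the source directory remains the placeholder /path/to/mlc-llm.

```python
import subprocess


def build_package_command(package_config, mlc_llm_source_dir, output="dist/"):
    """Return the argument list for the packaging command shown above."""
    return [
        "mlc_llm", "package",
        "--package-config", package_config,
        "--mlc-llm-source-dir", mlc_llm_source_dir,
        "--output", output,
    ]


def run_package(package_config, mlc_llm_source_dir, output="dist/"):
    """Run the packaging command; raises CalledProcessError on a non-zero exit."""
    cmd = build_package_command(package_config, mlc_llm_source_dir, output)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, since the paths are placeholders.
    for platform in ("ios", "android"):
        cmd = build_package_command(
            f"{platform}/MLCChat/mlc-package-config.json",
            "/path/to/mlc-llm",
        )
        print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.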

Related Pages

Implements Principle
