Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Mlc ai Mlc llm CLI Calibrate

From Leeroopedia


Overview

The file python/mlc_llm/cli/calibrate.py implements the command-line interface for the calibration subcommand of MLC LLM. Calibration is the process of collecting activation statistics from a model by running it on a representative dataset, which is used to inform quantization decisions. This module parses CLI arguments and delegates to the core calibrate() function in mlc_llm.interface.calibrate.

Location

  • Repository: Mlc_ai_Mlc_llm
  • File: python/mlc_llm/cli/calibrate.py
  • Lines: 80

CLI Arguments

The module defines the following command-line arguments:

Argument Type Default Required Description
model (positional) str -- Yes The model to calibrate.
--device str "auto" No The device to deploy the model on.
--model-lib str None No Path to the compiled model library.
--output / -o str -- Yes Output path for calibration data.
--dataset str -- Yes Path to the calibration dataset (e.g., ShareGPT format).
--num-calibration-samples int 16 No Number of samples to use for calibration.
--seed int 0 No Random seed for reproducible sample selection.
--overrides EngineConfigOverride "" No Engine configuration overrides for serving parameters.

Implementation Details

Imports and Dependencies

from mlc_llm.interface.calibrate import calibrate
from mlc_llm.interface.help import HELP
from mlc_llm.support.argparse import ArgumentParser

from .serve import EngineConfigOverride

The module imports:

  • calibrate from mlc_llm.interface.calibrate -- the core calibration logic.
  • HELP from mlc_llm.interface.help -- a shared dictionary of help text strings for consistent CLI documentation.
  • EngineConfigOverride from mlc_llm.cli.serve -- a dataclass that parses engine configuration override strings (e.g., memory utilization, chunk sizes).

Main Function

The main(argv) function constructs the argument parser, parses the arguments, and calls the core calibrate() function:

def main(argv):
    """Main entrypoint for calibration."""
    parser = ArgumentParser("MLC LLM Calibration CLI")
    # ... argument definitions ...
    parsed = parser.parse_args(argv)
    calibrate(
        model=parsed.model,
        device=parsed.device,
        model_lib=parsed.model_lib,
        output=parsed.output,
        dataset=parsed.dataset,
        num_calibration_samples=parsed.num_calibration_samples,
        max_num_sequence=parsed.overrides.max_num_sequence,
        max_total_sequence_length=parsed.overrides.max_total_seq_length,
        prefill_chunk_size=parsed.overrides.prefill_chunk_size,
        max_history_size=parsed.overrides.max_history_size,
        gpu_memory_utilization=parsed.overrides.gpu_memory_utilization,
        seed=parsed.seed,
    )

EngineConfigOverride Integration

The --overrides argument uses the EngineConfigOverride.from_str classmethod as its type converter. This allows users to pass engine configuration parameters as a semicolon-separated string. The parsed override fields that are forwarded to the calibration function include:

  • max_num_sequence -- Maximum number of sequences processed concurrently.
  • max_total_seq_length -- Maximum total sequence length across all sequences.
  • prefill_chunk_size -- Size of chunks for prefill operations.
  • max_history_size -- Maximum history window size.
  • gpu_memory_utilization -- Fraction of GPU memory available for inference.

Dataset Reference

The source code includes a comment noting the recommended calibration dataset:

# Download dataset from
# https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json

This is the ShareGPT Vicuna dataset in JSON format, which provides multi-turn conversation data suitable for calibrating chat-oriented language models.

Design Notes

  • Help text is sourced from the shared HELP dictionary rather than being defined inline, ensuring consistent documentation across all CLI subcommands.
  • The module follows the standard MLC LLM CLI contract: it exposes a main(argv) function that receives the remaining argument list from the top-level dispatcher in __main__.py.
  • The thin CLI layer delegates all heavy logic to the mlc_llm.interface.calibrate module, maintaining separation between argument parsing and core functionality.

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment