
Implementation:Alibaba ROLL Convert Tool

From Leeroopedia


Knowledge Sources
Domains Checkpointing, CLI_Tools
Last Updated 2026-02-07 20:00 GMT

Overview

CLI tool for bidirectional checkpoint conversion between MCA (Megatron-Core Adapter) and HuggingFace formats, supporting LoRA adapters and precision casting.

Description

This module provides a command-line utility for converting model checkpoints between the MCA (Megatron-Core Adapter) internal format and the standard HuggingFace format. The conversion direction is automatically detected based on the presence of an mca_config.json file in the checkpoint or adapter directory.

ConvertArguments (lines 14-29): A @dataclass that defines the CLI arguments:

  • checkpoint_path: Path to the source checkpoint
  • adapter_path: Optional path to a LoRA adapter directory
  • output_path: Destination path for the converted checkpoint (default "./output")
  • bf16 / fp16: Precision flags (mutually exclusive, validated in __post_init__)
  • convert_model_max_length: Optional override for the model's maximum sequence length in the output config
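The fields above can be sketched as a minimal dataclass. This is an illustrative reconstruction from the page, not the module's exact source; the `__post_init__` body shown here only enforces the mutual exclusion of the precision flags described above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ConvertArguments:
    """Sketch of the CLI argument dataclass; field names and defaults
    follow the list above."""
    checkpoint_path: str
    adapter_path: Optional[str] = None
    output_path: str = "./output"
    bf16: bool = False
    fp16: bool = False
    convert_model_max_length: Optional[int] = None

    def __post_init__(self) -> None:
        # bf16 and fp16 are mutually exclusive, validated at construction time.
        if self.bf16 and self.fp16:
            raise ValueError("bf16 and fp16 are mutually exclusive")
```

Because the class is a plain dataclass, HfArgumentParser can map each field to a `--flag` of the same name automatically.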

convert_mca_to_hf (lines 32-50): Handles MCA-to-HF conversion. Determines the target torch_dtype from the precision flags, then delegates to convert_checkpoint_to_hf from the post_converter module. After conversion, it loads the output config and optionally updates model_max_length and max_position_embeddings if the user specified convert_model_max_length.
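The two post-conversion steps described above can be sketched as follows. Both helper names (`resolve_torch_dtype`, `patch_max_length`) are hypothetical, introduced here for illustration; dtypes are represented as strings to keep the sketch free of a torch dependency, whereas the real function passes actual torch dtypes to `convert_checkpoint_to_hf`.

```python
import json
import os
from typing import Optional

def resolve_torch_dtype(bf16: bool, fp16: bool) -> Optional[str]:
    """Map the precision flags to a target dtype name; None keeps the
    checkpoint's native dtype."""
    if bf16:
        return "bfloat16"
    if fp16:
        return "float16"
    return None

def patch_max_length(output_path: str, max_length: int) -> None:
    """After conversion, rewrite the exported config so both
    sequence-length fields reflect convert_model_max_length."""
    config_file = os.path.join(output_path, "config.json")
    with open(config_file) as f:
        config = json.load(f)
    config["model_max_length"] = max_length
    config["max_position_embeddings"] = max_length
    with open(config_file, "w") as f:
        json.dump(config, f, indent=2)
```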

main (lines 53-74): The entry point that parses CLI arguments using HuggingFace's HfArgumentParser into ConvertArguments and DistributingParallelArguments. It detects the conversion direction by checking for mca_config.json:

  • If mca_config.json is found (in adapter_path or checkpoint_path): converts MCA to HF
  • If mca_config.json is not found: converts HF to MCA using convert_checkpoint_to_mca
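The detection logic above amounts to a small filesystem check. A minimal sketch, assuming the adapter directory (when given) is consulted before the checkpoint directory; the function name `detect_direction` is hypothetical:

```python
import os
from typing import Optional

MCA_CONFIG_NAME = "mca_config.json"

def detect_direction(checkpoint_path: str,
                     adapter_path: Optional[str] = None) -> str:
    """Return "mca_to_hf" if an mca_config.json marks the source as
    MCA-format, otherwise "hf_to_mca"."""
    for directory in (adapter_path, checkpoint_path):
        if directory and os.path.exists(os.path.join(directory, MCA_CONFIG_NAME)):
            return "mca_to_hf"
    return "hf_to_mca"
```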

Usage

Run this tool from the command line to convert checkpoints between formats. Use HF-to-MCA conversion before training with Megatron-Core, and MCA-to-HF conversion after training to export the model back to HuggingFace format for inference or sharing.

Code Reference

Source Location

Signature

@dataclass
class ConvertArguments:
    checkpoint_path: str
    adapter_path: Optional[str] = None
    output_path: str = "./output"
    bf16: bool = False
    fp16: bool = False
    convert_model_max_length: Optional[int] = None

    def __post_init__(self) -> None: ...

def convert_mca_to_hf(convert_args: ConvertArguments) -> None: ...

def main() -> None: ...

Import

from mcore_adapter.tools.convert import ConvertArguments, convert_mca_to_hf, main

I/O Contract

Inputs

  • checkpoint_path (str, required): Path to the source model checkpoint directory
  • adapter_path (str or None, optional): Path to a LoRA adapter directory (if applicable)
  • output_path (str, optional): Destination directory for the converted checkpoint (default "./output")
  • bf16 (bool, optional): Convert weights to bfloat16 precision (default False)
  • fp16 (bool, optional): Convert weights to float16 precision (default False)
  • convert_model_max_length (int or None, optional): Override model_max_length and max_position_embeddings in the output config

Outputs

  • Converted checkpoint files written to output_path (side effect)
  • Updated config.json / mca_config.json written to output_path (side effect)

Usage Examples

# Command-line usage: Convert HuggingFace checkpoint to MCA format
# python mcore_adapter/tools/convert.py \
#     --checkpoint_path /models/qwen2.5-7b \
#     --output_path /checkpoints/qwen2.5-7b-mca \
#     --bf16 \
#     --tensor_model_parallel_size 4

# Command-line usage: Convert MCA checkpoint back to HuggingFace
# python mcore_adapter/tools/convert.py \
#     --checkpoint_path /checkpoints/qwen2.5-7b-mca \
#     --output_path /models/qwen2.5-7b-converted \
#     --bf16

# Command-line usage: Convert with LoRA adapter
# python mcore_adapter/tools/convert.py \
#     --checkpoint_path /checkpoints/base-model-mca \
#     --adapter_path /checkpoints/lora-adapter \
#     --output_path /models/merged-output \
#     --bf16

# Programmatic usage
from mcore_adapter.tools.convert import ConvertArguments, convert_mca_to_hf

args = ConvertArguments(
    checkpoint_path="/checkpoints/qwen2.5-7b-mca",
    output_path="/models/qwen2.5-7b-hf",
    bf16=True,
    convert_model_max_length=8192,
)
convert_mca_to_hf(args)
