Implementation: Alibaba ROLL Convert Tool
| Knowledge Sources | Details |
|---|---|
| Domains | Checkpointing, CLI_Tools |
| Last Updated | 2026-02-07 20:00 GMT |
Overview
CLI tool for bidirectional checkpoint conversion between MCA (Megatron-Core Adapter) and HuggingFace formats, supporting LoRA adapters and precision casting.
Description
This module provides a command-line utility for converting model checkpoints between the MCA (Megatron-Core Adapter) internal format and the standard HuggingFace format. The conversion direction is automatically detected based on the presence of an mca_config.json file in the checkpoint or adapter directory.
ConvertArguments (lines 14-29): A @dataclass that defines the CLI arguments:
- checkpoint_path: Path to the source checkpoint
- adapter_path: Optional path to a LoRA adapter directory
- output_path: Destination path for the converted checkpoint (default "./output")
- bf16 / fp16: Precision flags (mutually exclusive, validated in __post_init__)
- convert_model_max_length: Optional override for the model's maximum sequence length in the output config
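The mutual-exclusion check mentioned above can be sketched as a minimal version of the dataclass; the exact error message and validation details are assumptions, only the fields and the bf16/fp16 constraint come from the source:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ConvertArguments:
    checkpoint_path: str
    adapter_path: Optional[str] = None
    output_path: str = "./output"
    bf16: bool = False
    fp16: bool = False
    convert_model_max_length: Optional[int] = None

    def __post_init__(self) -> None:
        # bf16 and fp16 are mutually exclusive precision flags;
        # the real module validates this here as well.
        if self.bf16 and self.fp16:
            raise ValueError("bf16 and fp16 cannot both be set")
```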
convert_mca_to_hf (lines 32-50): Handles MCA-to-HF conversion. Determines the target torch_dtype from the precision flags, then delegates to convert_checkpoint_to_hf from the post_converter module. After conversion, it loads the output config and optionally updates model_max_length and max_position_embeddings if the user specified convert_model_max_length.
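The dtype resolution and config override described above might look like the following sketch. The helper names are illustrative, and dtype names are shown as strings here to keep the example self-contained (the real code passes torch dtypes and works with a HuggingFace config object):

```python
from typing import Optional

def resolve_torch_dtype(bf16: bool, fp16: bool) -> Optional[str]:
    # Map the mutually exclusive precision flags to a target dtype.
    if bf16:
        return "bfloat16"
    if fp16:
        return "float16"
    return None  # keep the checkpoint's original precision

def apply_max_length_override(config: dict, max_length: Optional[int]) -> dict:
    # After conversion, optionally patch both length-related fields
    # in the output config, as the description above states.
    if max_length is not None:
        config["model_max_length"] = max_length
        config["max_position_embeddings"] = max_length
    return config
```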
main (lines 53-74): The entry point that parses CLI arguments using HuggingFace's HfArgumentParser into ConvertArguments and DistributingParallelArguments. It detects the conversion direction by checking for mca_config.json:
- If mca_config.json is found (in adapter_path or checkpoint_path): converts MCA to HF
- If mca_config.json is not found: converts HF to MCA using convert_checkpoint_to_mca
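The detection rule above can be sketched as follows; the helper name and return values are illustrative, but the marker file and the adapter-before-checkpoint search order come from the description:

```python
import os
from typing import Optional

MCA_CONFIG_NAME = "mca_config.json"  # marker file for the MCA format

def detect_direction(checkpoint_path: str,
                     adapter_path: Optional[str] = None) -> str:
    # Check the adapter directory first when one is given, then the
    # checkpoint directory, for the MCA marker file.
    search_dirs = [p for p in (adapter_path, checkpoint_path) if p]
    for d in search_dirs:
        if os.path.exists(os.path.join(d, MCA_CONFIG_NAME)):
            return "mca_to_hf"
    return "hf_to_mca"
```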
Usage
Run this tool from the command line to convert checkpoints between formats. Use HF-to-MCA conversion before training with Megatron-Core, and MCA-to-HF conversion after training to export the model back to HuggingFace format for inference or sharing.
Code Reference
Source Location
- Repository: Alibaba_ROLL
- File: mcore_adapter/tools/convert.py
- Lines: 1-77
Signature
@dataclass
class ConvertArguments:
    checkpoint_path: str
    adapter_path: Optional[str] = None
    output_path: str = "./output"
    bf16: bool = False
    fp16: bool = False
    convert_model_max_length: Optional[int] = None

    def __post_init__(self) -> None: ...

def convert_mca_to_hf(convert_args: ConvertArguments) -> None: ...

def main() -> None: ...
Import
from mcore_adapter.tools.convert import ConvertArguments, convert_mca_to_hf, main
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| checkpoint_path | str | Yes | Path to the source model checkpoint directory |
| adapter_path | str or None | No | Path to a LoRA adapter directory (if applicable) |
| output_path | str | No | Destination directory for converted checkpoint (default "./output") |
| bf16 | bool | No | Convert weights to bfloat16 precision (default False) |
| fp16 | bool | No | Convert weights to float16 precision (default False) |
| convert_model_max_length | int or None | No | Override model_max_length and max_position_embeddings in output config |
Outputs
| Name | Type | Description |
|---|---|---|
| (side effect) | files | Writes converted checkpoint files to output_path |
| (side effect) | config | Writes updated config.json / mca_config.json to output_path |
Usage Examples
# Command-line usage: Convert HuggingFace checkpoint to MCA format
python mcore_adapter/tools/convert.py \
    --checkpoint_path /models/qwen2.5-7b \
    --output_path /checkpoints/qwen2.5-7b-mca \
    --bf16 \
    --tensor_model_parallel_size 4

# Command-line usage: Convert MCA checkpoint back to HuggingFace
python mcore_adapter/tools/convert.py \
    --checkpoint_path /checkpoints/qwen2.5-7b-mca \
    --output_path /models/qwen2.5-7b-converted \
    --bf16

# Command-line usage: Convert with LoRA adapter
python mcore_adapter/tools/convert.py \
    --checkpoint_path /checkpoints/base-model-mca \
    --adapter_path /checkpoints/lora-adapter \
    --output_path /models/merged-output \
    --bf16
# Programmatic usage
from mcore_adapter.tools.convert import ConvertArguments, convert_mca_to_hf
args = ConvertArguments(
checkpoint_path="/checkpoints/qwen2.5-7b-mca",
output_path="/models/qwen2.5-7b-hf",
bf16=True,
convert_model_max_length=8192,
)
convert_mca_to_hf(args)