Implementation: Alibaba ROLL Qwen3OmniMoe Registration
| Knowledge Sources | |
|---|---|
| Domains | Model_Architecture, Multimodal, Configuration |
| Last Updated | 2026-02-07 20:00 GMT |
Overview
Registers the Qwen3-Omni multimodal Mixture-of-Experts model with the mcore_adapter framework, defining weight conversion templates and distributed parallelism configuration.
Description
This module serves as the registration entry point for the Qwen3-Omni MoE model within the mcore_adapter ecosystem. It performs three principal registration tasks:
1. Qwen3OmniMoeTemplate class (lines 22-37): A @dataclass subclass of Template that overrides adjust_config_hf_to_mca to handle the hierarchical config structure of multi-modal HuggingFace models. The Qwen3-Omni HF config nests text parameters under thinker_config.text_config and multimodal token IDs under thinker_config, while audio output parameters (enable_audio_output, talker_config, code2wav_config) remain at the top level. This method remaps all HF config keys to their correct prefixed paths.
2. Distributed config registration (lines 40-49): Registers the MoE distribution configuration via register_dist_config, merging default and shared MoE configs with model-specific settings. Vision and audio model weights are designated as pre_process_weights (loaded on first pipeline stage), while talker and code2wav weights are post_process_weights (last stage). All non-text model weights are marked as duplicated_weights (replicated across pipeline stages).
3. Template registration (lines 57-140): Registers the full weight conversion template with:
- 63 config mappings from HF to MCA format covering attention, MoE, vision, audio, and speech output parameters
- 10 constant MCA config values (e.g., swiglu activation, mrope position embedding, RMSNorm normalization)
- 15 weight converter operations including RenameConverOp for direct weight renaming, QKVConverOp and QKVBiasConverOp for fusing separate Q/K/V projections, and StackConverOp for concatenating gate and up projections
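The hierarchical remapping that adjust_config_hf_to_mca performs (item 1 above) can be sketched as a plain-dict transformation. This is an illustrative helper, not the actual mcore_adapter code: only the key layout (text parameters under thinker_config.text_config, token IDs under thinker_config, audio output keys at the top level) is taken from the description above.

```python
# Illustrative sketch only: the real adjust_config_hf_to_mca works on the
# Template's own config state, not a bare dict. The nesting below mirrors
# the Qwen3-Omni HF config layout described in item 1.

def remap_hf_config(hf_config: dict) -> dict:
    """Flatten the nested Qwen3-Omni HF config into prefixed key paths."""
    remapped = {}
    thinker = hf_config.get("thinker_config", {})
    # Text model parameters are nested under thinker_config.text_config.
    for key, value in thinker.get("text_config", {}).items():
        remapped[f"thinker_config.text_config.{key}"] = value
    # Multimodal token IDs sit directly under thinker_config.
    for key, value in thinker.items():
        if key != "text_config":
            remapped[f"thinker_config.{key}"] = value
    # Audio output parameters stay at the top level of the HF config.
    for key in ("enable_audio_output", "talker_config", "code2wav_config"):
        if key in hf_config:
            remapped[key] = hf_config[key]
    return remapped
```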
Usage
This module is imported purely for its side effects. Importing the package mcore_adapter.models.qwen3_omni registers the "qwen3_omni_moe" model type, enabling AutoConfig.from_pretrained and the checkpoint conversion tools to handle this model architecture.
Code Reference
Source Location
- Repository: Alibaba_ROLL
- File: mcore_adapter/src/mcore_adapter/models/qwen3_omni/__init__.py
- Lines: 1-142
Signature
```python
@dataclass
class Qwen3OmniMoeTemplate(Template):
    def adjust_config_hf_to_mca(self) -> dict: ...

# Side-effect registrations:
register_dist_config("qwen3_omni_moe", ...)
register_template("qwen3_omni_moe", ...)
```
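The arguments to register_dist_config are elided above. As a rough, hypothetical illustration of the pipeline-stage placement it encodes: the weight-group prefixes and the helper below are invented for this sketch, and only the pre_process / post_process / duplicated split comes from the Description.

```python
# Hypothetical sketch: the real placement is driven by register_dist_config's
# pre_process_weights / post_process_weights / duplicated_weights lists.
# The prefix names below are invented for illustration.
PRE_PROCESS_PREFIXES = ("vision_model.", "audio_model.")   # first pipeline stage
POST_PROCESS_PREFIXES = ("talker.", "code2wav.")           # last pipeline stage

def rank_holds_weight(name: str, is_first_stage: bool, is_last_stage: bool) -> bool:
    """Decide whether a pipeline rank should load the named weight group."""
    if name.startswith(PRE_PROCESS_PREFIXES):
        return is_first_stage      # vision/audio encoders load on the first stage
    if name.startswith(POST_PROCESS_PREFIXES):
        return is_last_stage       # speech-output modules load on the last stage
    return True                    # text weights are partitioned normally across stages
```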
Import
```python
from mcore_adapter.models.qwen3_omni import Qwen3OmniMoeConfig, Qwen3OmniMoeModel
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| (module import) | N/A | Yes | Importing the module triggers registration of the model type "qwen3_omni_moe" with the config, dist config, and template registries |
Outputs
| Name | Type | Description |
|---|---|---|
| Qwen3OmniMoeConfig | class | The registered configuration class for the Qwen3-Omni MoE model |
| Qwen3OmniMoeModel | class | The model class for the Qwen3-Omni MoE architecture |
| (side effects) | N/A | Registers dist config and weight conversion template with global registries |
Usage Examples
```python
# Importing the package triggers all registrations
import mcore_adapter.models.qwen3_omni

# Now AutoConfig can resolve the "qwen3_omni_moe" model type
from mcore_adapter.models.auto.config_auto import AutoConfig

config = AutoConfig.from_pretrained("/path/to/qwen3-omni-moe-checkpoint")
print(config.num_moe_experts)
print(config.vision_config)

# The template is now available for checkpoint conversion
from mcore_adapter.models.converter.template import get_template

template = get_template("qwen3_omni_moe")
```
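For intuition, the two fusion patterns the converter ops perform (Q/K/V fusion and gate/up stacking) can be sketched on plain list-of-rows matrices. Note this is an assumption-laden simplification: Megatron's real fused QKV weight interleaves Q/K/V per attention head group, whereas plain concatenation is shown here only to indicate the direction of the mapping.

```python
# Sketch of the weight fusions described in the template registration.
# Real QKVConverOp output interleaves Q/K/V per attention head group;
# simple row-wise concatenation is used here purely for illustration.

def fuse_qkv(q_rows, k_rows, v_rows):
    """QKVConverOp-style: fuse separate Q/K/V projections into one matrix."""
    return q_rows + k_rows + v_rows

def stack_gate_up(gate_rows, up_rows):
    """StackConverOp-style: stack gate_proj and up_proj for the fused SwiGLU MLP."""
    return gate_rows + up_rows
```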