Implementation: Alibaba ROLL Qwen3OmniMoe Registration
| Knowledge Sources | |
|---|---|
| Domains | Model_Architecture, Multimodal, Configuration |
| Last Updated | 2026-02-07 20:00 GMT |
Overview
Registers the Qwen3-Omni multimodal Mixture-of-Experts model with the mcore_adapter framework, defining weight conversion templates and distributed parallelism configuration.
Description
This module serves as the registration entry point for the Qwen3-Omni MoE model within the mcore_adapter ecosystem. It performs three principal registration tasks:
1. Qwen3OmniMoeTemplate class (lines 22-37): A @dataclass subclass of Template that overrides adjust_config_hf_to_mca to handle the hierarchical config structure of multi-modal HuggingFace models. The Qwen3-Omni HF config nests text parameters under thinker_config.text_config and multimodal token IDs under thinker_config, while audio output parameters (enable_audio_output, talker_config, code2wav_config) remain at the top level. This method remaps all HF config keys to their correct prefixed paths.
2. Distributed config registration (lines 40-49): Registers the MoE distribution configuration via register_dist_config, merging default and shared MoE configs with model-specific settings. Vision and audio model weights are designated as pre_process_weights (loaded on first pipeline stage), while talker and code2wav weights are post_process_weights (last stage). All non-text model weights are marked as duplicated_weights (replicated across pipeline stages).
3. Template registration (lines 57-140): Registers the full weight conversion template with:
- 63 config mappings from HF to MCA format covering attention, MoE, vision, audio, and speech output parameters
- 10 constant MCA config values (e.g., swiglu activation, mrope position embedding, RMSNorm normalization)
- 15 weight converter operations including RenameConverOp for direct weight renaming, QKVConverOp and QKVBiasConverOp for fusing separate Q/K/V projections, and StackConverOp for concatenating gate and up projections
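The hierarchical remapping that adjust_config_hf_to_mca performs (item 1 above) can be sketched as a plain-dict transformation. This is an illustrative helper, not the actual mcore_adapter code: only the key layout (text parameters under thinker_config.text_config, token IDs under thinker_config, audio output keys at the top level) is taken from the description above.

```python
# Illustrative sketch only: the real adjust_config_hf_to_mca works on the
# Template's own config state, not a bare dict. The nesting below mirrors
# the Qwen3-Omni HF config layout described in item 1.

def remap_hf_config(hf_config: dict) -> dict:
    """Flatten the nested Qwen3-Omni HF config into prefixed key paths."""
    remapped = {}
    thinker = hf_config.get("thinker_config", {})
    # Text model parameters are nested under thinker_config.text_config.
    for key, value in thinker.get("text_config", {}).items():
        remapped[f"thinker_config.text_config.{key}"] = value
    # Multimodal token IDs sit directly under thinker_config.
    for key, value in thinker.items():
        if key != "text_config":
            remapped[f"thinker_config.{key}"] = value
    # Audio output parameters stay at the top level of the HF config.
    for key in ("enable_audio_output", "talker_config", "code2wav_config"):
        if key in hf_config:
            remapped[key] = hf_config[key]
    return remapped
```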
Usage
This module is imported purely for its side effects. Importing the package mcore_adapter.models.qwen3_omni registers the "qwen3_omni_moe" model type, enabling AutoConfig.from_pretrained and the checkpoint conversion tools to handle this model architecture.
Code Reference
Source Location
- Repository: Alibaba_ROLL
- File: mcore_adapter/src/mcore_adapter/models/qwen3_omni/__init__.py
- Lines: 1-142
Signature
```python
@dataclass
class Qwen3OmniMoeTemplate(Template):
    def adjust_config_hf_to_mca(self) -> dict: ...

# Side-effect registrations:
register_dist_config("qwen3_omni_moe", ...)
register_template("qwen3_omni_moe", ...)
```
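The arguments to register_dist_config are elided above. As a rough, hypothetical illustration of the pipeline-stage placement it encodes: the weight-group prefixes and the helper below are invented for this sketch, and only the pre_process / post_process / duplicated split comes from the Description.

```python
# Hypothetical sketch: the real placement is driven by register_dist_config's
# pre_process_weights / post_process_weights / duplicated_weights lists.
# The prefix names below are invented for illustration.
PRE_PROCESS_PREFIXES = ("vision_model.", "audio_model.")   # first pipeline stage
POST_PROCESS_PREFIXES = ("talker.", "code2wav.")           # last pipeline stage

def rank_holds_weight(name: str, is_first_stage: bool, is_last_stage: bool) -> bool:
    """Decide whether a pipeline rank should load the named weight group."""
    if name.startswith(PRE_PROCESS_PREFIXES):
        return is_first_stage      # vision/audio encoders load on the first stage
    if name.startswith(POST_PROCESS_PREFIXES):
        return is_last_stage       # speech-output modules load on the last stage
    return True                    # text weights are partitioned normally across stages
```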
Import
```python
from mcore_adapter.models.qwen3_omni import Qwen3OmniMoeConfig, Qwen3OmniMoeModel
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| (module import) | N/A | Yes | Importing the module triggers registration of the model type "qwen3_omni_moe" with the config, dist config, and template registries |
Outputs
| Name | Type | Description |
|---|---|---|
| Qwen3OmniMoeConfig | class | The registered configuration class for the Qwen3-Omni MoE model |
| Qwen3OmniMoeModel | class | The model class for the Qwen3-Omni MoE architecture |
| (side effects) | N/A | Registers dist config and weight conversion template with global registries |
Usage Examples
```python
# Importing the package triggers all registrations
import mcore_adapter.models.qwen3_omni

# Now AutoConfig can resolve the "qwen3_omni_moe" model type
from mcore_adapter.models.auto.config_auto import AutoConfig

config = AutoConfig.from_pretrained("/path/to/qwen3-omni-moe-checkpoint")
print(config.num_moe_experts)
print(config.vision_config)

# The template is now available for checkpoint conversion
from mcore_adapter.models.converter.template import get_template

template = get_template("qwen3_omni_moe")
```
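For intuition, the two fusion patterns the converter ops perform (Q/K/V fusion and gate/up stacking) can be sketched on plain list-of-rows matrices. Note this is an assumption-laden simplification: Megatron's real fused QKV weight interleaves Q/K/V per attention head group, whereas plain concatenation is shown here only to indicate the direction of the mapping.

```python
# Sketch of the weight fusions described in the template registration.
# Real QKVConverOp output interleaves Q/K/V per attention head group;
# simple row-wise concatenation is used here purely for illustration.

def fuse_qkv(q_rows, k_rows, v_rows):
    """QKVConverOp-style: fuse separate Q/K/V projections into one matrix."""
    return q_rows + k_rows + v_rows

def stack_gate_up(gate_rows, up_rows):
    """StackConverOp-style: stack gate_proj and up_proj for the fused SwiGLU MLP."""
    return gate_rows + up_rows
```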