
Implementation:Alibaba ROLL Qwen3OmniMoe Registration

From Leeroopedia


Knowledge Sources
Domains Model_Architecture, Multimodal, Configuration
Last Updated 2026-02-07 20:00 GMT

Overview

Registers the Qwen3-Omni multimodal Mixture-of-Experts model with the mcore_adapter framework, defining weight conversion templates and distributed parallelism configuration.

Description

This module serves as the registration entry point for the Qwen3-Omni MoE model within the mcore_adapter ecosystem. It performs three principal registration tasks:

1. Qwen3OmniMoeTemplate class (lines 22-37): A @dataclass subclass of Template that overrides adjust_config_hf_to_mca to handle the hierarchical config structure of multi-modal HuggingFace models. The Qwen3-Omni HF config nests text parameters under thinker_config.text_config and multimodal token IDs under thinker_config, while audio output parameters (enable_audio_output, talker_config, code2wav_config) remain at the top level. This method remaps all HF config keys to their correct prefixed paths.
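A minimal sketch of this remapping, using hypothetical key names for illustration (the actual key sets live in the module; only the three prefix destinations below come from the description above):

```python
# Hypothetical, simplified sketch of adjust_config_hf_to_mca's key remapping.
# The key sets are illustrative; the real module defines the full lists.
TEXT_KEYS = {"hidden_size", "num_attention_heads", "num_hidden_layers"}
THINKER_KEYS = {"image_token_id", "audio_token_id", "video_token_id"}
# enable_audio_output, talker_config, code2wav_config stay at the top level.

def remap_hf_key(key: str) -> str:
    """Map a flat HF config key to its nested path in the Qwen3-Omni HF config."""
    if key in TEXT_KEYS:
        return f"thinker_config.text_config.{key}"
    if key in THINKER_KEYS:
        return f"thinker_config.{key}"
    return key  # audio-output parameters remain unprefixed

print(remap_hf_key("hidden_size"))        # thinker_config.text_config.hidden_size
print(remap_hf_key("image_token_id"))     # thinker_config.image_token_id
print(remap_hf_key("enable_audio_output"))  # enable_audio_output
```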

2. Distributed config registration (lines 40-49): Registers the MoE distribution configuration via register_dist_config, merging default and shared MoE configs with model-specific settings. Vision and audio model weights are designated as pre_process_weights (loaded on first pipeline stage), while talker and code2wav weights are post_process_weights (last stage). All non-text model weights are marked as duplicated_weights (replicated across pipeline stages).
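The weight-placement logic can be sketched as follows; the weight-name prefixes and the plain-dict representation are assumptions for illustration, not the real register_dist_config payload:

```python
# Hypothetical sketch of the pipeline-stage weight classification described above.
# Prefix strings are assumed; the real dist config may name weights differently.
dist_config = {
    # loaded on the first pipeline stage
    "pre_process_weights": ["thinker.visual.", "thinker.audio_tower."],
    # loaded on the last pipeline stage
    "post_process_weights": ["talker.", "code2wav."],
}
# all non-text-model weights are replicated across pipeline stages
dist_config["duplicated_weights"] = (
    dist_config["pre_process_weights"] + dist_config["post_process_weights"]
)

def stage_for(weight_name: str, cfg: dict) -> str:
    """Decide which pipeline stage owns a given weight."""
    if any(weight_name.startswith(p) for p in cfg["pre_process_weights"]):
        return "first"
    if any(weight_name.startswith(p) for p in cfg["post_process_weights"]):
        return "last"
    return "sharded"  # text-model weights follow normal pipeline sharding

print(stage_for("thinker.visual.blocks.0.attn.qkv.weight", dist_config))  # first
print(stage_for("talker.model.layers.0.mlp.weight", dist_config))         # last
```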

3. Template registration (lines 57-140): Registers the full weight conversion template with:

  • 63 config mappings from HF to MCA format covering attention, MoE, vision, audio, and speech output parameters
  • 10 constant MCA config values (e.g., swiglu activation, mrope position embedding, RMSNorm normalization)
  • 15 weight converter operations including RenameConverOp for direct weight renaming, QKVConverOp and QKVBiasConverOp for fusing separate Q/K/V projections, and StackConverOp for concatenating gate and up projections
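The two fusion operations named above (QKV fusing and gate/up stacking) can be sketched numerically; this is an illustrative approximation using NumPy, not the real converter ops, which operate on framework tensors inside the template machinery:

```python
import numpy as np

def fuse_qkv(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Fuse separate Q/K/V projection weights into one matrix
    (the effect of QKVConverOp, sketched)."""
    return np.concatenate([q, k, v], axis=0)

def stack_gate_up(gate: np.ndarray, up: np.ndarray) -> np.ndarray:
    """Concatenate gate and up projections into the fused SwiGLU MLP weight
    (the effect of StackConverOp, sketched)."""
    return np.concatenate([gate, up], axis=0)

hidden = 8
q = np.zeros((hidden, hidden))       # full-width query projection
k = np.zeros((hidden // 2, hidden))  # grouped-query K/V are narrower
v = np.zeros((hidden // 2, hidden))
print(fuse_qkv(q, k, v).shape)                                   # (16, 8)
print(stack_gate_up(np.ones((16, 8)), np.ones((16, 8))).shape)   # (32, 8)
```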

Usage

This module is imported for its side effects during model registration. Importing the package mcore_adapter.models.qwen3_omni registers the "qwen3_omni_moe" model type, enabling AutoConfig.from_pretrained and the checkpoint conversion tools to handle this architecture.

Code Reference

Source Location

Signature

@dataclass
class Qwen3OmniMoeTemplate(Template):
    def adjust_config_hf_to_mca(self) -> dict: ...

# Side-effect registrations:
register_dist_config("qwen3_omni_moe", ...)
register_template("qwen3_omni_moe", ...)

Import

from mcore_adapter.models.qwen3_omni import Qwen3OmniMoeConfig, Qwen3OmniMoeModel

I/O Contract

Inputs

  • (module import) (type: N/A, required): Importing the module triggers registration of the model type "qwen3_omni_moe" with the config, dist config, and template registries.

Outputs

  • Qwen3OmniMoeConfig (class): The registered configuration class for the Qwen3-Omni MoE model.
  • Qwen3OmniMoeModel (class): The model class for the Qwen3-Omni MoE architecture.
  • (side effects): Registers the dist config and weight conversion template with global registries.

Usage Examples

# Importing the package triggers all registrations
import mcore_adapter.models.qwen3_omni

# Now AutoConfig can resolve "qwen3_omni_moe" model type
from mcore_adapter.models.auto.config_auto import AutoConfig

config = AutoConfig.from_pretrained("/path/to/qwen3-omni-moe-checkpoint")
print(config.num_moe_experts)
print(config.vision_config)

# The template is now available for checkpoint conversion
from mcore_adapter.models.converter.template import get_template
template = get_template("qwen3_omni_moe")
