
Implementation:Alibaba ROLL Adapter Utils

From Leeroopedia


Knowledge Sources
Domains: Model_Architecture, LoRA
Last Updated: 2026-02-07 20:00 GMT

Overview

Utility functions for discovering and annotating Megatron-Core model layers to support LoRA adapter application across linear, embedding, and router modules.

Description

This module provides a set of utility functions that traverse a Megatron-Core model's module hierarchy to identify specific layer types suitable for LoRA adapter attachment.

The set_linear_is_expert function marks linear modules residing within expert layers with an is_expert flag. This annotation is critical for Mixture-of-Experts (MoE) models, where expert layers require distinct handling during adapter application.

The core find_layers function implements a generic layer-discovery mechanism: it normalizes module names with a regex that replaces numeric indices with {} placeholders, deduplicates the repeated per-layer patterns, and returns a list of unique module name patterns that match a given condition.

Three convenience functions wrap find_layers to find all linear modules (TELinear, TEGroupedLinear, TELayerNormColumnParallelLinear), all embedding modules (LanguageModelEmbedding), and all router modules (TopKRouter).
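The name-normalization and deduplication step can be illustrated with a minimal sketch. This is not the actual mcore_adapter implementation; the function names and the mock `(name, module)` pairs below are illustrative assumptions:

```python
import re

def normalize_layer_name(name: str) -> str:
    """Replace numeric indices in a dotted module name with '{}' placeholders,
    so per-layer repeats like 'layers.0.*' and 'layers.1.*' collapse to one pattern."""
    return re.sub(r"\.\d+", ".{}", name)

def find_layer_patterns(named_modules, cond):
    """Sketch of a find_layers-style discovery loop: collect the unique,
    normalized names of modules for which cond(module) returns True."""
    patterns = []
    for name, module in named_modules:
        if cond(module):
            pattern = normalize_layer_name(name)
            if pattern not in patterns:  # deduplicate repeated layer patterns
                patterns.append(pattern)
    return patterns
```

With this scheme, `decoder.layers.0.self_attention.linear_qkv` and `decoder.layers.1.self_attention.linear_qkv` both normalize to `decoder.layers.{}.self_attention.linear_qkv`, so a 32-layer model yields one pattern per layer type rather than 32 near-duplicates.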

Usage

Use these utilities when applying LoRA adapters to Megatron-Core models. Call set_linear_is_expert before adapter application to ensure expert layers are properly annotated. Use the find_all_* functions to discover target module names for adapter configuration, particularly when building LoRA target module lists dynamically based on the model architecture.
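The expert-annotation pass described above can be sketched as follows. The `is_linear` attribute and the `.experts.` name-matching rule are assumptions made for illustration; the real logic lives in mcore_adapter.adapters.utils:

```python
from types import SimpleNamespace

def mark_expert_linears(named_modules) -> None:
    """Sketch of set_linear_is_expert-style annotation: set is_expert = True
    on linear modules that sit under an MoE expert sublayer.  Attribute names
    and the matching rule are illustrative assumptions, not the ROLL code."""
    for name, module in named_modules:
        # Only annotate linear modules located inside an expert branch.
        if getattr(module, "is_linear", False) and ".experts." in f".{name}.":
            module.is_expert = True
```

Running this before adapter application means downstream code can branch on `module.is_expert` when deciding how to wrap each linear layer, which is exactly the distinction MoE models need.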

Code Reference

Source Location

Signature

def set_linear_is_expert(model) -> None: ...

def find_layers(model: PreTrainedModel, cond: Callable) -> list: ...

def find_all_linear_modules(model) -> list: ...

def find_all_embedding_modules(model) -> list: ...

def find_all_router_modules(model) -> list: ...

Import

from mcore_adapter.adapters.utils import (
    set_linear_is_expert,
    find_layers,
    find_all_linear_modules,
    find_all_embedding_modules,
    find_all_router_modules,
)

I/O Contract

Inputs

model (PreTrainedModel, required): The Megatron-Core model to inspect for layer discovery or annotation
cond (Callable, required for find_layers): A callable that receives a module and returns True if it matches the target type

Outputs

set_linear_is_expert (None): Mutates the model in place by setting is_expert = True on qualifying linear modules within expert layers
find_layers (list): A list of unique module name patterns matching the condition
find_all_linear_modules (list): Names of all TELinear, TEGroupedLinear, and TELayerNormColumnParallelLinear modules
find_all_embedding_modules (list): Names of all LanguageModelEmbedding modules
find_all_router_modules (list): Names of all TopKRouter modules

Usage Examples

from mcore_adapter.adapters.utils import (
    set_linear_is_expert,
    find_all_linear_modules,
    find_all_embedding_modules,
    find_all_router_modules,
)

# `model` is an already-instantiated Megatron-Core model.
# Mark expert layers before applying LoRA
set_linear_is_expert(model)

# Discover all linear module names for LoRA targeting
linear_names = find_all_linear_modules(model)
# e.g., ["self_attention.linear_qkv", "self_attention.linear_proj", "mlp.linear_fc1", ...]

# Discover embedding modules
embedding_names = find_all_embedding_modules(model)
# e.g., ["embedding"]

# Discover router modules (for MoE models)
router_names = find_all_router_modules(model)
# e.g., ["mlp.router"]
