Workflow:Predibase Lorax Multi Adapter Merging

From Leeroopedia


Knowledge Sources
Domains: LLM_Ops, Inference, Model_Merging
Last Updated: 2026-02-08 03:00 GMT

Overview

End-to-end process for merging multiple LoRA adapters per request to create task-specific ensembles using linear, TIES, or DARE merge strategies.

Description

This workflow describes how to combine two or more LoRA adapters into a single merged adapter at inference time. Rather than serving each task-specific adapter on its own, the server blends several adapters using a configurable merge strategy (linear weighted average, TIES with sign consensus, DARE with random pruning, or DARE+TIES). This creates a multi-task model on the fly without retraining, so a single request can draw on the capabilities of several adapters.

Usage

Execute this workflow when you have multiple specialized LoRA adapters (e.g., one for SQL generation, one for ad copy, one for instruction following) and want to serve them as a unified multi-task model. You need a running LoRAX server with the base model that all adapters were trained on.

Execution Steps

Step 1: Adapter_Inventory

Identify the set of LoRA adapters to merge. All adapters must be trained on the same base model currently deployed in the LoRAX server. Evaluate each adapter's specialization to understand what capabilities the merged ensemble will have.

Key considerations:

  • All adapters must be trained on the same base model, the one deployed in the LoRAX server
  • Adapters can come from different sources (HuggingFace Hub, S3, local)
  • Consider which tasks each adapter excels at to inform weight assignment
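
The inventory check can be sketched as follows. This is a minimal illustration, not part of LoRAX: the `AdapterInfo` record, the `check_inventory` helper, and every adapter and model name are hypothetical placeholders. It only verifies that each adapter lists the deployed base model before its ID is passed on to a merge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterInfo:
    """Hypothetical inventory record; not a LoRAX type."""
    adapter_id: str   # Hub repo name, S3 key, or local path
    source: str       # "hub", "s3", or "local"
    base_model: str   # base model the adapter was trained on
    task: str         # what the adapter specializes in


def check_inventory(adapters, deployed_base_model):
    """Return the adapter IDs if every adapter matches the deployed base model."""
    if len(adapters) < 2:
        raise ValueError("merging needs at least two adapters")
    mismatched = [a.adapter_id for a in adapters
                  if a.base_model != deployed_base_model]
    if mismatched:
        raise ValueError(f"not trained on {deployed_base_model}: {mismatched}")
    return [a.adapter_id for a in adapters]


# Placeholder names for illustration only.
inventory = [
    AdapterInfo("example-org/sql-lora", "hub", "example-org/base-7b", "SQL generation"),
    AdapterInfo("example-org/adcopy-lora", "s3", "example-org/base-7b", "ad copy"),
    AdapterInfo("example-org/instruct-lora", "local", "example-org/base-7b", "instruction following"),
]
adapter_ids = check_inventory(inventory, "example-org/base-7b")
```

Recording the task next to each adapter keeps the information needed later for weight assignment in one place.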

Step 2: Strategy_Selection

Choose a merge strategy and configure its parameters. Four strategies are available, each with different trade-offs for handling interference between adapter weight spaces.

Available strategies:

  • Linear (default): Weighted average of all adapter parameters. Simple and effective for small numbers of adapters.
  • TIES: Uses task arithmetic with sparsification and sign-based consensus. Scales well to many adapters while retaining individual strengths. Requires the density parameter; the majority_sign_method parameter also applies.
  • DARE Linear: Random pruning with rescaling to reduce interference. Requires the density parameter.
  • DARE TIES: Combines DARE random pruning with TIES sign consensus. Requires the density and majority_sign_method parameters.
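
The parameter requirements listed above can be captured in a small lookup table. This is a sketch under assumptions: the strategy identifiers (`linear`, `ties`, `dare_linear`, `dare_ties`) are assumed spellings that should be confirmed against the LoRAX documentation, and `missing_params` is a hypothetical helper rather than a LoRAX function.

```python
# Assumed strategy identifiers mapped to the parameters each one requires,
# following the list above.
REQUIRED_PARAMS = {
    "linear": (),
    "ties": ("density",),
    "dare_linear": ("density",),
    "dare_ties": ("density", "majority_sign_method"),
}


def missing_params(strategy, **params):
    """List the required parameters that were not supplied for a strategy."""
    if strategy not in REQUIRED_PARAMS:
        raise ValueError(f"unknown merge strategy: {strategy!r}")
    return [name for name in REQUIRED_PARAMS[strategy]
            if params.get(name) is None]


# DARE TIES with only a density still lacks the sign method.
gaps = missing_params("dare_ties", density=0.2)
```

Checking requirements on the client side surfaces a misconfigured strategy before any adapters are downloaded.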

Step 3: Weight_Configuration

Assign relative weights to each adapter in the ensemble. Weights control the influence of each adapter on the final merged output. Higher weights increase a given adapter's contribution to the merged result.

Key considerations:

  • Weights default to 1.0 for all adapters if not specified
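  • For TIES and DARE strategies, the density parameter is the fraction of weights retained, so a lower density produces sparser adapters before merging
  • The majority_sign_method parameter (total or frequency) selects how the sign consensus is computed: total sums the weight values at each position, frequency counts their signs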
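
A configuration with these defaults might be assembled as in the sketch below. The field names (`ids`, `weights`, `merge_strategy`, `density`, `majority_sign_method`) are assumed to mirror the MergedAdapters configuration and should be verified against the LoRAX client; the `build_merge_config` helper and the (0, 1] range check on density are illustrative assumptions.

```python
def build_merge_config(ids, weights=None, merge_strategy="linear",
                       density=None, majority_sign_method=None):
    """Assemble a merged-adapter config as a plain dict (field names assumed)."""
    if weights is None:
        weights = [1.0] * len(ids)   # default weight stated above
    if len(weights) != len(ids):
        raise ValueError("need exactly one weight per adapter")
    if density is not None and not 0.0 < density <= 1.0:
        raise ValueError("density is a fraction in (0, 1]")   # assumed range
    if majority_sign_method not in (None, "total", "frequency"):
        raise ValueError("majority_sign_method must be 'total' or 'frequency'")
    config = {"ids": list(ids), "weights": list(weights),
              "merge_strategy": merge_strategy}
    # Only include the optional fields that were actually set.
    if density is not None:
        config["density"] = density
    if majority_sign_method is not None:
        config["majority_sign_method"] = majority_sign_method
    return config


# Placeholder adapter names; the first adapter is given twice the weight.
default_config = build_merge_config(["example-org/sql-lora", "example-org/adcopy-lora"])
ties_config = build_merge_config(
    ["example-org/sql-lora", "example-org/adcopy-lora"],
    weights=[2.0, 1.0], merge_strategy="ties",
    density=0.2, majority_sign_method="total",
)
```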

Step 4: Merged_Inference

Submit the inference request with the MergedAdapters configuration. The server loads all specified adapters (if not already cached), applies the selected merge strategy to combine their weight matrices, and performs inference with the resulting merged adapter.
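
Submitting such a request over HTTP might look like the sketch below, which uses only the Python standard library. The `/generate` path, the `inputs`/`parameters` payload shape, and the `merged_adapters` field name are assumptions about the LoRAX REST API and should be checked against its reference; the endpoint URL and adapter names are placeholders.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt, merged_adapters, max_new_tokens=64):
    """Build (but do not send) a POST request carrying a merged-adapter config."""
    payload = {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "merged_adapters": merged_adapters,   # assumed field name
        },
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_generate_request(
    "http://127.0.0.1:8080",
    "Write a SQL query that counts orders per customer.",
    {"ids": ["example-org/sql-lora", "example-org/adcopy-lora"],
     "weights": [1.0, 1.0], "merge_strategy": "linear"},
)
# response = send(request)  # needs a running LoRAX server with the base model
```

Building the request separately from sending it makes the payload easy to inspect or log before it reaches the server.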

What happens internally:

  • Each adapter is downloaded and cached independently
  • Merge strategies operate on the LoRA A/B weight matrices
  • Linear merge computes weighted average of adapter weights
  • TIES/DARE apply sparsification before merging to reduce interference
  • The merged weights are then used in place of a single adapter's weights when running inference for that request
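
The merge arithmetic can be illustrated on toy data. The sketch below uses short Python lists as stand-ins for flattened LoRA weight matrices and follows the general published descriptions of linear, TIES, and DARE merging; it is not the LoRAX implementation, which may differ in details such as normalization. All function names and values are illustrative.

```python
import random


def sign(value):
    """Return -1, 0, or 1."""
    return (value > 0) - (value < 0)


def linear_merge(adapters, weights):
    """Weighted average of equally sized parameter vectors."""
    total = sum(weights)
    return [sum(w * a[i] for a, w in zip(adapters, weights)) / total
            for i in range(len(adapters[0]))]


def magnitude_prune(vector, density):
    """TIES-style trim: keep the largest-magnitude fraction, zero the rest."""
    keep_count = round(density * len(vector))
    ranked = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    keep = set(ranked[:keep_count])
    return [v if i in keep else 0.0 for i, v in enumerate(vector)]


def random_prune(vector, density, rng):
    """DARE-style drop: keep each entry with probability density, then rescale."""
    return [v / density if rng.random() < density else 0.0 for v in vector]


def sign_consensus_merge(adapters, weights, majority_sign_method="total"):
    """Elect a sign per position and average only the entries that agree."""
    weighted = [[w * v for v in a] for a, w in zip(adapters, weights)]
    merged = []
    for column in zip(*weighted):
        if majority_sign_method == "total":
            vote = sum(column)                    # sum of values
        elif majority_sign_method == "frequency":
            vote = sum(sign(v) for v in column)   # count of signs
        else:
            raise ValueError("majority_sign_method must be 'total' or 'frequency'")
        elected = 1 if vote >= 0 else -1
        agreeing = [v for v in column if sign(v) == elected]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged


def ties_merge(adapters, weights, density, majority_sign_method="total"):
    pruned = [magnitude_prune(a, density) for a in adapters]
    return sign_consensus_merge(pruned, weights, majority_sign_method)


def dare_linear_merge(adapters, weights, density, rng):
    pruned = [random_prune(a, density, rng) for a in adapters]
    return linear_merge(pruned, weights)


def dare_ties_merge(adapters, weights, density, rng, majority_sign_method="total"):
    pruned = [random_prune(a, density, rng) for a in adapters]
    return sign_consensus_merge(pruned, weights, majority_sign_method)


# Two toy "adapters" that disagree in sign at position 0.
sql = [4.0, -1.0, 0.5, 2.0]
adcopy = [-3.0, -2.0, 0.1, 1.0]
linear_result = linear_merge([sql, adcopy], [1.0, 1.0])
ties_result = ties_merge([sql, adcopy], [1.0, 1.0], density=0.5)
dare_result = dare_linear_merge([sql, adcopy], [1.0, 1.0], 0.5, random.Random(0))
```

In the toy data the two adapters conflict at position 0: the linear merge averages the opposing values toward zero, while the sign-consensus merge keeps only the value that matches the elected sign.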

Step 5: Result_Evaluation

Evaluate the merged model's output quality across the different task domains. Test with prompts from each adapter's specialty to verify the ensemble retains the individual adapter capabilities.

Key considerations:

  • Merged outputs may differ from individual adapter outputs
  • Iterative tuning of weights and density can improve results
  • Some merge strategies work better for specific adapter combinations
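
Iterative tuning can be organized as a small sweep over weights and densities. The sketch below is an assumption about how such an evaluation might be structured: `generate` and `score` are caller-supplied callables (for example, a wrapper around the LoRAX server and a task-specific metric), and choosing the configuration whose weakest task scores highest is one possible selection rule, not a recommendation from the source.

```python
from itertools import product


def candidate_configs(adapter_ids, weight_options, densities, merge_strategy="ties"):
    """Yield one config dict per combination of weights and density."""
    for weights, density in product(weight_options, densities):
        yield {"ids": list(adapter_ids), "weights": list(weights),
               "merge_strategy": merge_strategy, "density": density}


def evaluate(configs, prompts_by_task, generate, score):
    """Return a list of (config, {task: mean score}) pairs."""
    results = []
    for config in configs:
        per_task = {}
        for task, prompts in prompts_by_task.items():
            scores = [score(task, p, generate(p, config)) for p in prompts]
            per_task[task] = sum(scores) / len(scores)
        results.append((config, per_task))
    return results


def best_config(results):
    """Pick the config whose weakest task has the highest mean score."""
    return max(results, key=lambda item: min(item[1].values()))


# Stub callables so the sketch runs without a server; replace with real ones.
def fake_generate(prompt, config):
    return "x" * int(config["density"] * 10)


def fake_score(task, prompt, output):
    return len(output)


prompts_by_task = {
    "sql": ["Count orders per customer."],
    "ad_copy": ["Write a tagline for a bakery."],
}
configs = candidate_configs(
    ["example-org/sql-lora", "example-org/adcopy-lora"],   # placeholder names
    weight_options=[(1.0, 1.0), (2.0, 1.0)],
    densities=[0.2, 0.5],
)
results = evaluate(configs, prompts_by_task, fake_generate, fake_score)
winner, winner_scores = best_config(results)
```

Scoring each task separately makes it visible when a configuration improves one specialty at the expense of another.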
