Workflow:Predibase Lorax Multi Adapter Merging
| Knowledge Sources | |
|---|---|
| Domains | LLM_Ops, Inference, Model_Merging |
| Last Updated | 2026-02-08 03:00 GMT |
Overview
End-to-end process for merging multiple LoRA adapters per request to create task-specific ensembles using linear, TIES, or DARE merge strategies.
Description
This workflow describes how to combine two or more LoRA adapters into a single merged adapter at inference time. Rather than serving each task-specific adapter on its own, the server blends several adapters using a configurable merge strategy (linear weighted average, TIES with sign consensus, DARE with random pruning, or DARE+TIES). This creates multi-task models on the fly without retraining, so a single request can draw on the capabilities of every adapter in the merge.
Usage
Execute this workflow when you have multiple specialized LoRA adapters (e.g., one for SQL generation, one for ad copy, one for instruction following) and want to serve them as a unified multi-task model. You need a running LoRAX server with the base model that all adapters were trained on.
Execution Steps
Step 1: Adapter_Inventory
Identify the set of LoRA adapters to merge. All adapters must be trained on the same base model currently deployed in the LoRAX server. Evaluate each adapter's specialization to understand what capabilities the merged ensemble will have.
Key considerations:
- All adapters must target the same base model, and therefore the same architecture
- Adapters can come from different sources (HuggingFace Hub, S3, local)
- Consider which tasks each adapter excels at to inform weight assignment
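The compatibility check above can be sketched for adapters that are already on local disk. This is a minimal sketch, assuming each adapter directory follows the PEFT layout with an `adapter_config.json` file that records `base_model_name_or_path`. The function name is hypothetical, and the plain string comparison is a simplification: a Hub ID and a local path can name the same base model.

```python
import json
from pathlib import Path


def find_mismatched_adapters(adapter_dirs, deployed_base):
    """Return {adapter_dir: recorded_base} for adapters whose recorded
    base model differs from the one deployed on the server.

    Assumes a PEFT-style adapter_config.json in each directory.
    """
    mismatched = {}
    for adapter_dir in adapter_dirs:
        config_path = Path(adapter_dir) / "adapter_config.json"
        with open(config_path, encoding="utf-8") as fh:
            config = json.load(fh)
        recorded_base = config.get("base_model_name_or_path")
        # Naive comparison: exact string match only.
        if recorded_base != deployed_base:
            mismatched[str(adapter_dir)] = recorded_base
    return mismatched
```

An empty result means every adapter in the inventory names the deployed base model. Any entry that comes back should be excluded or investigated before merging.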
Step 2: Strategy_Selection
Choose a merge strategy and configure its parameters. Four strategies are available, each with different trade-offs for handling interference between adapter weight spaces.
Available strategies:
- Linear (default): Weighted average of all adapter parameters. Simple and effective for small numbers of adapters.
- TIES: Uses task arithmetic with sparsification and sign-based consensus. Scales well to many adapters while retaining individual strengths. Requires the density parameter.
- DARE Linear: Random pruning with rescaling to reduce interference. Requires the density parameter.
- DARE TIES: Combines DARE random pruning with TIES sign consensus. Requires the density and majority_sign_method parameters.
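The parameter requirements listed above can be captured in a small validation helper that runs before any request is sent. This is a minimal sketch: the strategy identifiers (`linear`, `ties`, `dare_linear`, `dare_ties`), the field names, and the helper itself are assumptions for illustration and should be checked against the LoRAX version in use.

```python
# Required extra parameters per strategy, following the list above.
# The identifier spellings are assumed, not confirmed by this page.
REQUIRED_PARAMS = {
    "linear": set(),
    "ties": {"density"},
    "dare_linear": {"density"},
    "dare_ties": {"density", "majority_sign_method"},
}


def build_merge_config(ids, merge_strategy="linear", weights=None, **params):
    """Validate a merge configuration and return it as a plain dict."""
    if merge_strategy not in REQUIRED_PARAMS:
        raise ValueError(f"unknown merge strategy: {merge_strategy!r}")
    if len(ids) < 2:
        raise ValueError("merging needs at least two adapter IDs")
    if weights is None:
        # Equal influence when no weights are given.
        weights = [1.0] * len(ids)
    if len(weights) != len(ids):
        raise ValueError("weights must have one entry per adapter ID")
    missing = REQUIRED_PARAMS[merge_strategy] - set(params)
    if missing:
        raise ValueError(f"{merge_strategy} requires: {sorted(missing)}")
    density = params.get("density")
    if density is not None and not 0.0 < density <= 1.0:
        raise ValueError("density must be in the interval (0, 1]")
    return {
        "ids": list(ids),
        "weights": list(weights),
        "merge_strategy": merge_strategy,
        **params,
    }
```

For example, `build_merge_config(["org/sql-adapter", "org/ads-adapter"], "ties", density=0.2)` returns a dict with equal weights, while the same call with `"dare_ties"` raises because `majority_sign_method` is missing. The adapter IDs here are placeholders.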
Step 3: Weight_Configuration
Assign relative weights to each adapter in the ensemble. Weights control the influence of each adapter on the final merged output. Higher weights increase a given adapter's contribution to the merged result.
Key considerations:
- Weights default to 1.0 for all adapters if not specified
- For TIES and DARE strategies, the density parameter controls sparsity (fraction of weights retained)
- The majority_sign_method parameter (total or frequency) selects how the TIES-based strategies determine the consensus sign for each parameter
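The difference between the two sign-election methods is easiest to see on a single parameter position. The sketch below assumes the convention used by TIES implementations such as the one in PEFT: `total` sums the weighted values, while `frequency` counts positive and negative signs. Whether LoRAX follows exactly this convention is an assumption to verify, and `elect_sign` is a hypothetical helper.

```python
def elect_sign(values, method="total"):
    """Pick the consensus sign for one parameter position.

    `values` holds that position's weighted value from each adapter.
    """
    if method == "total":
        # Large magnitudes dominate: sum the values themselves.
        score = sum(values)
    elif method == "frequency":
        # Every adapter gets one vote: sum the signs only.
        score = sum((v > 0) - (v < 0) for v in values)
    else:
        raise ValueError(f"unknown majority_sign_method: {method!r}")
    return 1 if score >= 0 else -1


# One adapter pushes strongly positive, two push weakly negative.
values = [0.9, -0.2, -0.3]
print(elect_sign(values, "total"))      # 1
print(elect_sign(values, "frequency"))  # -1
```

Under `total`, one heavily weighted adapter can outvote several lightly weighted ones, so the weights assigned in this step interact directly with the sign method.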
Step 4: Merged_Inference
Submit the inference request with the MergedAdapters configuration. The server loads all specified adapters (if not already cached), applies the selected merge strategy to combine their weight matrices, and performs inference with the resulting merged adapter.
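A request of this kind could be assembled with only the standard library. This is a minimal sketch: the `/generate` path, the `inputs` and `parameters` body fields, and the `merged_adapters` key are assumptions about the server's REST interface and should be confirmed against the LoRAX API reference. The helper name and the server address are placeholders.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt, merged_adapters, max_new_tokens=64):
    """Build (but do not send) a POST request carrying a merge configuration."""
    body = {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            # Assumed field name for the MergedAdapters configuration.
            "merged_adapters": merged_adapters,
        },
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(request)` against a running server. Keeping construction separate from sending makes the payload easy to inspect or log before it goes out.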
What happens internally:
- Each adapter is downloaded and cached independently
- Merge strategies operate on the LoRA A/B weight matrices
- Linear merge computes a weighted average of adapter weights
- TIES/DARE apply sparsification before merging to reduce interference
- The merged weights are then applied like a single adapter when the model runs inference for that request
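The merge operations described above can be illustrated on flat lists of numbers. This is a dependency-free sketch of the general techniques, not the server's implementation: a real merge operates on the LoRA A/B tensors, and all function names here are hypothetical.

```python
import random


def linear_merge(adapters, weights):
    """Weighted average of corresponding entries across adapters."""
    total = sum(weights)
    size = len(adapters[0])
    return [
        sum(w * a[i] for w, a in zip(weights, adapters)) / total
        for i in range(size)
    ]


def magnitude_prune(values, density):
    """TIES-style trim: keep the `density` fraction with largest magnitude."""
    keep = max(1, int(len(values) * density))
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    kept = set(order[:keep])
    return [v if i in kept else 0.0 for i, v in enumerate(values)]


def random_prune(values, density, rng):
    """DARE-style drop: keep each entry with probability `density`,
    then rescale survivors by 1/density to preserve the expected value."""
    return [v / density if rng.random() < density else 0.0 for v in values]


sql_adapter = [0.1, -0.8, 0.3, -0.05]
ads_adapter = [0.3, 0.4, -0.1, 0.25]
merged = linear_merge([sql_adapter, ads_adapter], weights=[1.0, 1.0])
trimmed = magnitude_prune(sql_adapter, density=0.5)
dropped = random_prune(sql_adapter, density=0.5, rng=random.Random(0))
```

With `density=0.5`, `magnitude_prune` keeps only the two largest-magnitude entries of `sql_adapter`. `random_prune` keeps a random subset instead and doubles each survivor to compensate for the dropped entries.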
Step 5: Result_Evaluation
Evaluate the merged model's output quality across the different task domains. Test with prompts from each adapter's specialty to verify the ensemble retains the individual adapter capabilities.
Key considerations:
- Merged outputs may differ from individual adapter outputs
- Iterative tuning of weights and density can improve results
- Some merge strategies work better for specific adapter combinations
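The iterative tuning mentioned above can be organized as a small grid search over weights and density. This is a minimal sketch in which `generate` and `score` are caller-supplied callables: one sends a prompt with a given merge configuration to the server, the other rates the output for a task. Both are hypothetical, as are the configuration field names.

```python
import itertools


def candidate_configs(ids, weight_options, densities, merge_strategy="ties"):
    """Yield one merge configuration per weight/density combination."""
    for weights in itertools.product(weight_options, repeat=len(ids)):
        for density in densities:
            yield {
                "ids": list(ids),
                "weights": list(weights),
                "merge_strategy": merge_strategy,
                "density": density,
            }


def evaluate(configs, prompts_by_task, generate, score):
    """Return [(config, {task: mean_score})] for every configuration.

    generate(prompt, config) -> str
    score(task, prompt, output) -> float
    """
    results = []
    for config in configs:
        per_task = {}
        for task, prompts in prompts_by_task.items():
            scores = [score(task, p, generate(p, config)) for p in prompts]
            per_task[task] = sum(scores) / len(scores)
        results.append((config, per_task))
    return results
```

One reasonable way to pick a winner is `max(results, key=lambda r: min(r[1].values()))`, which favors the configuration whose weakest task scores highest and so guards against a merge that sacrifices one adapter's specialty.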