Principle:FlagOpen FlagEmbedding Multi Task Retrieval Embedder
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Large Language Models, Multi-Task Learning, Information Retrieval |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Multi-task retrieval training for LLM embedders jointly optimizes a single model across diverse downstream tasks, including semantic search, question answering, and in-context example retrieval, to produce a universal embedding model.
Description
This principle addresses the challenge of building a single LLM-based embedding model that performs well across a wide variety of retrieval scenarios. The approach trains on a mixture of datasets covering different retrieval types: asymmetric search (queries vs. documents), symmetric similarity (text-to-text), QA pairs, code search, and in-context example selection. The training framework applies a different loss function to each task type, including dense retrieval losses, language modeling objectives for retrieval-augmented generation, and a sentence representation learning objective (SRLM). The system also incorporates task-specific preprocessing, custom evaluation metrics (MRR, Recall@k, nDCG), and batch construction that balances the tasks. Compared with single-task training, this multi-task approach is intended to yield more robust embeddings that generalize better to unseen domains.
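The evaluation metrics named above (MRR, Recall@k, nDCG) can be illustrated with a minimal sketch. It assumes binary relevance labels and string document IDs; the function names are hypothetical and are not taken from the FlagEmbedding code.

```python
import math


def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance nDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant_ids
    )
    # The ideal ranking places every relevant item at the top.
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0
```

For a ranking `["d3", "d1", "d7", "d2"]` with relevant set `{"d1", "d2"}`, the first relevant document sits at rank 2, so the reciprocal rank is 0.5, and Recall@3 is also 0.5 because only one of the two relevant documents appears in the top three.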
Usage
Use this principle when:
- Building universal embedding models for production systems
- Training embedders that handle diverse retrieval scenarios
- Developing LLM-based retrievers for RAG applications
- Creating embeddings that work across multiple domains without per-domain fine-tuning
Theoretical Basis
The multi-task retrieval framework consists of:
- Task Taxonomy:
  - Dense retrieval: Query-document matching with a contrastive loss
  - Symmetric similarity: Text-pair similarity with a symmetric loss
  - In-context learning: Example retrieval for few-shot prompting
  - QA retrieval: Question-answer pair matching
- Multi-task Loss:
  - Combined objective: L(θ) = Σ_t λ_t · L_t(θ)
  - Here t indexes the tasks and λ_t is the weight of task t
  - L_dense = InfoNCE loss for retrieval
  - L_SRLM = Sentence representation loss
  - L_LM = Language modeling loss for generation
- Batch Construction:
  - Sample batches from multiple datasets simultaneously
  - Ensure task diversity within each batch
  - Balance high-resource and low-resource tasks
- Training Strategy:
  - Task sampling: Proportional to dataset size, or uniform
  - Gradient accumulation across tasks
  - Task-specific learning rates via parameter groups
- Evaluation Suite:
  - BEIR benchmark: Zero-shot retrieval across 18 datasets
  - MTEB: Massive Text Embedding Benchmark
  - Task-specific metrics: MRR, Recall@k, nDCG
  - In-context learning: Accuracy on downstream tasks
- Model Architecture:
  - Base: LLM backbone (Llama, Mistral, etc.)
  - Embedding extraction: Pooling over hidden states
  - Optional: Task-specific projection heads
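The combined objective listed under Multi-task Loss can be sketched in plain Python. This is an illustration only: the temperature value, the task names, and the function names are assumptions rather than settings from the FlagEmbedding code, and a real training loop would compute these quantities on tensors so that gradients can flow.

```python
import math


def info_nce(similarities, positive_index, temperature=0.05):
    """InfoNCE for one query: -log softmax(similarities / T)[positive_index].

    `similarities` holds the query's score against one positive and several
    negative candidates; `temperature` is an assumed hyperparameter.
    """
    logits = [s / temperature for s in similarities]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_norm = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_norm - logits[positive_index]


def combined_loss(task_losses, task_weights):
    """Weighted sum over tasks: L = sum_t lambda_t * L_t."""
    return sum(task_weights[task] * loss for task, loss in task_losses.items())


# Hypothetical per-task losses and weights for one training step.
losses = {
    "dense": info_nce([0.9, 0.2, 0.1], positive_index=0),
    "lm": 2.3,
}
weights = {"dense": 1.0, "lm": 0.5}
total = combined_loss(losses, weights)
```

Setting a weight to zero removes that task from the objective, which makes it straightforward to compare multi-task training against a single-task baseline.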
The key insight is that training on diverse tasks creates more generalizable representations through implicit regularization and knowledge transfer across domains.
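How much each task contributes to that transfer depends on how often it is sampled. The two task-sampling options named under Training Strategy (proportional to dataset size, or uniform) can be sketched as follows. The dataset names and sizes are hypothetical, and the function names are illustrative rather than part of the FlagEmbedding code.

```python
import random


def sampling_probs(dataset_sizes, strategy="proportional"):
    """Map each task to its probability of being drawn at a training step."""
    if strategy == "uniform":
        return {task: 1.0 / len(dataset_sizes) for task in dataset_sizes}
    if strategy == "proportional":
        total = sum(dataset_sizes.values())
        return {task: size / total for task, size in dataset_sizes.items()}
    raise ValueError(f"unknown strategy: {strategy}")


def sample_task_schedule(dataset_sizes, num_steps, strategy="proportional", seed=0):
    """Draw the task that supplies the batch at each of `num_steps` steps."""
    rng = random.Random(seed)  # a local generator keeps the schedule reproducible
    probs = sampling_probs(dataset_sizes, strategy)
    tasks = list(probs)
    weights = [probs[task] for task in tasks]
    return rng.choices(tasks, weights=weights, k=num_steps)


# Hypothetical dataset sizes: one high-resource and two low-resource tasks.
sizes = {"qa": 300_000, "icl": 50_000, "code_search": 50_000}
schedule = sample_task_schedule(sizes, num_steps=10, strategy="uniform")
```

Proportional sampling favors high-resource tasks, while uniform sampling gives low-resource tasks the same number of steps at the cost of repeating their examples more often.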
Related Pages
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_SRLM
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_Retrieval_Args
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_Retrieval_Data
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_Retrieval_Metrics
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_Dense_Model
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_Utils
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_BM25
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_Retrieval_Trainer
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_LM_Model
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_EvalNQ
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_Llama_Patch
- Implementation:FlagOpen_FlagEmbedding_LLM_Embedder_LM_Score