Principle:Run llama Llama index Embedding Finetune Configuration

Overview

Embedding Finetune Configuration covers the design decisions and setup involved in configuring an embedding model for finetuning. This includes choosing between full Sentence Transformers finetuning and adapter-based finetuning, selecting appropriate loss functions, configuring training hyperparameters, and preparing the training infrastructure.

Concept: Sentence Transformers Finetuning vs Adapter-Based Finetuning

LlamaIndex supports two distinct approaches to embedding finetuning:

Approach	Description	When to Use
Sentence Transformers (Full)	Finetunes all parameters of a Sentence Transformer model on domain-specific data	When you have sufficient training data and want maximum performance improvement
Adapter-Based	Freezes the base embedding model and trains a lightweight adapter layer on top	When you want to preserve the base model's general capabilities while adding domain specialization

Full Finetuning

Full finetuning modifies all model weights. The SentenceTransformersFinetuneEngine loads a pretrained Sentence Transformer model (e.g., BAAI/bge-small-en) and trains it end-to-end on query-document pairs. This approach:

Provides the most flexibility for domain adaptation
Requires more training data for good generalization
Produces a self-contained model that can be loaded directly
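
As a rough illustration of the full-finetuning workflow described above, the sketch below wraps the engine in a helper function. The helper name run_full_finetune and the output path "exp_finetune" are assumptions made for this example, and the import path and keyword names reflect recent llama-index-finetuning releases, so they should be checked against the installed version. The import sits inside the function so that the snippet can be read and loaded without the library present.

```python
def run_full_finetune(train_dataset, val_dataset=None):
    """Hypothetical helper: finetune a Sentence Transformer end-to-end.

    `train_dataset` and `val_dataset` are assumed to be LlamaIndex
    embedding QA datasets (queries, corpus, relevance judgments).
    """
    # Imported lazily; assumes the llama-index-finetuning package is installed.
    from llama_index.finetuning import SentenceTransformersFinetuneEngine

    engine = SentenceTransformersFinetuneEngine(
        train_dataset,
        model_id="BAAI/bge-small-en",        # pretrained starting point
        model_output_path="exp_finetune",    # assumed output directory
        val_dataset=val_dataset,             # enables periodic evaluation
        batch_size=10,
        epochs=2,
        evaluation_steps=50,
    )
    engine.finetune()                        # trains all model weights
    return engine.get_finetuned_model()      # self-contained embedding model
```

The returned model is a standalone artifact, which is the main practical difference from the adapter approach.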

Adapter-Based Finetuning

Adapter finetuning adds a small trainable layer (typically a linear transformation) on top of frozen base embeddings. The EmbeddingAdapterFinetuneEngine embeds all queries and documents using the base model first, then trains the adapter to transform these embeddings for better retrieval. This approach:

Requires less training data
Preserves the base model's general-purpose capabilities
Results in a smaller additional model artifact (just the adapter weights)
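
The idea of a linear adapter over frozen embeddings can be sketched in plain Python. Everything below is hypothetical and for illustration only: the function names, the 3-dimensional vectors, and the choice to transform only the query embedding are assumptions, not the LlamaIndex implementation. The adapter starts as an identity transform, so before any training it reproduces the base model's ranking.

```python
def identity_matrix(dim):
    """Square identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]

def apply_adapter(weight, bias, embedding):
    """Return weight @ embedding + bias for a single embedding vector."""
    return [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weight, bias)
    ]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Frozen base-model outputs (made-up vectors, not real embeddings).
query_embedding = [0.2, 0.5, 0.1]
doc_embeddings = {"doc_a": [0.1, 0.9, 0.0], "doc_b": [0.8, 0.1, 0.3]}

# Identity weights and zero bias: the untrained adapter is a no-op.
weight = identity_matrix(3)
bias = [0.0, 0.0, 0.0]

# Assumption: only the query side is transformed; documents stay as embedded.
adapted_query = apply_adapter(weight, bias, query_embedding)
scores = {doc_id: dot(adapted_query, emb) for doc_id, emb in doc_embeddings.items()}
best_doc = max(scores, key=scores.get)
```

Training would then adjust only weight and bias, which is why the resulting artifact is small compared with a full model.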

Concept: Loss Functions for Embedding Training

The choice of loss function is critical for embedding finetuning quality:

MultipleNegativesRankingLoss (Default)

This is the default loss function used by SentenceTransformersFinetuneEngine. It implements a form of InfoNCE (Information Noise-Contrastive Estimation) loss:

Given a batch of (query, positive_document) pairs, it treats all other documents in the batch as negatives
The loss encourages the model to rank the positive document higher than all in-batch negatives
No explicit negative mining is needed -- negatives come "for free" from the batch

The mathematical formulation is:

L = -log( exp(sim(q, d+)) / sum_i(exp(sim(q, d_i))) )

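where sim is cosine similarity (the Sentence Transformers implementation multiplies it by a scale factor, 20 by default, before the softmax), d+ is the positive document, and d_i ranges over all documents in the batch, including d+.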
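
To make the formula concrete, here is a minimal pure-Python sketch of the in-batch negatives loss. It is not the Sentence Transformers implementation; the function names and toy 2-dimensional embeddings are assumptions, and scale defaults to 1.0 so that the result matches the formula exactly as written.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def in_batch_negatives_loss(query_embs, doc_embs, scale=1.0):
    """Mean of -log softmax over the batch; doc_embs[i] is the positive for query_embs[i]."""
    losses = []
    for i, q in enumerate(query_embs):
        # One logit per document in the batch; all but index i act as negatives.
        logits = [scale * cosine(q, d) for d in doc_embs]
        # Log-sum-exp with the max subtracted for numerical stability.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_docs = [[1.0, 0.0], [0.0, 1.0]]   # each positive matches its query
swapped_docs = [[0.0, 1.0], [1.0, 0.0]]   # each positive matches the other query

loss_aligned = in_batch_negatives_loss(queries, aligned_docs)
loss_swapped = in_batch_negatives_loss(queries, swapped_docs)
```

The aligned batch yields the lower loss, which is the behavior the training objective rewards.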

Custom Loss Functions

Users can provide any Sentence Transformers-compatible loss function via the loss parameter. Common alternatives include:

CosineSimilarityLoss -- When each text pair carries an explicit similarity score
TripletLoss -- When you have explicit (anchor, positive, negative) triplets
ContrastiveLoss -- When each text pair carries a binary similar/dissimilar label

Concept: Key Hyperparameters

Hyperparameter	Default	Impact
model_id	`"BAAI/bge-small-en"`	The pretrained model to start from. Larger models generally perform better but require more resources.
batch_size	`10`	Larger batches provide more in-batch negatives for MultipleNegativesRankingLoss, potentially improving quality.
epochs	`2`	Number of passes through the training data. Too many epochs can lead to overfitting on small datasets.
evaluation_steps	`50`	How often to evaluate on the validation set during training.
use_all_docs	`False`	If True, creates training pairs for all relevant documents per query (not just the first).
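
The effect of use_all_docs can be illustrated with a small sketch of pair construction. The function build_training_pairs and the dictionary layout (query id to text, document id to text, query id to list of relevant document ids) are assumptions made for this example rather than the engine's actual code.

```python
def build_training_pairs(queries, corpus, relevant_docs, use_all_docs=False):
    """Return (query_text, document_text) pairs for contrastive training."""
    pairs = []
    for query_id, query_text in queries.items():
        doc_ids = relevant_docs[query_id]
        if not use_all_docs:
            # Default behavior: keep only the first relevant document.
            doc_ids = doc_ids[:1]
        for doc_id in doc_ids:
            pairs.append((query_text, corpus[doc_id]))
    return pairs

queries = {"q1": "How are warmup steps set?", "q2": "What is the default loss?"}
corpus = {
    "d1": "Warmup is 10% of total steps.",
    "d2": "The learning rate ramps up from zero.",
    "d3": "MultipleNegativesRankingLoss is the default.",
}
relevant_docs = {"q1": ["d1", "d2"], "q2": ["d3"]}

first_only = build_training_pairs(queries, corpus, relevant_docs)
all_pairs = build_training_pairs(queries, corpus, relevant_docs, use_all_docs=True)
```

With use_all_docs=True the query q1 contributes two pairs instead of one, so the number of batches per epoch grows accordingly.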

Concept: Warmup Steps

Warmup steps are automatically calculated as 10% of total training steps:

warmup_steps = int(len(data_loader) * epochs * 0.1)

During warmup, the learning rate gradually increases from zero to the target learning rate. This prevents destabilizing the pretrained weights with large initial gradient updates.
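
A short worked example of the warmup calculation follows. The helper names are hypothetical, and the schedule holds the learning rate constant after warmup as a simplification; the scheduler actually used in training may decay it afterwards.

```python
def warmup_steps(batches_per_epoch, epochs, warmup_fraction=0.1):
    """Mirror of the formula above: 10% of all training steps, truncated."""
    return int(batches_per_epoch * epochs * warmup_fraction)

def learning_rate_at(step, target_lr, num_warmup_steps):
    """Linear ramp from 0 to target_lr, then constant (simplified)."""
    if num_warmup_steps > 0 and step < num_warmup_steps:
        return target_lr * (step / num_warmup_steps)
    return target_lr

# 1,000 training pairs at batch_size=10 give 100 batches per epoch.
steps = warmup_steps(batches_per_epoch=100, epochs=2)

# Assumed target learning rate, chosen only for illustration.
target_lr = 2e-5
schedule = [learning_rate_at(s, target_lr, steps) for s in (0, 10, 20)]
```

With the default epochs of 2, this run has 200 training steps in total, so the first 20 are warmup steps.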

Concept: Validation and Evaluation

When a val_dataset is provided, an InformationRetrievalEvaluator is created to measure retrieval quality during training. This evaluator:

Uses the validation queries, corpus, and relevance judgments
Computes standard information retrieval metrics (e.g., MRR, NDCG, MAP)
Runs at intervals defined by evaluation_steps

This allows monitoring for overfitting and selecting the best checkpoint.
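
For intuition about what such an evaluator reports, the sketch below computes mean reciprocal rank (MRR) by hand. It is a simplified stand-in, not the InformationRetrievalEvaluator itself, and the function names and toy rankings are assumptions.

```python
def reciprocal_rank(ranked_doc_ids, relevant_doc_ids):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_doc_ids:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(rankings, relevant_docs):
    """Average reciprocal rank over all validation queries."""
    scores = [
        reciprocal_rank(rankings[query_id], set(relevant_docs[query_id]))
        for query_id in rankings
    ]
    return sum(scores) / len(scores)

# Retrieved document ids per query, best match first (made-up results).
rankings = {"q1": ["d2", "d1", "d3"], "q2": ["d3", "d1", "d2"]}
relevant_docs = {"q1": ["d1"], "q2": ["d3"]}

mrr = mean_reciprocal_rank(rankings, relevant_docs)  # (1/2 + 1/1) / 2 = 0.75
```

A validation MRR that stops improving or starts falling while training loss keeps decreasing is a typical sign of overfitting.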

Concept: Checkpoint Management

For long-running finetuning jobs, checkpointing is controlled by the following parameters:

save_checkpoints -- Enable/disable checkpoint saving
checkpoint_save_steps -- Save a checkpoint every N training steps
checkpoint_save_total_limit -- Maximum number of checkpoints to keep (0 = unlimited)
resume_from_checkpoint -- Resume training from the latest checkpoint
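
How these parameters interact can be sketched with a small simulation. The helpers below are hypothetical and only model the bookkeeping (which steps produce checkpoints, which are retained, and which one a resume would pick); they assume the oldest checkpoints are discarded first and do not touch the filesystem.

```python
def checkpoint_steps(total_steps, save_every):
    """Training steps at which a checkpoint would be written."""
    return list(range(save_every, total_steps + 1, save_every))

def checkpoints_to_keep(saved_steps, save_total_limit):
    """Retained checkpoints, oldest first. A limit of 0 keeps everything."""
    ordered = sorted(saved_steps)
    if save_total_limit <= 0:
        return ordered
    return ordered[-save_total_limit:]

def latest_checkpoint(saved_steps):
    """Step of the most recent checkpoint, or None if there is none."""
    return max(saved_steps) if saved_steps else None

saved = checkpoint_steps(total_steps=200, save_every=50)
kept_limited = checkpoints_to_keep(saved, save_total_limit=2)
kept_unlimited = checkpoints_to_keep(saved, save_total_limit=0)
resume_step = latest_checkpoint(kept_limited)
```

In this example four checkpoints are written, a limit of 2 retains those at steps 150 and 200, and resuming would continue from step 200.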

Knowledge Sources

LlamaIndex Embedding Finetuning Guide
Sentence Transformers Training Overview
MultipleNegativesRankingLoss

Metadata

Machine Learning, Embeddings, Finetuning, Contrastive Learning, LlamaIndex

Implementation:Run_llama_Llama_index_SentenceTransformersFinetuneEngine_Init
Heuristic:Run_llama_Llama_index_Finetuning_Warmup_Steps

2026-02-11 00:00 GMT
