Implementation:Run llama Llama index CrossEncoderFinetuneEngine
Overview
The CrossEncoderFinetuneEngine provides fine-tuning capabilities for cross-encoder models using the sentence-transformers library. Cross-encoders are used for reranking tasks, where query-document pairs are jointly encoded to produce a relevance score. This module resides in the llama-index-finetuning package under the cross_encoders submodule.
Source file: llama-index-finetuning/llama_index/finetuning/cross_encoders/cross_encoder.py (131 lines)
Dependencies
| Dependency | Purpose |
|---|---|
| `sentence_transformers.InputExample` | Data container for training pairs (imported at runtime) |
| `sentence_transformers.cross_encoder.CrossEncoder` | The underlying cross-encoder model class (imported at runtime) |
| `sentence_transformers.cross_encoder.evaluation.CEBinaryClassificationEvaluator` | Evaluator for binary classification during training (imported at runtime) |
| `torch.utils.data.DataLoader` | Batched data loading for training (imported at runtime) |
| `llama_index.core.postprocessor.SentenceTransformerRerank` | LlamaIndex reranker postprocessor returned by `get_finetuned_model` |
| `llama_index.finetuning.cross_encoders.dataset_gen.CrossEncoderFinetuningDatasetSample` | Data class for training and validation samples |
| `llama_index.finetuning.types.BaseCrossEncoderFinetuningEngine` | Abstract base class defining the fine-tuning engine interface |
All sentence_transformers and torch dependencies are imported inside the constructor to provide a clear error message if they are not installed.
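The runtime-import pattern described above can be sketched in isolation. This is a minimal illustration, not the engine's actual code: the helper `require_module`, its `pip_name` parameter, the error wording, and the package name `hypothetical_missing_pkg` are all assumptions made for the example.

```python
import importlib
from types import ModuleType
from typing import Optional


def require_module(name: str, pip_name: Optional[str] = None) -> ModuleType:
    """Import `name` at call time; raise a descriptive ImportError if it is missing.

    Hypothetical helper: the real constructor performs its imports inline.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        package = pip_name or name
        raise ImportError(
            f"Cannot import {name}. Please install it with `pip install {package}`."
        ) from exc


# A module that is available imports normally.
math_module = require_module("math")

# A module that is not installed produces the descriptive message instead.
try:
    require_module("hypothetical_missing_pkg", pip_name="hypothetical-missing-pkg")
except ImportError as err:
    message = str(err)
```

Deferring the import to the point of use means the rest of the package can be imported without the optional dependencies, and the failure surfaces only when the engine is actually constructed.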
Class: CrossEncoderFinetuneEngine
Inherits from: BaseCrossEncoderFinetuningEngine
Constructor
def __init__(
    self,
    dataset: List[CrossEncoderFinetuningDatasetSample],
    model_id: str = "cross-encoder/ms-marco-MiniLM-L-12-v2",
    model_output_path: str = "exp_finetune",
    batch_size: int = 10,
    val_dataset: Union[List[CrossEncoderFinetuningDatasetSample], None] = None,
    loss: Union[Any, None] = None,
    epochs: int = 2,
    show_progress_bar: bool = True,
    evaluation_steps: int = 50,
) -> None
| Parameter | Type | Default | Description |
|---|---|---|---|
| `dataset` | `List[CrossEncoderFinetuningDatasetSample]` | required | List of training samples containing query, context, and score |
| `model_id` | `str` | `"cross-encoder/ms-marco-MiniLM-L-12-v2"` | HuggingFace model identifier for the cross-encoder |
| `model_output_path` | `str` | `"exp_finetune"` | Directory path where the fine-tuned model will be saved |
| `batch_size` | `int` | `10` | Batch size for the training DataLoader |
| `val_dataset` | `Union[List[...], None]` | `None` | Optional validation dataset for evaluation during training |
| `loss` | `Union[Any, None]` | `None` | Custom loss function (stored but not directly used in the fit call) |
| `epochs` | `int` | `2` | Number of training epochs |
| `show_progress_bar` | `bool` | `True` | Whether to display a progress bar during training |
| `evaluation_steps` | `int` | `50` | Number of steps between evaluation runs (if evaluator is set) |
Initialization behavior:
- Attempts to import `sentence_transformers` and `torch`; raises `ImportError` if unavailable.
- Initializes a `CrossEncoder` model with `num_labels=1` (regression-style scoring).
- Converts each `CrossEncoderFinetuningDatasetSample` into a `sentence_transformers.InputExample` with `texts=[query, context]` and `label=score`.
- Creates a `DataLoader` from the converted examples.
- If a validation dataset is provided, creates a `CEBinaryClassificationEvaluator` from the validation samples.
- Computes warmup steps as 10% of total training steps: `int(len(loader) * epochs * 0.1)`.
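The conversion and warmup arithmetic can be worked through without installing any dependencies. The sketch below uses a hypothetical `Sample` dataclass in place of `CrossEncoderFinetuningDatasetSample` and plain tuples in place of `InputExample`. It also assumes the `DataLoader` keeps the final partial batch (PyTorch's default), so its length is the ceiling of the sample count divided by the batch size.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """Hypothetical stand-in for CrossEncoderFinetuningDatasetSample."""

    query: str
    context: str
    score: int


def to_pairs(samples: List[Sample]) -> List[Tuple[List[str], int]]:
    """Mirror the InputExample layout: texts=[query, context], label=score."""
    return [([s.query, s.context], s.score) for s in samples]


def warmup_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Compute 10% of total training steps, as in int(len(loader) * epochs * 0.1)."""
    # Assumption: the loader does not drop the last partial batch,
    # so len(loader) == ceil(num_examples / batch_size).
    batches_per_epoch = math.ceil(num_examples / batch_size)
    return int(batches_per_epoch * epochs * 0.1)


samples = [Sample(f"question {i}", f"passage {i}", i % 2) for i in range(95)]
pairs = to_pairs(samples)

# 95 samples at batch size 10 give 10 batches per epoch, so 20 steps over
# 2 epochs and 2 warmup steps.
steps = warmup_steps(len(pairs), batch_size=10, epochs=2)
```

Because the result is truncated with `int`, very small datasets can end up with zero warmup steps; for example, 30 samples at the default batch size and epoch count give `int(3 * 2 * 0.1)`, which is 0.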
Method: finetune
def finetune(self, **train_kwargs: Any) -> None
Executes the fine-tuning process by calling the cross-encoder's fit method.
Workflow:
- Calls `self.model.fit()` with the configured DataLoader, epochs, warmup steps, output path, progress bar setting, evaluator, and evaluation steps.
- If the evaluator is `None`, explicitly calls `self.model.save(self.model_output_path)`. This is necessary because the `sentence-transformers` library's `fit` method does not automatically save the model when no evaluator is provided (see issue #2324).
Note: The **train_kwargs parameter is accepted but not forwarded to the fit call.
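The save-when-no-evaluator rule is easy to check with a stub. The following sketch replaces the real cross-encoder with a hypothetical `FakeCrossEncoder` that only records which methods were called; the free function `finetune` is likewise an illustration of the control flow, not the engine's method.

```python
from typing import Any, List, Optional


class FakeCrossEncoder:
    """Hypothetical stub that records calls instead of training a real model."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def fit(self, **kwargs: Any) -> None:
        self.calls.append("fit")

    def save(self, path: str) -> None:
        self.calls.append(f"save:{path}")


def finetune(model: FakeCrossEncoder, evaluator: Optional[Any], output_path: str) -> None:
    """Train, then save explicitly when no evaluator is configured."""
    model.fit(evaluator=evaluator, output_path=output_path)
    if evaluator is None:
        # Without an evaluator, fit() is described as not saving the model,
        # so the save has to happen here.
        model.save(output_path)


# No evaluator: fit is followed by an explicit save.
without_eval = FakeCrossEncoder()
finetune(without_eval, evaluator=None, output_path="exp_finetune")

# Evaluator present: saving is left to fit itself.
with_eval = FakeCrossEncoder()
finetune(with_eval, evaluator=object(), output_path="exp_finetune")
```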
Method: push_to_hub
def push_to_hub(self, repo_id: Any = None) -> None
Uploads the model and tokenizer to the HuggingFace Hub.
| Parameter | Type | Description |
|---|---|---|
| `repo_id` | `Any` | HuggingFace Hub repository identifier (e.g., `"username/model-name"`) |
Workflow:
- If `repo_id` is `None`, raises `ValueError`.
- Pushes the model via `self.model.model.push_to_hub(repo_id=repo_id)`.
- Pushes the tokenizer via `self.model.tokenizer.push_to_hub(repo_id=repo_id)`.
- If a `ValueError` is raised (e.g., due to missing HuggingFace credentials), re-raises with a descriptive message about HuggingFace CLI login.
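A rough sketch of this control flow, using a hypothetical `FakeHubObject` in place of the real model and tokenizer. The error messages are invented for illustration, and placing the `repo_id` check before the `try` block is an assumption rather than something the description above confirms.

```python
from typing import List, Optional


class FakeHubObject:
    """Hypothetical stub for an object that exposes push_to_hub."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.pushed: List[str] = []

    def push_to_hub(self, repo_id: str) -> None:
        if self.fail:
            raise ValueError("no credentials")
        self.pushed.append(repo_id)


def push_to_hub(
    model: FakeHubObject, tokenizer: FakeHubObject, repo_id: Optional[str] = None
) -> None:
    """Validate repo_id, push both artifacts, and re-raise with a login hint."""
    if repo_id is None:
        raise ValueError("No value provided for repo_id.")
    try:
        model.push_to_hub(repo_id=repo_id)
        tokenizer.push_to_hub(repo_id=repo_id)
    except ValueError as exc:
        # Illustrative wording only; the real message may differ.
        raise ValueError(
            "Push failed. Check that you are logged in with the HuggingFace CLI."
        ) from exc


# Successful push: both stubs record the repository identifier.
model, tokenizer = FakeHubObject(), FakeHubObject()
push_to_hub(model, tokenizer, repo_id="username/model-name")

# Failing push: the original error is wrapped with a login hint.
try:
    push_to_hub(FakeHubObject(fail=True), FakeHubObject(), repo_id="username/model-name")
except ValueError as err:
    login_error = str(err)
```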
Method: get_finetuned_model
def get_finetuned_model(
    self, model_name: str, top_n: int = 3
) -> SentenceTransformerRerank
Returns a LlamaIndex SentenceTransformerRerank postprocessor loaded from a specified model.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model_name` | `str` | required | Model identifier or path to load (can be a HuggingFace Hub repo ID or local path) |
| `top_n` | `int` | `3` | Number of top-ranked nodes the reranker will return |
Data Flow
CrossEncoderFinetuningDatasetSample (query, context, score)
|
v
InputExample (texts=[query, context], label=score)
|
v
DataLoader (batched examples)
|
v
CrossEncoder.fit() --> Saved model on disk
|
v
SentenceTransformerRerank (LlamaIndex postprocessor)
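The flow above might be exercised end to end roughly as follows. This is a sketch under several assumptions: `llama-index-finetuning`, `sentence-transformers`, and `torch` are installed, the import paths match the module locations listed on this page, and `CrossEncoderFinetuningDatasetSample` accepts `query`, `context`, and `score` as keyword arguments. The function name and sample texts are placeholders, and the function is defined but not called, since running it downloads a model and trains.

```python
def run_cross_encoder_finetuning() -> None:
    """End-to-end sketch: build samples, fine-tune, and load a reranker."""
    # Imports are kept inside the function so that defining it does not
    # require the optional dependencies to be installed.
    from llama_index.finetuning.cross_encoders.cross_encoder import (
        CrossEncoderFinetuneEngine,
    )
    from llama_index.finetuning.cross_encoders.dataset_gen import (
        CrossEncoderFinetuningDatasetSample,
    )

    # Placeholder training data: score 1 marks a relevant pair, 0 an irrelevant one.
    train_samples = [
        CrossEncoderFinetuningDatasetSample(
            query="What does a cross-encoder score?",
            context="A cross-encoder scores a query and a passage jointly.",
            score=1,
        ),
        CrossEncoderFinetuningDatasetSample(
            query="What does a cross-encoder score?",
            context="Bread is usually baked in an oven.",
            score=0,
        ),
    ]

    engine = CrossEncoderFinetuneEngine(
        dataset=train_samples,
        model_output_path="exp_finetune",
        batch_size=10,
        epochs=2,
    )
    engine.finetune()

    # Load the saved model from disk as a LlamaIndex reranker postprocessor.
    reranker = engine.get_finetuned_model(model_name="exp_finetune", top_n=3)
    print(type(reranker).__name__)
```

With no validation dataset, the engine has no evaluator, so the model is saved to `exp_finetune` by the explicit save step described under `finetune`; that is the path passed to `get_finetuned_model` here.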
See Also
- Run_llama_Llama_index_CrossEncoder_Dataset_Gen -- Dataset generation for cross-encoder fine-tuning
- Run_llama_Llama_index_CohereRerankerFinetuneEngine -- Cohere-based reranker fine-tuning