Implementation:Run llama Llama index CrossEncoderFinetuneEngine

Overview

The CrossEncoderFinetuneEngine provides fine-tuning capabilities for cross-encoder models using the sentence-transformers library. Cross-encoders are used for reranking tasks, where query-document pairs are jointly encoded to produce a relevance score. This module resides in the llama-index-finetuning package under the cross_encoders submodule.

Source file: llama-index-finetuning/llama_index/finetuning/cross_encoders/cross_encoder.py (131 lines)

Dependencies

Dependency	Purpose
`sentence_transformers.InputExample`	Data container for training pairs (imported at runtime)
`sentence_transformers.cross_encoder.CrossEncoder`	The underlying cross-encoder model class (imported at runtime)
`sentence_transformers.cross_encoder.evaluation.CEBinaryClassificationEvaluator`	Evaluator for binary classification during training (imported at runtime)
`torch.utils.data.DataLoader`	Batched data loading for training (imported at runtime)
`llama_index.core.postprocessor.SentenceTransformerRerank`	LlamaIndex reranker postprocessor returned by `get_finetuned_model`
`llama_index.finetuning.cross_encoders.dataset_gen.CrossEncoderFinetuningDatasetSample`	Data class for training and validation samples
`llama_index.finetuning.types.BaseCrossEncoderFinetuningEngine`	Abstract base class defining the fine-tuning engine interface

All sentence_transformers and torch dependencies are imported inside the constructor rather than at module level, so a missing installation surfaces as a clear ImportError when the engine is instantiated.
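The deferred-import pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the engine's actual code: the helper name require_optional and the wording of the error message are assumptions made for this example.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency, or fail with an actionable message.

    Hypothetical helper: the real constructor performs its imports inline.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with installation guidance instead of a bare import failure.
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: {install_hint}"
        ) from exc


# A module that is always present imports normally.
json_module = require_optional("json", "n/a (standard library)")

# A missing module produces the custom message.
try:
    require_optional("not_a_real_module_xyz", "pip install not-a-real-module-xyz")
except ImportError as err:
    missing_message = str(err)
```

Deferring the import this way keeps the rest of the package importable when the optional dependencies are absent.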

Class: CrossEncoderFinetuneEngine

Inherits from: BaseCrossEncoderFinetuningEngine

Constructor

def __init__(
    self,
    dataset: List[CrossEncoderFinetuningDatasetSample],
    model_id: str = "cross-encoder/ms-marco-MiniLM-L-12-v2",
    model_output_path: str = "exp_finetune",
    batch_size: int = 10,
    val_dataset: Union[List[CrossEncoderFinetuningDatasetSample], None] = None,
    loss: Union[Any, None] = None,
    epochs: int = 2,
    show_progress_bar: bool = True,
    evaluation_steps: int = 50,
) -> None

Parameter	Type	Default	Description
`dataset`	`List[CrossEncoderFinetuningDatasetSample]`	required	List of training samples containing query, context, and score
`model_id`	`str`	`"cross-encoder/ms-marco-MiniLM-L-12-v2"`	HuggingFace model identifier for the cross-encoder
`model_output_path`	`str`	`"exp_finetune"`	Directory path where the fine-tuned model will be saved
`batch_size`	`int`	`10`	Batch size for the training DataLoader
`val_dataset`	`Union[List[...], None]`	`None`	Optional validation dataset for evaluation during training
`loss`	`Union[Any, None]`	`None`	Custom loss function (stored but not directly used in the fit call)
`epochs`	`int`	`2`	Number of training epochs
`show_progress_bar`	`bool`	`True`	Whether to display a progress bar during training
`evaluation_steps`	`int`	`50`	Number of steps between evaluation runs (if evaluator is set)

Initialization behavior:

Attempts to import sentence_transformers and torch; raises ImportError if either is unavailable.
Initializes a CrossEncoder model from model_id with num_labels=1 (regression-style scoring).
Converts each CrossEncoderFinetuningDatasetSample into a sentence_transformers.InputExample with texts=[query, context] and label=score.
Creates a DataLoader from the converted examples, batched by batch_size.
If a validation dataset is provided, creates a CEBinaryClassificationEvaluator from the validation samples; otherwise the evaluator remains None.
Computes warmup steps as 10% of total training steps: int(len(loader) * epochs * 0.1), where len(loader) is the number of batches per epoch.
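The conversion and warmup arithmetic in the steps above can be traced in plain Python. This is a sketch under stated assumptions: Sample is a hypothetical stand-in for CrossEncoderFinetuningDatasetSample (with score assumed to be 0 or 1), tuples stand in for InputExample, and the batch count assumes the DataLoader keeps the final partial batch, which is its default behavior.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """Hypothetical stand-in for CrossEncoderFinetuningDatasetSample."""

    query: str
    context: str
    score: int  # assumed binary relevance label (0 or 1)


def to_pairs(samples: List[Sample]) -> List[Tuple[List[str], int]]:
    # Mirrors InputExample(texts=[query, context], label=score).
    return [([s.query, s.context], s.score) for s in samples]


def batches_per_epoch(num_examples: int, batch_size: int) -> int:
    # With the final partial batch kept, the loader length is the
    # ceiling of num_examples / batch_size.
    return math.ceil(num_examples / batch_size)


def warmup_steps(num_batches: int, epochs: int) -> int:
    # 10% of all training steps, truncated toward zero by int().
    return int(num_batches * epochs * 0.1)


samples = [Sample(f"question {i}", f"passage {i}", i % 2) for i in range(95)]
pairs = to_pairs(samples)
num_batches = batches_per_epoch(len(pairs), batch_size=10)
steps = warmup_steps(num_batches, epochs=2)
print(pairs[1], num_batches, steps)
```

With 95 samples and the default batch_size of 10, the loader has 10 batches; over 2 epochs that is 20 training steps, of which 2 are warmup steps.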

Method: finetune

def finetune(self, **train_kwargs: Any) -> None

Executes the fine-tuning process by calling the cross-encoder's fit method.

Workflow:

Calls self.model.fit() with the configured DataLoader, epochs, warmup steps, output path, progress bar setting, evaluator, and evaluation steps.
If the evaluator is None, explicitly calls self.model.save(self.model_output_path). This is necessary because the sentence-transformers library's fit method does not automatically save the model when no evaluator is provided (see issue #2324).

Note: The **train_kwargs parameter is accepted but not forwarded to the fit call, so extra keyword arguments passed to finetune have no effect on training.
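The control flow of finetune, including the explicit save when no evaluator is set, can be illustrated with a stub in place of the real model. This is a sketch only: RecordingModel and run_finetune are hypothetical names, and the keyword names passed to fit are assumed to match those accepted by the cross-encoder's fit method.

```python
from typing import Any, Dict, List, Optional


class RecordingModel:
    """Hypothetical stand-in for a CrossEncoder; it only records calls."""

    def __init__(self) -> None:
        self.fit_kwargs: Dict[str, Any] = {}
        self.saved_to: List[str] = []

    def fit(self, **kwargs: Any) -> None:
        self.fit_kwargs = kwargs

    def save(self, path: str) -> None:
        self.saved_to.append(path)


def run_finetune(
    model: Any,
    loader: Any,
    epochs: int,
    warmup_steps: int,
    output_path: str,
    evaluator: Optional[Any] = None,
    evaluation_steps: int = 50,
    show_progress_bar: bool = True,
) -> None:
    # Keyword names are an assumption about the fit signature.
    model.fit(
        train_dataloader=loader,
        epochs=epochs,
        warmup_steps=warmup_steps,
        output_path=output_path,
        show_progress_bar=show_progress_bar,
        evaluator=evaluator,
        evaluation_steps=evaluation_steps,
    )
    # Without an evaluator, fit is described as not saving the model,
    # so it is persisted explicitly.
    if evaluator is None:
        model.save(output_path)


without_eval = RecordingModel()
run_finetune(without_eval, loader=[], epochs=2, warmup_steps=2,
             output_path="exp_finetune")

with_eval = RecordingModel()
run_finetune(with_eval, loader=[], epochs=2, warmup_steps=2,
             output_path="exp_finetune", evaluator=object())
```

In this sketch only the run without an evaluator triggers the explicit save; the run with an evaluator relies on fit to write the model to output_path.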

Method: push_to_hub

def push_to_hub(self, repo_id: Any = None) -> None

Uploads the fine-tuned model and its tokenizer to the HuggingFace Hub.

Parameter	Type	Description
`repo_id`	`Any`	HuggingFace Hub repository identifier (e.g., `"username/model-name"`)

Workflow:

If repo_id is None, raises ValueError.
Pushes the model via self.model.model.push_to_hub(repo_id=repo_id).
Pushes the tokenizer via self.model.tokenizer.push_to_hub(repo_id=repo_id).
If a ValueError is raised (e.g., due to missing HuggingFace credentials), re-raises with a descriptive message about HuggingFace CLI login.
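The workflow above can be read as a guard followed by a try/except around the two push calls. The sketch below is one interpretation of that description, not the method's actual source: FakeHubObject and push_both are hypothetical, and the error messages are invented for illustration.

```python
from typing import Any, Optional


class FakeHubObject:
    """Hypothetical stand-in for an object that exposes push_to_hub."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.pushed_to: Optional[str] = None

    def push_to_hub(self, repo_id: str) -> None:
        if self.fail:
            # Simulates a failure such as missing credentials.
            raise ValueError("token not found")
        self.pushed_to = repo_id


def push_both(model: Any, tokenizer: Any, repo_id: Optional[str] = None) -> None:
    if repo_id is None:
        raise ValueError("repo_id must be provided, e.g. 'username/model-name'")
    try:
        model.push_to_hub(repo_id=repo_id)
        tokenizer.push_to_hub(repo_id=repo_id)
    except ValueError as exc:
        # Replace the low-level error with guidance the user can act on.
        raise ValueError(
            "Push failed; a HuggingFace CLI/Hub login may be required."
        ) from exc


model, tokenizer = FakeHubObject(), FakeHubObject()
push_both(model, tokenizer, "username/model-name")

try:
    push_both(FakeHubObject(fail=True), FakeHubObject(), "username/model-name")
except ValueError as err:
    login_message = str(err)
```

Both objects are pushed to the same repo_id, so the model weights and tokenizer files end up in one Hub repository.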

Method: get_finetuned_model

def get_finetuned_model(
    self, model_name: str, top_n: int = 3
) -> SentenceTransformerRerank

Returns a LlamaIndex SentenceTransformerRerank postprocessor that loads the model identified by model_name.

Parameter	Type	Default	Description
`model_name`	`str`	required	Model identifier or path to load (can be a HuggingFace Hub repo ID or local path)
`top_n`	`int`	`3`	Number of top-ranked nodes the reranker will return

Data Flow

CrossEncoderFinetuningDatasetSample (query, context, score)
    |
    v
InputExample (texts=[query, context], label=score)
    |
    v
DataLoader (batched examples)
    |
    v
CrossEncoder.fit() --> Saved model on disk
    |
    v
SentenceTransformerRerank (LlamaIndex postprocessor)
