Implementation:FlagOpen FlagEmbedding AbsEmbedderTrainingArguments

Overview

API documentation for the model, data, and training argument dataclasses used to fine-tune embedders, defined in FlagEmbedding/abc/finetune/embedder/AbsArguments.py.

AbsEmbedderTrainingArguments

from dataclasses import dataclass
from typing import Optional

from transformers import TrainingArguments


@dataclass
class AbsEmbedderTrainingArguments(TrainingArguments):
    negatives_cross_device: bool = False
    temperature: Optional[float] = 0.02
    fix_position_embedding: bool = False
    sentence_pooling_method: str = 'cls'  # one of: cls, mean, last_token
    normalize_embeddings: bool = True
    sub_batch_size: Optional[int] = None
    kd_loss_type: str = 'kl_div'  # one of: kl_div, m3_kd_loss
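
To make the roles of sentence_pooling_method, normalize_embeddings, and temperature more concrete, here is a minimal pure-Python sketch. It is not FlagEmbedding's implementation: the function names (pool, l2_normalize, scaled_scores, contrastive_loss) are hypothetical, padding and attention masks are ignored, and the loss is assumed to be a standard softmax cross-entropy over temperature-scaled similarity scores.

```python
import math


def pool(token_vectors, method="cls"):
    """Collapse per-token vectors into one sentence vector (no padding handled)."""
    if method == "cls":
        return list(token_vectors[0])       # first token
    if method == "last_token":
        return list(token_vectors[-1])      # final token
    if method == "mean":
        n = len(token_vectors)
        return [sum(column) / n for column in zip(*token_vectors)]
    raise ValueError(f"unknown pooling method: {method}")


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def scaled_scores(query_vec, passage_vecs, temperature=0.02):
    """Dot-product similarities divided by the temperature."""
    return [
        sum(q * p for q, p in zip(query_vec, passage)) / temperature
        for passage in passage_vecs
    ]


def contrastive_loss(scores, positive_index=0):
    """Softmax cross-entropy with the positive passage as the target class."""
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_z - scores[positive_index]
```

With unit-length embeddings the similarities lie in [-1, 1], so dividing by a small temperature such as 0.02 stretches them to [-50, 50] and makes the softmax much sharper.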

Import

from FlagEmbedding.abc.finetune.embedder.AbsArguments import (
    AbsEmbedderModelArguments,
    AbsEmbedderDataArguments,
    AbsEmbedderTrainingArguments,
)

AbsEmbedderModelArguments

model_name_or_path: str - Name or path of the pretrained model.
config_name: str - Name or path of the model config.
tokenizer_name: str - Name or path of the tokenizer.
cache_dir: str - Cache directory for downloaded models.
trust_remote_code: bool - Whether to trust remote code.
token: str - Authentication token for model access.

AbsEmbedderDataArguments

train_data: str - Path to training data.
train_group_size: int - Number of passages per query group. Default: 8.
query_max_len: int - Maximum query token length. Default: 32.
passage_max_len: int - Maximum passage token length. Default: 128.
knowledge_distillation: bool - Whether to use knowledge distillation scores.
same_dataset_within_batch: bool - Whether to ensure all examples in a batch come from the same dataset.
query_instruction_for_retrieval: str - Instruction prepended to queries during retrieval.
query_instruction_format: str - Format string for query instruction.
passage_instruction_for_retrieval: str - Instruction prepended to passages during retrieval.
shuffle_ratio: float - Ratio for shuffling training data.
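
The sketch below illustrates how train_group_size and the instruction fields might shape a single training example. It rests on two assumptions that this page does not confirm: that query_instruction_format is a format string with two positional placeholders (instruction first, then text), and that a group consists of one positive passage plus train_group_size - 1 negatives. The helper names apply_instruction and build_group are hypothetical.

```python
import random


def apply_instruction(text, instruction=None, instruction_format="{}{}"):
    """Prepend an instruction to the text using a two-placeholder format string."""
    if instruction is None:
        return text
    return instruction_format.format(instruction, text)


def build_group(positives, negatives, train_group_size=8, seed=0):
    """Return one positive followed by (train_group_size - 1) negatives."""
    rng = random.Random(seed)
    positive = rng.choice(positives)
    needed = train_group_size - 1
    if len(negatives) >= needed:
        sampled = rng.sample(negatives, needed)      # without replacement
    else:
        # Too few negatives: sample with replacement to fill the group.
        sampled = [rng.choice(negatives) for _ in range(needed)]
    return [positive] + sampled


query = apply_instruction(
    "what is dense retrieval?",
    instruction="Represent this query for retrieval: ",
)
group = build_group(["pos passage"], ["neg a", "neg b", "neg c"], train_group_size=4)
```

Here query carries the instruction as a prefix, and group holds four passages with the positive at index 0.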

I/O

Input: CLI arguments or a dictionary of parameter values.
Output: Configured dataclass instances for model, data, and training arguments.
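
As an illustration of the dictionary input path, the following standard-library sketch splits one flat dictionary across several dataclasses by field name. In practice this is the job of transformers.HfArgumentParser (parse_args_into_dataclasses for CLI arguments, parse_dict for dictionaries); the stand-in classes ModelArgs, DataArgs, and TrainArgs below are hypothetical, carry only a subset of the documented fields, and use placeholder defaults wherever this page states none.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ModelArgs:  # stand-in for AbsEmbedderModelArguments
    model_name_or_path: str
    trust_remote_code: bool = False  # placeholder default


@dataclass
class DataArgs:  # stand-in for AbsEmbedderDataArguments
    train_data: Optional[str] = None  # placeholder default
    train_group_size: int = 8
    query_max_len: int = 32
    passage_max_len: int = 128


@dataclass
class TrainArgs:  # stand-in for AbsEmbedderTrainingArguments
    temperature: Optional[float] = 0.02
    sentence_pooling_method: str = "cls"
    normalize_embeddings: bool = True


def parse_dict(values, dataclass_types):
    """Route each key to the dataclass that declares it; reject unknown keys."""
    remaining = dict(values)
    instances = []
    for cls in dataclass_types:
        names = {f.name for f in fields(cls) if f.init}
        kwargs = {k: remaining.pop(k) for k in list(remaining) if k in names}
        instances.append(cls(**kwargs))
    if remaining:
        raise ValueError(f"unused keys: {sorted(remaining)}")
    return tuple(instances)


model_args, data_args, training_args = parse_dict(
    {
        "model_name_or_path": "BAAI/bge-base-en-v1.5",
        "train_data": "./train.jsonl",
        "query_max_len": 64,
        "sentence_pooling_method": "mean",
    },
    (ModelArgs, DataArgs, TrainArgs),
)
```

Keys that are omitted fall back to the dataclass defaults, so data_args.passage_max_len stays at 128 and training_args.temperature at 0.02.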
