Implementation:FlagOpen FlagEmbedding LLM Reranker Instruction Modeling

From Leeroopedia


Knowledge Sources
Domains Reranking, Large_Language_Models, Instruction_Tuning
Last Updated 2026-02-09 00:00 GMT

Overview

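BiEncoderModel wraps an instruction-tuned causal LLM for reranker training. Despite the class name, each query-passage pair is processed jointly in a single prompt; the relevance score is the logit of the "Yes" token, and training applies a cross-entropy loss over each query's group of passages.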

Description

BiEncoderModel adapts instruction-tuned LLMs for reranking:

Architecture:

  • Processes query-passage pairs formatted with instruction prompts
  • Extracts the logit for the "Yes" token at the answer position
  • Uses this single logit as the relevance score
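To make the input format concrete, here is a minimal sketch of how one query-passage pair could be turned into a prompt. The helper names `build_prompt` and `build_training_text` are hypothetical, and the template is taken from the comment in the usage example on this page; the template used in the actual repository may differ.

```python
def build_prompt(query: str, passage: str) -> str:
    """Format one query-passage pair as an instruction prompt.

    Hypothetical template, modeled on the usage-example comment on this page.
    """
    return f"Query: {query}\nPassage: {passage}\nIs relevant?"


def build_training_text(query: str, passage: str, answer: str = "Yes") -> str:
    """Append the answer token whose logit is used as the relevance score."""
    return f"{build_prompt(query, passage)} {answer}"


prompt = build_prompt("what is AI", "AI is the study of intelligent agents.")
text = build_training_text("what is AI", "AI is the study of intelligent agents.")
print(text)
```

The tokenizer is assumed to add any special tokens such as BOS, so the template itself contains only the visible text.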

Training:

  • Groups passages by query (1 positive + N-1 negatives)
  • Applies cross-entropy loss over the scores in each group, with the positive passage as target class 0 (the first passage in the group)
  • Trains the model to assign higher "Yes" probability to relevant passages
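The grouping and loss described above can be sketched in plain Python, with lists standing in for tensors. This is an illustration under stated assumptions, not the repository's code: `grouped_cross_entropy` and `group_size` are hypothetical names, and the scores are assumed to arrive as a flat list in which each query's positive passage comes first in its group.

```python
import math


def grouped_cross_entropy(scores, group_size):
    """Mean cross-entropy over groups, with index 0 of each group as the target.

    scores: flat list of "Yes" logits, length num_queries * group_size.
    group_size: passages per query (1 positive followed by negatives).
    """
    if group_size <= 0 or len(scores) % group_size != 0:
        raise ValueError("scores must split evenly into groups of group_size")
    losses = []
    for start in range(0, len(scores), group_size):
        group = scores[start:start + group_size]
        # Numerically stable log-sum-exp over the group.
        top = max(group)
        log_z = top + math.log(sum(math.exp(s - top) for s in group))
        # Negative log-softmax of the positive passage (class 0).
        losses.append(log_z - group[0])
    return sum(losses) / len(losses)


good_group = [5.0, 1.0, 0.0, -1.0]   # positive scored highest
bad_group = [0.5, 2.0, 0.0, 0.0]     # a negative outscores the positive
print(grouped_cross_entropy(good_group, 4))
print(grouped_cross_entropy(bad_group, 4))
```

The loss shrinks as the positive's "Yes" logit rises relative to the negatives in the same group, which is what pushes relevant passages toward higher scores.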

Scoring mechanism:

  • Identifies the position of the answer in the sequence (via labels)
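  • Extracts the logits one position before the answer token (position - 1, the last non-label token), because a causal LM predicts each token from the position preceding it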
  • Takes the "Yes" token logit as the relevance score
  • Higher "Yes" logit = more relevant passage
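The scoring steps above can be sketched for a single sequence as follows. Plain nested lists stand in for the logits tensor, and the function name `yes_logit`, the toy vocabulary, and the token id are all hypothetical. The sketch assumes labels are -100 everywhere except the answer token, as in the usage example on this page.

```python
IGNORE_INDEX = -100  # label value for positions that are not the answer


def yes_logit(logits, labels, yes_token_id):
    """Return the "Yes" logit predicted for the answer position.

    logits: list of per-position lists, shape [seq_len][vocab_size].
    labels: list of length seq_len, IGNORE_INDEX except at the answer token.
    """
    answer_pos = next(
        (i for i, label in enumerate(labels) if label != IGNORE_INDEX), None
    )
    if answer_pos is None or answer_pos == 0:
        raise ValueError("labels must mark an answer token after position 0")
    # A causal LM predicts token t from position t - 1.
    return logits[answer_pos - 1][yes_token_id]


# Toy example: 4 positions, vocabulary of 3 tokens, "Yes" has id 2.
toy_logits = [
    [0.0, 0.0, 0.0],
    [0.3, 0.1, 0.2],
    [0.1, 0.2, 3.5],  # position 2 predicts the answer at position 3
    [0.9, 0.9, 0.9],
]
toy_labels = [IGNORE_INDEX, IGNORE_INDEX, IGNORE_INDEX, 2]
score = yes_logit(toy_logits, toy_labels, yes_token_id=2)
print(score)
```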

This approach leverages instruction-following capabilities of LLMs, teaching them to judge relevance through natural language ("Yes"/"No") rather than arbitrary scoring functions.

Usage

Use this class to fine-tune an instruction-tuned LLM as a reranker when you want to keep its instruction-following behavior and express relevance as an interpretable "Yes"/"No" judgment.

Code Reference

Source Location

Signature

class BiEncoderModel(nn.Module):
    def __init__(self, model: None, tokenizer: AutoTokenizer = None,
                 train_batch_size: int = 4)

    def encode(self, features)
    def forward(self, pair: Union[Dict[str, Tensor], List[Dict[str, Tensor]]])

Import

from research.llm_reranker.finetune_for_instruction.modeling import BiEncoderModel

I/O Contract

Inputs

Name | Type | Required | Description
model | PreTrainedModel | Yes | Instruction-tuned LLM (LLaMA, Mistral, etc.)
tokenizer | AutoTokenizer | Yes | Tokenizer with "Yes" token
train_batch_size | int | No | Batch size for grouping passages (default: 4)
pair | Dict/List[Dict] | Yes | Tokenized inputs with input_ids, attention_mask, labels, position_ids

Outputs

Name | Type | Description
loss | Tensor | Cross-entropy loss (training only)
scores | Tensor | "Yes" token logits for each query-passage pair

Usage Examples

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from research.llm_reranker.finetune_for_instruction.modeling import BiEncoderModel

# Initialize the base LLM and its tokenizer
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Wrap the base LLM so that forward() returns a grouped cross-entropy loss
model = BiEncoderModel(
    model=base_model,
    tokenizer=tokenizer,
    train_batch_size=4
)

# Training forward pass
# Input format: "[BOS]Query: what is AI\nPassage: AI is...\nIs relevant? Yes"
# pair_ids, pair_mask, labels, and position_ids are placeholders here; they are
# assumed to be tensors already prepared by the training data pipeline.
pair_inputs = {
    "input_ids": pair_ids,          # [batch_size * group_size, seq_len]
    "attention_mask": pair_mask,
    "labels": labels,               # -100 everywhere except last "Yes" token
    "position_ids": position_ids
}

outputs = model(pair=pair_inputs)
loss = outputs.loss  # Cross-entropy comparing positive vs negatives
loss.backward()

# Inference
model.eval()
with torch.no_grad():  # no gradients are needed for scoring
    scores = model.encode(pair_inputs)  # [num_pairs] "Yes" token logits
    # Higher score = more relevant
    print(f"Relevance scores: {scores}")
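Since a higher score means a more relevant passage, reranking reduces to sorting passages by their "Yes" logits. The sketch below assumes the scores have already been converted to a plain Python list of floats (for example with `scores.tolist()`); `rank_passages` is a hypothetical helper, not part of the repository.

```python
def rank_passages(passages, scores):
    """Return (passage, score) pairs sorted from most to least relevant."""
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(passages[i], scores[i]) for i in order]


candidates = ["passage a", "passage b", "passage c"]
example_scores = [0.4, 2.1, -1.3]  # illustrative values, not real model output
ranked = rank_passages(candidates, example_scores)
for passage, score in ranked:
    print(f"{score:+.2f}  {passage}")
```

The raw logits are only meaningful relative to each other for the same query, so the sketch compares them directly rather than applying a threshold.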
