Implementation: ContextualAI HALOs AutoModelForBradleyTerry.from_pretrained
| Knowledge Sources | Details |
|---|---|
| Domains | Deep_Learning, NLP, Reinforcement_Learning |
| Last Updated | 2026-02-08 03:00 GMT |
Overview
A concrete tool for initializing a binary classification reward model, provided by the AutoModelForBradleyTerry wrapper class.
Description
AutoModelForBradleyTerry is a wrapper around HuggingFace's AutoModelForSequenceClassification that enforces num_labels=2 for binary classification. It overrides from_pretrained() to force the binary classification head regardless of the config, and save_pretrained() to maintain this configuration when saving. It also ensures the padding token is correctly configured.
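The override pattern described above can be sketched without the transformers dependency. This is an illustrative sketch, not the HALOs source: `BaseAutoModel`, `FakeConfig`, and `BinaryWrapper` are hypothetical stand-ins for AutoModelForSequenceClassification, its config, and the wrapper.

```python
# Hypothetical stand-ins for the HuggingFace classes; names are illustrative.
class FakeConfig:
    def __init__(self, num_labels=1, pad_token_id=None, eos_token_id=2):
        self.num_labels = num_labels
        self.pad_token_id = pad_token_id
        self.eos_token_id = eos_token_id

class BaseAutoModel:
    """Stand-in for AutoModelForSequenceClassification."""
    @classmethod
    def from_pretrained(cls, name_or_path, *args, **kwargs):
        model = cls()
        model.config = FakeConfig(num_labels=kwargs.get("num_labels", 1))
        return model

class BinaryWrapper(BaseAutoModel):
    """Force a 2-label head regardless of what the caller passes."""
    @classmethod
    def from_pretrained(cls, name_or_path, *args, **kwargs):
        kwargs["num_labels"] = 2  # override any user-supplied value
        model = super().from_pretrained(name_or_path, *args, **kwargs)
        if model.config.pad_token_id is None:  # mirror the pad-token fixup
            model.config.pad_token_id = model.config.eos_token_id
        return model

m = BinaryWrapper.from_pretrained("dummy/model", num_labels=5)
print(m.config.num_labels)    # 2 (the requested 5 was overridden)
print(m.config.pad_token_id)  # 2 (falls back to eos_token_id)
```

The design point is that the override lives entirely in `from_pretrained`, so callers can use the wrapper as a drop-in replacement for the base auto class.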
Usage
Used internally by BradleyTerryTrainer as the policy_hf_model_class. The model is loaded via Hydra config with loss=bradley-terry.
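The `loss=bradley-terry` selection above is a Hydra command-line override. A launch might look roughly like the following sketch; only the `loss=bradley-terry` override comes from this document, while the script name and any other overrides are assumptions for illustration.

```shell
# Hypothetical launch; train.py and any additional overrides are illustrative.
python train.py loss=bradley-terry
```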
Code Reference
Source Location
- Repository: ContextualAI/HALOs
- File: train/models.py
- Lines: L553-618
Signature
class AutoModelForBradleyTerry(AutoModelForSequenceClassification):
    """Wrapper ensuring binary classification (num_labels=2)."""

    @classmethod
    def from_pretrained(
        cls,
        pretrained_model_name_or_path: Union[str, PreTrainedModel],
        *model_args,
        **kwargs,
    ) -> PreTrainedModel:
        """Load pretrained model with forced num_labels=2.

        Args:
            pretrained_model_name_or_path: HuggingFace model ID or local path
            *model_args: Additional positional args for __init__
            **kwargs: Additional keyword args (num_labels forced to 2)

        Returns:
            PreTrainedModel with binary classification head
        """

    def save_pretrained(
        self,
        save_directory: str,
        is_main_process: bool = True,
        state_dict: Optional[dict] = None,
        save_function: callable = torch.save,
        **kwargs,
    ):
        """Save with num_labels=2 and pad_token_id preserved."""
Import
from train.models import AutoModelForBradleyTerry
import torch

model = AutoModelForBradleyTerry.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    torch_dtype=torch.bfloat16,
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| pretrained_model_name_or_path | str | Yes | HuggingFace model ID or local checkpoint path |
| torch_dtype | torch.dtype | No | Model precision (e.g., torch.bfloat16) |
| attn_implementation | str | No | Attention implementation ('flash_attention_2', 'eager') |
Outputs
| Name | Type | Description |
|---|---|---|
| model | PreTrainedModel | Sequence classification model with 2-label head |
| model.config.num_labels | int | Always 2 |
| model.config.pad_token_id | int | Set to eos_token_id if not configured |
Usage Examples
Loading a Reward Model
from train.models import AutoModelForBradleyTerry
import torch

# Initialize Bradley-Terry model from a pre-trained LLM
model = AutoModelForBradleyTerry.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

# The model now has a 2-class classification head
print(model.config.num_labels)  # 2
Loading from a Trained Checkpoint
# Load a previously trained reward model
model = AutoModelForBradleyTerry.from_pretrained(
    "/models/llama3-8B-bt/FINAL",
    torch_dtype=torch.bfloat16,
)
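A reward model like this is trained with the Bradley-Terry objective, which maximizes the probability that the chosen response outranks the rejected one. The following is a minimal numeric sketch of that objective in pure Python, independent of the HALOs code; the function name is illustrative.

```python
import math

def bradley_terry_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood that `chosen` beats `rejected`:
    -log sigmoid(r_chosen - r_rejected)."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A larger margin between chosen and rejected rewards gives a lower loss.
print(round(bradley_terry_loss(2.0, 0.0), 4))  # 0.1269
print(round(bradley_terry_loss(0.0, 2.0), 4))  # 2.1269
```

When the two rewards are equal the loss is log 2 (the model assigns a 50/50 preference); training pushes the chosen reward above the rejected one.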