Principle:Princeton nlp SimPO Chat Template Application
| Knowledge Sources | |
|---|---|
| Domains | NLP, Data_Preprocessing |
| Last Updated | 2026-02-08 04:30 GMT |
Overview
A text formatting step that converts structured message lists into model-specific prompt strings for preference optimization training.
Description
Chat template application transforms OpenAI-format message lists (with role and content keys) into the specific text format expected by each model's tokenizer. For SimPO training, this step must handle three distinct text segments: the prompt (all turns except the last), the chosen response (preferred final turn), and the rejected response (dispreferred final turn). A critical detail in SimPO's template application is BOS token stripping — the beginning-of-sequence token is removed from chosen and rejected responses to prevent double-BOS when the prompt already ends with one. The system also supports model-specific templates (e.g., Mistral's custom template) and optional empty system message insertion for models whose templates expect a system turn.
Usage
Use this principle immediately after dataset loading and before feeding data to the SimPO trainer. The chat template must match the target model's expected format. This is a necessary preprocessing step for any preference optimization pipeline.
Theoretical Basis
Chat templates follow the structural formatting principle for instruction-tuned models:
- Message decomposition — Split the conversation into prompt (N-1 turns) and response (final turn)
- Template application — Apply the model's Jinja2 chat template to convert messages to text
- BOS deduplication — Strip leading BOS tokens from response text to avoid double-BOS artifacts
- System message insertion — Optionally prepend an empty system message if the template expects one
Pseudo-code:
# Abstract algorithm (NOT real implementation)
prompt_messages = conversation[:-1] # All turns except last
chosen_message = chosen[-1:] # Last turn only
rejected_message = rejected[-1:] # Last turn only
text_prompt = tokenizer.apply_chat_template(prompt_messages)
text_chosen = tokenizer.apply_chat_template(chosen_message)
text_rejected = tokenizer.apply_chat_template(rejected_message)
# Strip BOS from responses (SimPO-specific)
if text_chosen.startswith(BOS):
text_chosen = text_chosen[len(BOS):]
if text_rejected.startswith(BOS):
text_rejected = text_rejected[len(BOS):]