Heuristic:Princeton nlp SimPO Left Truncation Strategy
| Knowledge Sources | |
|---|---|
| Domains | NLP, Tokenization, Preference_Optimization |
| Last Updated | 2026-02-08 05:00 GMT |
Overview
Use left-side truncation in preference optimization so that the response labels in the final turn survive when a sequence exceeds the length limit.
Description
In preference optimization tasks (SimPO, DPO, etc.), the training signal comes from the response portion (chosen/rejected), not the prompt. When the combined prompt and response exceed the maximum sequence length, truncating from the right (the tokenizer default) removes response tokens and destroys the training signal. The SimPO codebase explicitly sets `truncation_side = "left"` and uses a two-stage truncation strategy: first truncate the prompt to `max_prompt_length` (keeping either its start or its end, depending on `truncation_mode`), then truncate the response to `max_length - max_prompt_length` tokens only if the sequence is still too long.
Usage
Use this heuristic when configuring tokenizer settings for any preference optimization training. Apply it whenever you see sequence length warnings or need to handle long prompts.
The Insight (Rule of Thumb)
- Action: Set `data_args.truncation_side = "left"` before initializing the tokenizer.
- Action: Set `truncation_mode = "keep_end"` (default) in `SimPOConfig` to keep the most recent context of multi-turn prompts.
- Value: `max_prompt_length: 1800` and `max_length: 2048` in training configs, which reserves 2048 - 1800 = 248 tokens for the response when the prompt reaches its cap.
- Trade-off: Early conversation context may be lost for very long prompts, but the response quality signal is preserved.
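
The budget arithmetic behind these values can be checked with a short sketch. `response_budget` is a hypothetical helper introduced here for illustration; it is not part of the SimPO codebase.

```python
# Hypothetical helper (an assumption for illustration, not SimPO code).
def response_budget(max_length: int, max_prompt_length: int) -> int:
    """Return the tokens left for the response once the prompt is capped."""
    if max_prompt_length >= max_length:
        raise ValueError("max_prompt_length must be smaller than max_length")
    return max_length - max_prompt_length


# Values taken from the training configs quoted above.
budget = response_budget(max_length=2048, max_prompt_length=1800)
print(budget)  # 248
```

Raising `max_prompt_length` keeps more context but shrinks this budget, so the two values are best tuned together.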
Reasoning
Preference optimization labels are attached to the response tokens (the last turn). If truncation removes these tokens, the model receives no training signal. Left truncation ensures that when prompts are too long, the beginning of the prompt is removed rather than the end where the response is located. The two-stage truncation (prompt first, then response) maximizes the information retained from both prompt and response.
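
A toy illustration of this argument, using plain Python lists rather than a real tokenizer. The token values, the `truncate` and `supervised_tokens` helpers, and the use of `-100` as the ignored label value are assumptions made for this sketch; it only mimics the effect of the truncation side on a concatenated prompt and response.

```python
IGNORE_INDEX = -100  # assumed label value for positions excluded from the loss


def truncate(seq, max_length, side):
    """Cut seq to max_length items, dropping from the given side."""
    if len(seq) <= max_length:
        return list(seq)
    if side == "left":
        return list(seq[-max_length:])  # drop the oldest items
    return list(seq[:max_length])       # drop the newest items


def supervised_tokens(labels):
    """Count positions that still carry a training label."""
    return sum(1 for label in labels if label != IGNORE_INDEX)


prompt_ids = list(range(100, 110))    # 10 toy prompt tokens
response_ids = list(range(200, 204))  # 4 toy response tokens

# Prompt positions are masked; only response positions carry labels.
labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids

right = truncate(labels, 8, "right")
left = truncate(labels, 8, "left")

print(supervised_tokens(right))  # 0: every response label was cut off
print(supervised_tokens(left))   # 4: all response labels survive
```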
Code evidence from `scripts/run_simpo.py:173`:

    data_args.truncation_side = "left"  # Truncate from left to ensure we don't lose labels in final turn

Code evidence from `scripts/simpo_trainer.py:429-444`:

    # if combined sequence is too long, truncate the prompt
    for answer_tokens in [chosen_tokens, rejected_tokens, prompt_tokens]:
        if len(answer_tokens["prompt_input_ids"]) + longer_response_length > self.max_length:
            if self.truncation_mode == "keep_start":
                for k in ["prompt_input_ids", "prompt_attention_mask"]:
                    answer_tokens[k] = answer_tokens[k][: self.max_prompt_length]
            elif self.truncation_mode == "keep_end":
                for k in ["prompt_input_ids", "prompt_attention_mask"]:
                    answer_tokens[k] = answer_tokens[k][-self.max_prompt_length :]

    # if that's still too long, truncate the response
    for answer_tokens in [chosen_tokens, rejected_tokens]:
        if len(answer_tokens["prompt_input_ids"]) + longer_response_length > self.max_length:
            for k in ["input_ids", "attention_mask"]:
                answer_tokens[k] = answer_tokens[k][: self.max_length - self.max_prompt_length]
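
The excerpt above depends on trainer state, so here is a minimal standalone sketch of the same two-stage logic on plain lists of token ids. The function name `two_stage_truncate`, its signature, and the explicit error for an unknown mode are assumptions made for illustration; attention masks are omitted for brevity.

```python
def two_stage_truncate(prompt_ids, chosen_ids, rejected_ids,
                       max_length, max_prompt_length,
                       truncation_mode="keep_end"):
    """Hypothetical standalone version of the two-stage truncation."""
    longer_response_length = max(len(chosen_ids), len(rejected_ids))

    # Stage 1: if the combined sequence is too long, truncate the prompt.
    if len(prompt_ids) + longer_response_length > max_length:
        if truncation_mode == "keep_start":
            prompt_ids = prompt_ids[:max_prompt_length]
        elif truncation_mode == "keep_end":
            prompt_ids = prompt_ids[-max_prompt_length:]
        else:
            raise ValueError(f"Unknown truncation mode: {truncation_mode}")

    # Stage 2: if that is still too long, truncate both responses.
    if len(prompt_ids) + longer_response_length > max_length:
        cap = max_length - max_prompt_length
        chosen_ids = chosen_ids[:cap]
        rejected_ids = rejected_ids[:cap]

    return prompt_ids, chosen_ids, rejected_ids


# Toy sizes: a 20-token prompt with 6- and 9-token responses.
prompt, chosen, rejected = two_stage_truncate(
    prompt_ids=list(range(20)),
    chosen_ids=list(range(100, 106)),
    rejected_ids=list(range(200, 209)),
    max_length=16,
    max_prompt_length=12,
)
print(len(prompt), len(chosen), len(rejected))  # 12 4 4
print(prompt[0])  # 8: with keep_end, the first 8 prompt tokens are dropped
```

Note that the second stage caps each response at `max_length - max_prompt_length` tokens even when the prompt is shorter than `max_prompt_length`, which mirrors the slicing in the excerpt.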