
Heuristic:Huggingface Alignment handbook EOS Token Alignment

From Leeroopedia

Knowledge Sources
Domains LLMs, Debugging
Last Updated 2026-02-07 00:00 GMT

Overview

After training, align the model's generation_config.eos_token_id with the tokenizer's eos_token_id to prevent unbounded generation in inference pipelines.

Description

The alignment-handbook's SFT script explicitly sets the model's generation config EOS token to match the tokenizer after training. This is critical because some models (especially those with custom chat templates) may have a mismatch between the model's default EOS token and the tokenizer's EOS token. Without this alignment, the model may fail to stop generating in HuggingFace's pipeline() function.

Usage

Apply this after every SFT training run, before saving the model. The alignment-handbook's sft.py script already handles it automatically, but it must be done explicitly when building custom training pipelines or modifying the save logic.
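A minimal sketch of the alignment step as a reusable helper (illustrative, not the alignment-handbook API; the stand-in objects below mimic the `generation_config`/`config`/`eos_token_id` attributes of a transformers model and tokenizer):

```python
from types import SimpleNamespace

def align_eos_token(model, tokenizer):
    """Copy the tokenizer's EOS token id into both model configs before saving."""
    model.generation_config.eos_token_id = tokenizer.eos_token_id
    model.config.eos_token_id = tokenizer.eos_token_id

# Stand-ins illustrating the mismatch a chat-tuned model can have:
model = SimpleNamespace(
    generation_config=SimpleNamespace(eos_token_id=2),  # base model's EOS
    config=SimpleNamespace(eos_token_id=2),
)
tokenizer = SimpleNamespace(eos_token_id=32000)         # custom chat EOS
align_eos_token(model, tokenizer)
```

In a real pipeline the call would sit immediately before `model.save_pretrained(...)` (or `trainer.save_model(...)`), as in the handbook's script.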

The Insight (Rule of Thumb)

  • Action: After training, explicitly set `model.generation_config.eos_token_id = tokenizer.eos_token_id` and `model.config.eos_token_id = tokenizer.eos_token_id` before saving.
  • Value: Prevents infinite generation loops in inference.
  • Trade-off: None. This is a zero-cost fix that prevents a critical inference bug.

Reasoning

Chat-tuned models often use custom EOS tokens (e.g., `<|im_end|>` for ChatML-style templates, `</s>` for Mistral). If the model's generation_config still points to the base model's EOS token, the pipeline will not stop generation at the correct point.
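Why the mismatch causes unbounded generation can be seen in a toy decode loop (a sketch, not transformers internals): the loop stops only when the id configured as EOS actually appears in the output stream.

```python
def toy_generate(next_ids, eos_token_id, max_new_tokens=8):
    """Toy decode loop: emit tokens until EOS or the token budget is hit."""
    out = []
    for tok in next_ids:
        out.append(tok)
        if tok == eos_token_id:
            break            # correct stop: configured EOS was generated
        if len(out) >= max_new_tokens:
            break            # safety valve: budget exhausted
    return out

# The chat-tuned model emits 32000 as its EOS, but generation_config
# still carries the base model's id 2, so the loop never sees "EOS":
stream = [5, 17, 32000, 9, 11, 4, 8, 6, 3]
print(toy_generate(stream, eos_token_id=2))      # runs to max_new_tokens
print(toy_generate(stream, eos_token_id=32000))  # stops at the chat EOS
```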

Code evidence from `scripts/sft.py:134-138`:

    # Align the model's generation config with the tokenizer's eos token
    # to avoid unbounded generation in the transformers `pipeline()` function
    trainer.model.generation_config.eos_token_id = tokenizer.eos_token_id
    trainer.model.config.eos_token_id = tokenizer.eos_token_id
    trainer.save_model(training_args.output_dir)

The comment in the code explicitly documents the motivation: avoid unbounded generation in the transformers pipeline() function.
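When modifying the save logic, a post-save sanity check can catch regressions. The sketch below is illustrative (the function name and stand-in objects are assumptions, not handbook code); note that `generation_config.eos_token_id` may legally be a single id or a list of ids, so the check accepts both:

```python
from types import SimpleNamespace

def check_eos_alignment(model, tokenizer):
    """Raise if the model's configs disagree with the tokenizer's EOS id."""
    gen_eos = model.generation_config.eos_token_id
    # generation_config.eos_token_id may be an int or a list of ints
    gen_ids = gen_eos if isinstance(gen_eos, (list, tuple)) else [gen_eos]
    if tokenizer.eos_token_id not in gen_ids:
        raise ValueError(
            f"generation_config eos {gen_eos!r} missing tokenizer eos "
            f"{tokenizer.eos_token_id!r}"
        )
    if model.config.eos_token_id != tokenizer.eos_token_id:
        raise ValueError("model.config.eos_token_id is out of sync")

# Aligned stand-ins pass silently; a mismatch would raise ValueError:
model = SimpleNamespace(
    generation_config=SimpleNamespace(eos_token_id=32000),
    config=SimpleNamespace(eos_token_id=32000),
)
tokenizer = SimpleNamespace(eos_token_id=32000)
check_eos_alignment(model, tokenizer)
```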
