Implementation:Huggingface Alignment handbook Setup Chat Format
| Knowledge Sources | |
|---|---|
| Domains | NLP, Preprocessing |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
A concrete tool, provided by the TRL library, for applying the ChatML format to a model and tokenizer when the tokenizer lacks a chat template.
Description
setup_chat_format is a TRL utility that configures a model-tokenizer pair with ChatML-style special tokens and a chat template. It does three things: it adds the special tokens (<|im_start|>, <|im_end|>) to the tokenizer, resizes the model's token embeddings to accommodate the new tokens, and sets the Jinja2 chat template on the tokenizer.
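The first two steps can be illustrated with a toy vocabulary and embedding table. This is a minimal pure-Python sketch of the idea, not TRL's implementation: the helper names `add_special_tokens` and `resize_embeddings`, the tiny vocabulary, and the random initialization of new rows are all assumptions made for illustration.

```python
import random

CHATML_TOKENS = ["<|im_start|>", "<|im_end|>"]


def add_special_tokens(vocab, new_tokens):
    """Append tokens that are not yet in the vocabulary; return how many were added."""
    added = 0
    for token in new_tokens:
        if token not in vocab:
            vocab[token] = len(vocab)  # next free id
            added += 1
    return added


def resize_embeddings(embeddings, new_size, dim, seed=0):
    """Return a copy of the table grown to new_size rows (hypothetical init scheme)."""
    rng = random.Random(seed)
    grown = [row[:] for row in embeddings]
    while len(grown) < new_size:
        grown.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return grown


# Toy stand-ins for a tokenizer vocabulary and a model embedding matrix.
vocab = {"<s>": 0, "</s>": 1, "hello": 2}
embeddings = [[0.0] * 4 for _ in vocab]

num_added = add_special_tokens(vocab, CHATML_TOKENS)
embeddings = resize_embeddings(embeddings, len(vocab), dim=4)
```

The point of the sketch is the ordering: the embedding table must grow after the tokens are added, so that every new token id has a row to look up.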
In the alignment-handbook, this function is called as a fallback in the SFT script when the loaded tokenizer has no chat template (tokenizer.chat_template is None).
Usage
Use this when loading a base model that does not have a chat template defined. Most instruction-tuned models already have templates, but base models (e.g., "mistralai/Mistral-7B-v0.1") typically do not.
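The fallback pattern can be sketched without loading a real model. In the example below, `ensure_chat_template` and `fake_setup` are hypothetical names, and `SimpleNamespace` objects stand in for tokenizers; in real use the `setup_fn` argument would be TRL's `setup_chat_format`.

```python
from types import SimpleNamespace


def ensure_chat_template(model, tokenizer, setup_fn):
    """Call setup_fn only when the tokenizer has no chat template."""
    if getattr(tokenizer, "chat_template", None) is None:
        return setup_fn(model, tokenizer)
    return model, tokenizer


def fake_setup(model, tokenizer):
    """Stand-in for setup_chat_format (assumption: it sets a template and returns both)."""
    tokenizer.chat_template = "<chatml template placeholder>"
    return model, tokenizer


# A base-model tokenizer without a template, and a chat-model tokenizer with one.
base_tok = SimpleNamespace(chat_template=None)
chat_tok = SimpleNamespace(chat_template="{{ existing }}")

_, base_tok = ensure_chat_template(object(), base_tok, fake_setup)
_, chat_tok = ensure_chat_template(object(), chat_tok, fake_setup)
```

Guarding on `chat_template is None` leaves an instruction-tuned model's own template untouched, so only base models receive the ChatML fallback.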
Code Reference
Source Location
- Repository: alignment-handbook
- File: scripts/sft.py (lines 98-100)
- Definition: External TRL library
Signature
def setup_chat_format(
    model: AutoModelForCausalLM,
    tokenizer: PreTrainedTokenizer,
    format: str = "chatml",
    resize_to_multiple_of: Optional[int] = None,
) -> tuple[AutoModelForCausalLM, PreTrainedTokenizer]:
    """Set up chat format for model and tokenizer.

    Args:
        model: The model to configure.
        tokenizer: The tokenizer to configure.
        format: Chat format to use (default: "chatml").
        resize_to_multiple_of: Resize embeddings to multiple of this value.

    Returns:
        Tuple of (model, tokenizer) with chat format applied.
    """
Import
from trl import setup_chat_format
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | AutoModelForCausalLM | Yes | Model to add chat tokens to (embeddings will be resized) |
| tokenizer | PreTrainedTokenizer | Yes | Tokenizer to add chat template and special tokens to |
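| format | str | No | Chat format name (default: "chatml") |
| resize_to_multiple_of | Optional[int] | No | If set, resize the embeddings to a multiple of this value (default: None) |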
Outputs
| Name | Type | Description |
|---|---|---|
| return | tuple[AutoModelForCausalLM, PreTrainedTokenizer] | Model with resized embeddings and tokenizer with ChatML template and special tokens |
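The size of the resized embedding matrix when resize_to_multiple_of is set can be worked out by rounding up. The sketch below assumes simple round-up-to-the-next-multiple behavior; `padded_vocab_size` is a hypothetical helper, and the count of two new tokens assumes neither ChatML token was already in the vocabulary. The base size of 32000 is the Mistral-7B-v0.1 vocabulary size.

```python
from typing import Optional


def padded_vocab_size(vocab_size: int, multiple: Optional[int] = None) -> int:
    """Round vocab_size up to the next multiple, or return it unchanged if multiple is None."""
    if multiple is None:
        return vocab_size
    return ((vocab_size + multiple - 1) // multiple) * multiple


base_vocab = 32000  # Mistral-7B-v0.1 vocabulary size
new_tokens = 2      # <|im_start|> and <|im_end|>, assumed not already present

print(padded_vocab_size(base_vocab + new_tokens))      # 32002
print(padded_vocab_size(base_vocab + new_tokens, 64))  # 32064
```

Padding to a multiple such as 64 leaves a few unused rows at the end of the matrix in exchange for dimensions that tend to suit accelerator kernels better.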
Usage Examples
ChatML Fallback in SFT Script
import logging

from alignment import get_model, get_tokenizer
from trl import setup_chat_format

logger = logging.getLogger(__name__)

# model_args and training_args are the argument objects already parsed by the SFT script.
# Load model and tokenizer
tokenizer = get_tokenizer(model_args, training_args)
model = get_model(model_args, training_args)

# Apply ChatML if no template exists
if tokenizer.chat_template is None:
    logger.info("No chat template provided, using ChatML.")
    model, tokenizer = setup_chat_format(model, tokenizer, format="chatml")

# If the fallback ran, the tokenizer now renders messages as ChatML:
# <|im_start|>user\n{content}<|im_end|>\n<|im_start|>assistant\n{content}<|im_end|>
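To make the resulting layout concrete, the ChatML rendering can be approximated in plain Python. This is a sketch of the format only, not the Jinja2 template that TRL installs; `render_chatml` is a hypothetical helper, and the `add_generation_prompt` flag mirrors the argument of the same name in `tokenizer.apply_chat_template`.

```python
def render_chatml(messages, add_generation_prompt=False):
    """Render a list of {"role", "content"} dicts in ChatML layout."""
    parts = []
    for message in messages:
        parts.append(
            "<|im_start|>" + message["role"] + "\n" + message["content"] + "<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


conversation = [
    {"role": "user", "content": "What is ChatML?"},
    {"role": "assistant", "content": "A chat markup format."},
]
rendered = render_chatml(conversation)
prompt = render_chatml(conversation[:1], add_generation_prompt=True)
```

With a real tokenizer, the equivalent call would be `tokenizer.apply_chat_template(conversation, tokenize=False)`, which renders whatever template the tokenizer carries.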