Implementation:Huggingface Alignment handbook Setup Chat Format


Knowledge Sources
Domains: NLP, Preprocessing
Last Updated: 2026-02-07 00:00 GMT

Overview

A concrete tool, provided by the TRL library, for applying the ChatML format to a model and tokenizer that lack a chat template.

Description

setup_chat_format is a TRL utility that configures a model-tokenizer pair with ChatML-style special tokens and a chat template. It does three things: it adds the special tokens (<|im_start|>, <|im_end|>) to the tokenizer, resizes the model's token embeddings to accommodate the new tokens, and sets the Jinja2 chat template on the tokenizer.
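To make the resulting format concrete, the following is a minimal pure-Python sketch of what a ChatML-style template renders for a list of messages. It is not TRL's implementation: render_chatml and its add_generation_prompt parameter are hypothetical names used here for illustration, and the real template is a Jinja2 string stored on the tokenizer.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages, add_generation_prompt=False):
    """Render a list of {"role", "content"} dicts as ChatML text.

    Hypothetical helper that mimics the layout of a ChatML template.
    """
    # Each message becomes: <|im_start|>{role}\n{content}<|im_end|>\n
    text = "".join(
        f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages
    )
    # Optionally open an assistant turn so the model continues from there.
    if add_generation_prompt:
        text += f"{IM_START}assistant\n"
    return text


messages = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there"},
]
rendered = render_chatml(messages)
print(rendered)
```

With a real tokenizer, the equivalent rendering would normally come from tokenizer.apply_chat_template(messages, tokenize=False) once the template has been set.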

In the alignment-handbook, this function is called as a fallback in the SFT script when the loaded tokenizer has no chat template (tokenizer.chat_template is None).

Usage

Use this when loading a base model that does not have a chat template defined. Most instruction-tuned models already have templates, but base models (e.g., "mistralai/Mistral-7B-v0.1") typically do not.
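The decision of whether to apply the format can be sketched as a small guard. The helper needs_chat_format and the SimpleNamespace stand-ins below are hypothetical, used only so the sketch runs without downloading a model; a real tokenizer exposes the same chat_template attribute. This sketch also assumes that setup_chat_format may reject a tokenizer that already carries a template, which is worth verifying against the installed TRL version.

```python
from types import SimpleNamespace


def needs_chat_format(tokenizer) -> bool:
    """Return True when the tokenizer carries no chat template."""
    return getattr(tokenizer, "chat_template", None) is None


# Stand-ins for real tokenizers (assumption: only chat_template matters here).
base_tokenizer = SimpleNamespace(chat_template=None)
instruct_tokenizer = SimpleNamespace(
    chat_template="{% for message in messages %}...{% endfor %}"
)

print(needs_chat_format(base_tokenizer))      # True
print(needs_chat_format(instruct_tokenizer))  # False
```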

Code Reference

Source Location

  • Repository: alignment-handbook
  • File: scripts/sft.py (lines 98-100)
  • Definition: External TRL library

Signature

def setup_chat_format(
    model: AutoModelForCausalLM,
    tokenizer: PreTrainedTokenizer,
    format: str = "chatml",
    resize_to_multiple_of: Optional[int] = None,
) -> tuple[AutoModelForCausalLM, PreTrainedTokenizer]:
    """Set up chat format for model and tokenizer.

    Args:
        model: The model to configure.
        tokenizer: The tokenizer to configure.
        format: Chat format to use (default: "chatml").
        resize_to_multiple_of: Resize embeddings to multiple of this value.

    Returns:
        Tuple of (model, tokenizer) with chat format applied.
    """

Import

from trl import setup_chat_format
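The effect of resize_to_multiple_of on the embedding matrix can be illustrated with plain arithmetic. The sketch below assumes a base vocabulary of 32,000 entries and two newly added tokens; padded_vocab_size is a hypothetical helper that rounds up to the next multiple, which is the usual meaning of padding to a multiple, not a function from TRL or Transformers.

```python
from typing import Optional


def padded_vocab_size(num_tokens: int, multiple_of: Optional[int] = None) -> int:
    """Round num_tokens up to the next multiple of multiple_of, if given."""
    if multiple_of is None:
        return num_tokens
    return ((num_tokens + multiple_of - 1) // multiple_of) * multiple_of


base_vocab = 32_000  # assumed vocabulary size, for illustration only
new_tokens = 2       # <|im_start|> and <|im_end|>

print(padded_vocab_size(base_vocab + new_tokens))      # 32002
print(padded_vocab_size(base_vocab + new_tokens, 64))  # 32064
```

Padding the embedding size to a multiple such as 64 is typically chosen for hardware efficiency; the extra rows are unused by the tokenizer.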

I/O Contract

Inputs

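Name | Type | Required | Description
model | AutoModelForCausalLM | Yes | Model to add chat tokens to (embeddings will be resized)
tokenizer | PreTrainedTokenizer | Yes | Tokenizer to add the chat template and special tokens to
format | str | No | Chat format name (default: "chatml")
resize_to_multiple_of | Optional[int] | No | Resize embeddings to a multiple of this value (default: None)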

Outputs

Name | Type | Description
return | tuple[AutoModelForCausalLM, PreTrainedTokenizer] | Model with resized embeddings, and tokenizer with ChatML template and special tokens

Usage Examples

ChatML Fallback in SFT Script

import logging

from alignment import get_model, get_tokenizer
from trl import setup_chat_format

logger = logging.getLogger(__name__)

# model_args and training_args are assumed to come from the script's argument parser
tokenizer = get_tokenizer(model_args, training_args)
model = get_model(model_args, training_args)

# Apply ChatML only if the tokenizer has no chat template
if tokenizer.chat_template is None:
    logger.info("No chat template provided, using ChatML.")
    model, tokenizer = setup_chat_format(model, tokenizer, format="chatml")

# After the fallback, the tokenizer renders conversations as ChatML:
# <|im_start|>user\n{content}<|im_end|>\n<|im_start|>assistant\n{content}<|im_end|>
