Implementation: Unslothai Unsloth Get Chat Template
| Knowledge Sources | |
|---|---|
| Domains | NLP, Data_Preprocessing, Tokenization |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
Concrete tool from the Unsloth library for applying chat templates to HuggingFace tokenizers.
Description
The get_chat_template function configures a HuggingFace tokenizer with the appropriate Jinja2 chat template for a given model family. It supports 40+ model families and template formats (ChatML, Llama 3, Mistral, Gemma, Qwen, Phi, etc.) via the internal CHAT_TEMPLATES registry. The function also handles EOS token mapping, system message injection, and role/content field remapping for ShareGPT-format datasets.
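The template names accepted by chat_template are the keys of this registry. A minimal sketch for listing them, assuming CHAT_TEMPLATES is exposed as a module-level dict in unsloth/chat_templates.py:
# List the template names recognised by get_chat_template
# (assumes CHAT_TEMPLATES is an importable module-level dict)
from unsloth.chat_templates import CHAT_TEMPLATES
print(sorted(CHAT_TEMPLATES.keys()))  # e.g. "chatml", "gemma", "llama-3", "qwen-2.5", ...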
Usage
Call get_chat_template immediately after loading a tokenizer and before formatting training data. Use it when fine-tuning any chat or instruction-tuned model to ensure the tokenizer applies the correct conversational template.
Code Reference
Source Location
- Repository: unsloth
- File: unsloth/chat_templates.py
- Lines: L2123-2350
Signature
def get_chat_template(
tokenizer,
chat_template = "chatml",
mapping = {"role": "role", "content": "content", "user": "user", "assistant": "assistant"},
map_eos_token = True,
system_message = None,
):
"""
Applies a chat template to the tokenizer and configures EOS token mapping.
Args:
tokenizer: HuggingFace PreTrainedTokenizer to configure.
chat_template (str): Template name (e.g., "chatml", "llama-3", "mistral",
"gemma", "qwen-2.5", "phi-4") or a custom template tuple.
mapping (dict): Role/content field mapping for ShareGPT format datasets.
Keys: "role", "content", "user", "assistant".
map_eos_token (bool): Whether to map the EOS token to the template's
stop token. Default True.
system_message (str): Optional system prompt to inject into the template.
Returns:
Modified tokenizer with Jinja2 chat_template set and EOS token configured.
"""
Import
from unsloth.chat_templates import get_chat_template
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| tokenizer | PreTrainedTokenizer | Yes | HuggingFace tokenizer to configure |
| chat_template | str | No | Template name from CHAT_TEMPLATES registry (default: "chatml") |
| mapping | dict | No | Role/content field mapping for ShareGPT datasets |
| map_eos_token | bool | No | Map EOS token to template stop word (default: True) |
| system_message | str | No | Optional system prompt injected into template |
Outputs
| Name | Type | Description |
|---|---|---|
| tokenizer | PreTrainedTokenizer | Modified tokenizer with chat_template Jinja2 string and EOS token configured |
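A quick sanity check of the returned tokenizer uses standard HuggingFace tokenizer attributes; the exact stop token depends on the chosen template, so the value shown is illustrative:
# Verify that the Jinja2 template and EOS token were set on the tokenizer
tokenizer = get_chat_template(tokenizer, chat_template="llama-3")
print(tokenizer.chat_template[:120])  # start of the Jinja2 template string
print(tokenizer.eos_token)            # mapped stop token, e.g. "<|eot_id|>" for Llama 3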
Usage Examples
Basic Chat Template Application
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template
# Load model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
model_name="unsloth/Llama-3.2-1B-Instruct",
max_seq_length=2048,
load_in_4bit=True,
)
# Apply Llama 3 chat template
tokenizer = get_chat_template(
tokenizer,
chat_template="llama-3",
)
# Format a conversation
messages = [
{"role": "user", "content": "What is machine learning?"},
{"role": "assistant", "content": "Machine learning is a branch of AI..."},
]
formatted = tokenizer.apply_chat_template(messages, tokenize=False)
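For inference with the same tokenizer, add_generation_prompt (a standard apply_chat_template argument) appends the assistant header so the model continues the conversation:
# Build an inference prompt: add_generation_prompt appends the assistant header tokens
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What is machine learning?"}],
    add_generation_prompt=True,
    return_tensors="pt",
)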
ShareGPT Format with Custom Mapping
from unsloth.chat_templates import get_chat_template
# Remap ShareGPT-style fields ("from"/"value", "human"/"gpt") and inject a system prompt
tokenizer = get_chat_template(
tokenizer,
chat_template="chatml",
mapping={"role": "from", "content": "value", "user": "human", "assistant": "gpt"},
system_message="You are a helpful assistant.",
)
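A sketch of formatting a whole ShareGPT-style dataset with this mapping, assuming a "conversations" column of {"from", "value"} messages (the column name is illustrative) and that the remapped template consumes the raw ShareGPT dicts directly:
# Render every conversation into a "text" column for supervised fine-tuning
def formatting_prompts_func(examples):
    texts = [
        tokenizer.apply_chat_template(convo, tokenize=False)
        for convo in examples["conversations"]  # column name assumed
    ]
    return {"text": texts}

dataset = dataset.map(formatting_prompts_func, batched=True)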