Principle: Predibase LoRAX Chat Template Rendering
| Knowledge Sources | |
|---|---|
| Domains | NLP, Prompt_Engineering |
| Last Updated | 2026-02-08 02:00 GMT |
Overview
A prompt construction mechanism that converts structured chat messages (system/user/assistant turns) into a flat token string using model-specific Jinja2 chat templates.
Description
Chat Template Rendering bridges the gap between the high-level OpenAI message format and the raw text input that language models consume. Different models use different prompt formats (e.g., Llama 2 and Mistral wrap turns in [INST]...[/INST], while ChatML-style models use <|im_start|>...<|im_end|> markers).
The rendering process:
- Parse structured messages (role + content pairs)
- Load the model's Jinja2 chat template from the tokenizer config
- Render messages through the template engine (minijinja in Rust)
- Optionally inject tool definitions and system guidelines
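The steps above can be sketched in Python, using the `jinja2` package as a stand-in for the Rust-side minijinja engine; the template string is illustrative (a simplified Mistral-style format), not the exact template shipped with any model:

```python
from jinja2 import Template

# Step 2: a chat template as it might appear in the "chat_template"
# field of tokenizer_config.json (illustrative, simplified).
chat_template = (
    "{{ bos_token }}"
    "{% for m in messages %}"
    "{% if m['role'] == 'user' %}[INST] {{ m['content'] }} [/INST]"
    "{% elif m['role'] == 'assistant' %}{{ m['content'] }}{{ eos_token }}"
    "{% endif %}"
    "{% endfor %}"
)

# Step 1: structured messages (role + content pairs).
messages = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there!"},
    {"role": "user", "content": "How are you?"},
]

# Step 3: render the messages into a flat prompt string.
prompt = Template(chat_template).render(
    messages=messages, bos_token="<s>", eos_token="</s>"
)
print(prompt)
# <s>[INST] Hello [/INST]Hi there!</s>[INST] How are you? [/INST]
```

Special tokens like `bos_token` are passed as template variables so the same template works across tokenizers with different vocabularies.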
Usage
Applied automatically when using the /v1/chat/completions endpoint. The template is loaded from the model's tokenizer_config.json and applied to each request's message array.
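A request body for that endpoint might look like the following (the model identifier and field values are illustrative); the server renders `messages` through the model's chat template before tokenization:

```json
{
  "model": "mistralai/Mistral-7B-Instruct-v0.1",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello"}
  ],
  "max_tokens": 64
}
```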
Theoretical Basis
Pseudo-code:

```python
# Chat template rendering
# load_jinja_template reads the "chat_template" field from tokenizer_config.json
template = load_jinja_template(tokenizer_config)
messages = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello"},
]
# add_generation_prompt=True appends the tokens that cue the model to
# start an assistant turn
prompt = template.render(messages=messages, add_generation_prompt=True)
# Result: "<s>[INST] You are helpful.\n\nHello [/INST]"
```