Implementation:Ollama Ollama Llama Chat

Knowledge Sources	Ollama
Domains	Chat Templates, Prompt Formatting
Last Updated	2025-02-15 00:00 GMT

Overview

Implements chat template detection and message formatting for dozens of LLM chat template formats, so that multi-turn conversations are rendered in the prompt layout each model expects.

Description

Contains an LLM_CHAT_TEMPLATES map from template name strings to enum values, an llm_chat_template_from_str function that resolves a template name to its enum value, and an llm_chat_detect_template function that auto-detects the template from a model's metadata string by searching for characteristic tokens (e.g., <|im_start|> for ChatML, [INST] for Mistral/LLaMA 2). The main llm_chat_apply_template function takes an array of chat messages (role + content) and formats them according to the detected template. Supported formats include ChatML, LLaMA 2/3/4, Mistral v1/v3/v7, Phi-3/4, Gemma, DeepSeek 2/3, Command-R, Zephyr, Vicuna, RWKV, Granite, and many more.
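
The detection step described above can be sketched as follows. This is a minimal illustration, not the code in llama-chat.cpp: the enum detect_kind, the table DETECT_BY_NAME, and the functions contains and detect_sketch are hypothetical names, and trying an exact name match before the marker search is an assumption of this sketch.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the real template enum; illustrative only.
enum detect_kind {
    DETECT_UNKNOWN,
    DETECT_CHATML,
    DETECT_LLAMA_2,
};

// Name lookup table, similar in spirit to LLM_CHAT_TEMPLATES.
static const std::map<std::string, detect_kind> DETECT_BY_NAME = {
    { "chatml", DETECT_CHATML  },
    { "llama2", DETECT_LLAMA_2 },
};

// Returns true if haystack contains needle as a substring.
static bool contains(const std::string & haystack, const std::string & needle) {
    return haystack.find(needle) != std::string::npos;
}

// Assumption of this sketch: try an exact name match first, then fall back
// to searching the template text for characteristic marker tokens.
detect_kind detect_sketch(const std::string & tmpl) {
    auto it = DETECT_BY_NAME.find(tmpl);
    if (it != DETECT_BY_NAME.end()) {
        return it->second;
    }
    if (contains(tmpl, "<|im_start|>")) {
        return DETECT_CHATML;
    }
    if (contains(tmpl, "[INST]")) {
        return DETECT_LLAMA_2;
    }
    return DETECT_UNKNOWN;
}
```

In a heuristic of this kind the order of the checks matters: a template string that contains several markers resolves to whichever check runs first, so more specific markers would normally be tested before more general ones.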

Usage

Use this to format multi-turn conversation prompts correctly for any supported model. Each model family expects a specific prompt format, and applying the wrong template typically degrades output quality because the model sees turn markers it was not trained on.
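
To make the difference between formats concrete, the sketch below renders the same single user turn in two styles. Both functions are hypothetical helpers written for this page; the Llama 2 style rendering is deliberately simplified (it omits the BOS token and system prompt handling), so treat it as an illustration of how the markers differ rather than as the exact output of llm_chat_apply_template.

```cpp
#include <string>

// ChatML style: wrap the user turn in <|im_start|>/<|im_end|> markers and
// open the assistant turn so the model continues from there.
std::string render_chatml_user(const std::string & content) {
    return "<|im_start|>user\n" + content + "<|im_end|>\n<|im_start|>assistant\n";
}

// Simplified Llama 2 style (assumption: BOS token and system prompt
// handling are left out): wrap the user turn in [INST] ... [/INST].
std::string render_llama2_user(const std::string & content) {
    return "[INST] " + content + " [/INST]";
}
```

The two strings share no marker tokens, which is why a prompt built for one family is unlikely to be interpreted as intended by a model trained on the other.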

Code Reference

Source Location

Repository: Ollama
File: llama/llama.cpp/src/llama-chat.cpp
Lines: 1-865

Signature

llm_chat_template llm_chat_template_from_str(const std::string & name);
llm_chat_template llm_chat_detect_template(const std::string & tmpl);

static const std::map<std::string, llm_chat_template> LLM_CHAT_TEMPLATES = {
    { "chatml",    LLM_CHAT_TEMPLATE_CHATML    },
    { "llama2",    LLM_CHAT_TEMPLATE_LLAMA_2   },
    { "llama3",    LLM_CHAT_TEMPLATE_LLAMA_3   },
    { "mistral-v3",LLM_CHAT_TEMPLATE_MISTRAL_V3},
    { "phi3",      LLM_CHAT_TEMPLATE_PHI_3     },
    { "gemma",     LLM_CHAT_TEMPLATE_GEMMA     },
    { "deepseek3", LLM_CHAT_TEMPLATE_DEEPSEEK_3},
    // ... 40+ templates
};

int32_t llm_chat_apply_template(
    const llm_chat_template tmpl,
    const std::vector<llama_chat_message> & msgs,
    std::string & dest, bool add_ass);

Import

#include "llama-chat.h"

I/O Contract

Inputs

Name	Type	Required	Description
tmpl	llm_chat_template	Yes	Detected or specified chat template enum
msgs	std::vector<llama_chat_message>	Yes	Array of chat messages with role and content
add_ass	bool	Yes	Whether to add assistant turn prefix at the end

Outputs

Name	Type	Description
dest	std::string	Formatted prompt string according to the template
result	int32_t	Length of the formatted string, or negative on error
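
The contract in the tables above (formatted text written to dest, length returned, negative value on error, optional assistant prefix) can be sketched with a self-contained stand-in. The names fmt_kind, fmt_message, and fmt_apply are hypothetical and exist only for this example; only the ChatML layout is implemented, and returning -1 for an unsupported template is an assumption consistent with "negative on error".

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical message type for this sketch (role + content).
struct fmt_message {
    std::string role;
    std::string content;
};

// Hypothetical template enum with a single supported format.
enum fmt_kind { FMT_UNKNOWN, FMT_CHATML };

// Formats msgs into dest. Returns the length of dest on success, or -1 if
// the template is not supported (dest is left untouched in that case).
int32_t fmt_apply(fmt_kind tmpl,
                  const std::vector<fmt_message> & msgs,
                  std::string & dest,
                  bool add_ass) {
    if (tmpl != FMT_CHATML) {
        return -1;
    }
    std::string out;
    for (const auto & m : msgs) {
        out += "<|im_start|>" + m.role + "\n" + m.content + "<|im_end|>\n";
    }
    if (add_ass) {
        // Open the assistant turn so generation starts in the right role.
        out += "<|im_start|>assistant\n";
    }
    dest = out;
    return static_cast<int32_t>(dest.size());
}
```

Callers would pass add_ass as true when the prompt is about to be sent for generation, and false when rendering a finished transcript.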

Usage Examples

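#include "llama.h"       // llama_model_chat_template, llama_chat_message
#include "llama-chat.h"

// Detect template from model metadata.
// llama_model_chat_template returns a C string that may be null when the
// model ships no template, so guard before building a std::string.
const char * tmpl_cstr = llama_model_chat_template(model, nullptr);
std::string tmpl_str = tmpl_cstr ? tmpl_cstr : "";
auto tmpl = llm_chat_detect_template(tmpl_str);

// Format messages
std::vector<llama_chat_message> msgs = {
    {"system", "You are a helpful assistant."},
    {"user", "Hello!"},
};

std::string formatted;
int32_t res = llm_chat_apply_template(tmpl, msgs, formatted, true);
if (res < 0) {
    // Negative result: the template could not be applied; do not use formatted.
}
// Result depends on template, e.g. for ChatML:
// <|im_start|>system\nYou are a helpful assistant.<|im_end|>\n
// <|im_start|>user\nHello!<|im_end|>\n
// <|im_start|>assistant\n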

Related Pages

Principle:Ollama_Ollama_Chat_Template_System
