Implementation:Turboderp org Exllamav2 PromptFormat Interface

From Leeroopedia
Knowledge Sources
Domains NLP, Prompt_Engineering, Chat
Last Updated 2026-02-15 00:00 GMT

Overview

A concrete pattern for model-specific prompt formatting: a base class plus more than 18 subclasses, provided as example code in the exllamav2 repository.

Description

PromptFormat is a base class defined in the exllamav2 examples that provides a standard interface for model-specific prompt templating. Each model family has a subclass that implements the correct token structure for that model's chat/instruction format.

The base class defines five methods that subclasses must implement:

  • default_system_prompt(): Returns the model's default system instruction text.
  • first_prompt(system_prompt, user_message): Formats the first conversation turn, typically embedding the system prompt.
  • subs_prompt(user_message): Formats subsequent conversation turns.
  • stop_conditions(tokenizer): Returns a list of token IDs and/or strings that signal the end of the assistant's response.
  • encoding_options(): Returns a dictionary of encoding flags (add_bos, encode_special_tokens, etc.).

Available subclasses include:

Subclass | Models | Format Style
PromptFormat_raw | Any | No formatting, raw text
PromptFormat_llama | Llama 2 | [INST] / <<SYS>> markers
PromptFormat_llama3 | Llama 3 | Header ID tokens
PromptFormat_codellama | Code Llama | Code-specific [INST] variant
PromptFormat_chatml | Qwen, Yi, etc. | <|im_start|> / <|im_end|>
PromptFormat_mistral | Mistral | [INST] without <<SYS>>
PromptFormat_phi3 | Phi-3 | <|system|> / <|user|> / <|assistant|>
PromptFormat_gemma | Gemma | <start_of_turn> / <end_of_turn>
PromptFormat_deepseek | DeepSeek | DeepSeek-specific tokens
PromptFormat_cohere | Command R | Cohere chat format

A prompt_formats dictionary maps format names to their classes for easy lookup.
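
A name-to-class lookup of this kind can be sketched as follows. The two stand-in classes, their description strings, and the get_format helper are hypothetical and exist only so the snippet runs on its own; the real dictionary lives in examples/chat_prompts.py.

```python
class PromptFormat:
    description = "Undefined prompt format"


class PromptFormat_raw(PromptFormat):
    description = "Raw text, no formatting"  # illustrative description string


class PromptFormat_chatml(PromptFormat):
    description = "ChatML"


# Map format names to classes (not instances), as the page describes.
prompt_formats = {
    "raw": PromptFormat_raw,
    "chatml": PromptFormat_chatml,
}


def get_format(name: str) -> PromptFormat:
    """Hypothetical helper: instantiate a format by name with a readable error."""
    try:
        return prompt_formats[name]()
    except KeyError:
        known = ", ".join(sorted(prompt_formats))
        raise ValueError(
            f"Unknown prompt format {name!r}; known formats: {known}"
        ) from None


fmt = get_format("chatml")
print(fmt.description)
```

Storing classes rather than instances means each lookup yields a fresh object, so per-conversation changes to attributes such as botname do not leak between sessions.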

Note: This is example code, not an installable API. Copy the pattern into your own project or import directly from the examples directory.

Usage

Use PromptFormat subclasses when building chat applications:

  • Select the appropriate subclass for your model
  • Use first_prompt() for the initial message
  • Use subs_prompt() for follow-up messages
  • Use stop_conditions() to configure generation termination
  • Use encoding_options() to set tokenizer flags
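
The steps above can be tied together in a small chat loop. This is a minimal sketch, not the exllamav2 chat example: ChatMLFormat is a stand-in that mirrors the ChatML subclass shown on this page, while StubTokenizer, fake_generate, and chat are hypothetical names introduced so the snippet runs without a model.

```python
class ChatMLFormat:
    """Stand-in mirroring the ChatML subclass shown on this page."""

    def default_system_prompt(self):
        return "You are a helpful assistant."

    def first_prompt(self, system_prompt, user_message):
        return (
            f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
            f"<|im_start|>user\n{user_message}<|im_end|>\n"
            "<|im_start|>assistant\n"
        )

    def subs_prompt(self, user_message):
        return (
            "<|im_end|>\n"
            f"<|im_start|>user\n{user_message}<|im_end|>\n"
            "<|im_start|>assistant\n"
        )

    def stop_conditions(self, tokenizer):
        return [tokenizer.eos_token_id, "<|im_end|>"]

    def encoding_options(self):
        return {"add_bos": False, "encode_special_tokens": True}


class StubTokenizer:
    eos_token_id = 2  # arbitrary placeholder value


def fake_generate(context, stops, encoding):
    """Hypothetical stub; a real generator would tokenize and sample here."""
    return f"(reply {context.count('<|im_start|>assistant')})"


def chat(fmt, user_messages, generate=fake_generate, tokenizer=None):
    tokenizer = tokenizer or StubTokenizer()
    stops = fmt.stop_conditions(tokenizer)   # configure termination once
    encoding = fmt.encoding_options()        # tokenizer flags for every turn
    context, replies = "", []
    for i, message in enumerate(user_messages):
        if i == 0:
            context += fmt.first_prompt(fmt.default_system_prompt(), message)
        else:
            context += fmt.subs_prompt(message)
        reply = generate(context, stops, encoding)
        context += reply                     # append the raw reply text
        replies.append(reply)
    return context, replies


context, replies = chat(ChatMLFormat(), ["Hi", "Tell me more"])
print(context)
```

The reply is appended without any closing token; in this format the next subs_prompt() call supplies the closing <|im_end|> for the previous assistant turn.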

Code Reference

Source Location

  • Repository: exllamav2
  • File: examples/chat_prompts.py
  • Lines: L2-31 (base class), L721-741 (prompt_formats dict)

Base Class Signature

class PromptFormat:

    botname: str = "Assistant"
    username: str = "User"
    description: str = "Undefined prompt format"

    def default_system_prompt(self) -> str:
        ...

    def first_prompt(self, system_prompt: str, user_message: str) -> str:
        ...

    def subs_prompt(self, user_message: str) -> str:
        ...

    def stop_conditions(self, tokenizer) -> list:
        ...

    def encoding_options(self) -> dict:
        ...
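
To illustrate how a new model family could plug into this interface, here is a minimal sketch of a custom subclass. PromptFormat_plain and its role-label format are hypothetical and do not exist in exllamav2. The base class is re-declared so the snippet runs on its own, and raising NotImplementedError in place of the elided bodies is an assumption.

```python
class PromptFormat:
    botname = "Assistant"
    username = "User"
    description = "Undefined prompt format"

    # Assumption: unimplemented methods raise rather than return None.
    def default_system_prompt(self) -> str:
        raise NotImplementedError

    def first_prompt(self, system_prompt: str, user_message: str) -> str:
        raise NotImplementedError

    def subs_prompt(self, user_message: str) -> str:
        raise NotImplementedError

    def stop_conditions(self, tokenizer) -> list:
        raise NotImplementedError

    def encoding_options(self) -> dict:
        raise NotImplementedError


class PromptFormat_plain(PromptFormat):
    """Hypothetical format using plain-text role labels, for illustration."""

    description = "Plain role labels (illustrative)"

    def default_system_prompt(self) -> str:
        return f"You are {self.botname}, a helpful assistant."

    def first_prompt(self, system_prompt, user_message):
        return f"{system_prompt}\n\n{self.username}: {user_message}\n{self.botname}:"

    def subs_prompt(self, user_message):
        return f"\n{self.username}: {user_message}\n{self.botname}:"

    def stop_conditions(self, tokenizer):
        # Stop at end-of-sequence, or when the model starts the user's turn.
        return [tokenizer.eos_token_id, f"\n{self.username}:"]

    def encoding_options(self):
        return {"add_bos": True, "encode_special_tokens": False}


class StubTokenizer:
    eos_token_id = 2  # arbitrary placeholder value


fmt = PromptFormat_plain()
print(fmt.first_prompt(fmt.default_system_prompt(), "Hello!"))
```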

ChatML Subclass Example

class PromptFormat_chatml(PromptFormat):

    description = "ChatML"

    def default_system_prompt(self):
        return "You are a helpful assistant."

    def first_prompt(self, system_prompt, user_message):
        return (
            f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
            f"<|im_start|>user\n{user_message}<|im_end|>\n"
            f"<|im_start|>assistant\n"
        )

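    def subs_prompt(self, user_message):
        # The leading <|im_end|> closes the previous assistant turn; the
        # generated reply is assumed not to include its own stop token.
        return (
            f"<|im_end|>\n"
            f"<|im_start|>user\n{user_message}<|im_end|>\n"
            f"<|im_start|>assistant\n"
        )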

    def stop_conditions(self, tokenizer):
        return [tokenizer.eos_token_id, "<|im_end|>"]

    def encoding_options(self):
        return {"add_bos": False, "encode_special_tokens": True}

Import

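# From the examples directory (not a pip-installable module).
# This import assumes the exllamav2 repository root is on sys.path,
# for example when the script is run from the repository root:
from examples.chat_prompts import PromptFormat, prompt_formats

# Or copy the pattern into your own code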

I/O Contract

first_prompt Inputs

Name | Type | Required | Description
system_prompt | str | Yes | System instruction text to embed in the first turn
user_message | str | Yes | The user's first message

first_prompt Output

Name | Type | Description
formatted_prompt | str | Complete formatted prompt string with system prompt and first user message, ready for tokenization

subs_prompt Inputs

Name | Type | Required | Description
user_message | str | Yes | The user's follow-up message

subs_prompt Output

Name | Type | Description
formatted_prompt | str | Formatted follow-up turn string to append to the conversation

stop_conditions Inputs

Name | Type | Required | Description
tokenizer | ExLlamaV2Tokenizer | Yes | Tokenizer instance for resolving token IDs

stop_conditions Output

Name | Type | Description
conditions | list | List of token IDs (int) and/or stop strings (str) that terminate generation
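
Because the returned list mixes integers and strings, a consumer usually has to tell the two apart. The sketch below shows one way to do that and to cut generated text at the first stop string; split_stop_conditions and truncate_at_stop are hypothetical helpers, not part of exllamav2.

```python
def split_stop_conditions(conditions):
    """Separate a mixed stop list into (token_ids, stop_strings)."""
    token_ids, strings = [], []
    for cond in conditions:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(cond, int) and not isinstance(cond, bool):
            token_ids.append(cond)
        elif isinstance(cond, str):
            strings.append(cond)
        else:
            raise TypeError(f"Unsupported stop condition: {cond!r}")
    return token_ids, strings


def truncate_at_stop(text, stop_strings):
    """Return text up to (not including) the earliest stop string, if any."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


ids, strings = split_stop_conditions([2, "<|im_end|>"])
print(ids, strings)
print(truncate_at_stop("Hello there.<|im_end|>\nleftover", strings))
```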

Usage Examples

Using a Prompt Format for Chat

from examples.chat_prompts import prompt_formats

# Select format by name
fmt = prompt_formats["chatml"]()

# Format first turn
prompt = fmt.first_prompt(
    system_prompt="You are a helpful coding assistant.",
    user_message="Write a Python hello world program.",
)
# Result: "<|im_start|>system\nYou are a helpful coding assistant.<|im_end|>\n
#          <|im_start|>user\nWrite a Python hello world program.<|im_end|>\n
#          <|im_start|>assistant\n"

# Get encoding options
opts = fmt.encoding_options()
# Result: {"add_bos": False, "encode_special_tokens": True}

# Get stop conditions (tokenizer is an ExLlamaV2Tokenizer instance created elsewhere)
stops = fmt.stop_conditions(tokenizer)
# Result: [eos_token_id, "<|im_end|>"]
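
The dictionary returned by encoding_options() can be unpacked straight into an encode call as keyword arguments. The sketch below uses a hypothetical StubTokenizer that only records the flags and returns made-up IDs; it assumes the real tokenizer's encode method accepts keywords with the same names as the dictionary keys.

```python
class StubTokenizer:
    """Hypothetical stand-in: records flags, returns fake token IDs."""

    bos_token_id = 1  # arbitrary placeholder value

    def encode(self, text, add_bos=False, encode_special_tokens=False):
        self.last_flags = {
            "add_bos": add_bos,
            "encode_special_tokens": encode_special_tokens,
        }
        ids = [self.bos_token_id] if add_bos else []
        # Fake IDs: one per whitespace-separated chunk, not a real tokenization.
        ids += [100 + i for i in range(len(text.split()))]
        return ids


opts = {"add_bos": False, "encode_special_tokens": True}
tokenizer = StubTokenizer()

# Keys in opts become keyword arguments of encode().
ids = tokenizer.encode("<|im_start|>user\nHi<|im_end|>\n", **opts)
print(ids, tokenizer.last_flags)
```

Keeping the flags in the format object, rather than hard-coding them at the call site, lets the same chat loop serve formats that need a BOS token and formats that do not.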

Multi-Turn Conversation

from examples.chat_prompts import prompt_formats

fmt = prompt_formats["llama3"]()

# First turn
conversation = fmt.first_prompt(
    system_prompt=fmt.default_system_prompt(),
    user_message="What is Python?",
)

# Generate response...
response = "Python is a high-level programming language..."

# Append response and format next turn
conversation += response
conversation += fmt.subs_prompt("Can you show me an example?")

# Generate next response...

Listing Available Formats

from examples.chat_prompts import prompt_formats

for name, fmt_class in prompt_formats.items():
    fmt = fmt_class()
    print(f"{name}: {fmt.description}")
