Principle:OpenGVLab InternVL Conversation Template System
| Knowledge Sources | |
|---|---|
| Domains | Prompt Engineering, Conversation Management, Multi-turn Dialogue |
| Last Updated | 2026-02-07 14:00 GMT |
Overview
The conversation template system is an abstraction that manages prompt formatting across different LLM backends. It ensures that each model receives input in its expected chat format, including system prompts, role markers, separator tokens, and stop criteria.
Description
Different language models are trained with distinct conversation formats (also called chat templates). A Vicuna model expects a "USER: ... ASSISTANT: ..." format, a LLaMA-2 model wraps instructions in "[INST] ... [/INST]", and an MPT model delimits turns with "<|im_start|>" and "<|im_end|>" markers. Generating with the wrong template degrades output quality because the model never saw that format during training.
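To make the difference concrete, the sketch below renders the same single user turn in the three formats named above. The function names are hypothetical, and the renderings are deliberately simplified: real templates also prepend a system prompt and may differ in whitespace or special tokens.

```python
def vicuna_style(user_msg: str) -> str:
    # Role labels followed by a colon; the trailing "ASSISTANT:" cues the reply.
    return f"USER: {user_msg} ASSISTANT:"


def llama2_style(user_msg: str) -> str:
    # The user instruction is wrapped in [INST] ... [/INST].
    return f"[INST] {user_msg} [/INST]"


def mpt_style(user_msg: str) -> str:
    # Each turn is delimited by <|im_start|> and <|im_end|> markers.
    return (
        f"<|im_start|>user\n{user_msg}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    for render in (vicuna_style, llama2_style, mpt_style):
        print(repr(render("Describe the image.")))
```

The three strings carry the same content but share almost no surface structure, which is why the formatting logic is kept out of the calling code.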
The Conversation Template System solves this by abstracting the formatting logic into a separator style enum and a conversation dataclass. Each template defines:
- System prompt: The initial instruction context
- Role names: How user and assistant turns are labeled
- Separator style: The algorithm for joining messages (SINGLE separator, TWO alternating separators, MPT im_start/im_end, PLAIN no-role, LLAMA_2 instruction wrapping, CHATINTERN, INTERNVL_ZH)
- Separator tokens: The actual delimiter strings
- Stop criteria: Token IDs or strings that signal generation should halt
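The fields listed above can be sketched as a small enum plus dataclass. This is a minimal illustration that covers only the SINGLE and TWO styles; the field names (`system`, `roles`, `messages`, `sep`, `sep2`) and the exact joining rules are assumptions modeled on the common LLaVA/FastChat layout, not a copy of the InternVL source.

```python
import dataclasses
from enum import Enum, auto
from typing import List, Optional, Tuple


class SeparatorStyle(Enum):
    SINGLE = auto()  # one separator after every message
    TWO = auto()     # alternate between two separators


@dataclasses.dataclass
class Conversation:
    system: str
    roles: Tuple[str, str]
    messages: List[List[Optional[str]]]
    sep_style: SeparatorStyle = SeparatorStyle.SINGLE
    sep: str = "###"
    sep2: Optional[str] = None

    def get_prompt(self) -> str:
        if self.sep_style == SeparatorStyle.SINGLE:
            seps = [self.sep, self.sep]
        elif self.sep_style == SeparatorStyle.TWO:
            # Fall back to `sep` if no second separator was given.
            seps = [self.sep, self.sep2 or self.sep]
        else:
            raise ValueError(f"unsupported style: {self.sep_style}")

        ret = self.system + seps[0]
        for i, (role, message) in enumerate(self.messages):
            if message:
                ret += f"{role}: {message}{seps[i % 2]}"
            else:
                # An empty message leaves the role open for generation.
                ret += f"{role}:"
        return ret


conv = Conversation(
    system="A chat.",
    roles=("USER", "ASSISTANT"),
    messages=[["USER", "Hi"], ["ASSISTANT", "Hello!"],
              ["USER", "Bye"], ["ASSISTANT", None]],
    sep_style=SeparatorStyle.TWO,
    sep=" ",
    sep2="</s>",
)
prompt = conv.get_prompt()
# prompt == "A chat. USER: Hi ASSISTANT: Hello!</s>USER: Bye ASSISTANT:"
```

With the TWO style, user turns end with the first separator and assistant turns end with the second, so the end-of-sequence marker appears only after completed model replies.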
The template system also handles multimodal inputs by managing image placeholder tokens within the conversation and supporting multiple image processing modes (pad, resize, crop) for UI rendering.
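One way to picture the multimodal handling is a message that is either plain text or a tuple carrying the text, an image object, and a processing mode. The sketch below assumes that tuple layout and an `<image>` placeholder string. Both are illustrative assumptions, and all helper names are hypothetical.

```python
from typing import Any, List, Tuple, Union

IMAGE_TOKEN = "<image>"  # assumed placeholder string

# A message is plain text, or (text, image, mode) where mode is e.g. "Pad".
Message = Union[str, Tuple[str, Any, str]]


def message_text(msg: Message) -> str:
    # Only the text part ever reaches the formatted prompt.
    return msg[0] if isinstance(msg, tuple) else msg


def collect_images(messages: List[Message]) -> List[Tuple[Any, str]]:
    # Images travel alongside the prompt, each with its processing mode.
    return [(m[1], m[2]) for m in messages if isinstance(m, tuple)]


def placeholders_match(messages: List[Message]) -> bool:
    # Each attached image should have exactly one placeholder in the text.
    n_tokens = sum(message_text(m).count(IMAGE_TOKEN) for m in messages)
    return n_tokens == len(collect_images(messages))


messages: List[Message] = [
    "Hello",
    (f"{IMAGE_TOKEN}\nDescribe this.", object(), "Pad"),
]
```

Keeping the image out of the prompt string means the same formatting path serves text-only and multimodal conversations.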
Templates are registered in a global dictionary by name, allowing the rest of the codebase to look up the correct template for any given model configuration.
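A registry of this kind can be sketched in a few lines. The `Template` class, the `conv_templates` dictionary name, and both helper functions below are hypothetical stand-ins chosen for illustration; the real codebase may register templates differently.

```python
import dataclasses
from typing import Dict, Tuple


@dataclasses.dataclass
class Template:
    name: str
    roles: Tuple[str, str]
    sep: str


# Global name -> template mapping shared by the whole process.
conv_templates: Dict[str, Template] = {}


def register_template(template: Template, override: bool = False) -> None:
    # Refuse silent overwrites so two modules cannot clash on a name.
    if template.name in conv_templates and not override:
        raise ValueError(f"template {template.name!r} is already registered")
    conv_templates[template.name] = template


def get_template(name: str) -> Template:
    try:
        return conv_templates[name]
    except KeyError:
        known = sorted(conv_templates)
        raise KeyError(f"unknown template {name!r}; known: {known}") from None


register_template(Template("vicuna_demo", ("USER", "ASSISTANT"), " "))
register_template(
    Template("mpt_demo", ("<|im_start|>user\n", "<|im_start|>assistant\n"), "<|im_end|>")
)
```

Note that `get_template` returns the shared object, so callers are expected to copy it before appending messages.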
Usage
Apply this principle when formatting prompts for any supported LLM backend. Always look up the correct template by model name, create a copy (to avoid modifying the shared template), append user and assistant messages, then call the prompt generation method to produce the correctly formatted string for tokenization.
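The lookup, copy, append, and generate steps described above might look like the following. This is a self-contained sketch: the `Conversation` class, its `copy`, `append_message`, and `get_prompt` methods, the `"demo"` template, and the `build_prompt` helper are simplified assumptions rather than the library's actual definitions.

```python
import dataclasses
from typing import Dict, List, Optional, Tuple


@dataclasses.dataclass
class Conversation:
    system: str
    roles: Tuple[str, str]
    messages: List[List[Optional[str]]] = dataclasses.field(default_factory=list)
    sep: str = "\n"

    def copy(self) -> "Conversation":
        # Copy the message list too, so the shared template is never mutated.
        return dataclasses.replace(self, messages=[list(m) for m in self.messages])

    def append_message(self, role: str, message: Optional[str]) -> None:
        self.messages.append([role, message])

    def get_prompt(self) -> str:
        ret = self.system + self.sep
        for role, message in self.messages:
            ret += f"{role}: {message}{self.sep}" if message else f"{role}:"
        return ret


conv_templates: Dict[str, Conversation] = {
    "demo": Conversation(system="You are helpful.", roles=("USER", "ASSISTANT")),
}


def build_prompt(template_name: str, question: str) -> str:
    conv = conv_templates[template_name].copy()   # 1. look up and copy
    conv.append_message(conv.roles[0], question)  # 2. user turn
    conv.append_message(conv.roles[1], None)      # 3. open assistant turn
    return conv.get_prompt()                      # 4. string for tokenization


prompt = build_prompt("demo", "What is in the image?")
```

Appending the assistant role with an empty message leaves the prompt ending in the assistant marker, which is the position from which the model continues generating.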
Theoretical Basis
The conversation template pattern reflects the broader principle that instruction-tuned language models are sensitive to exact prompt formatting. This design follows the template management approach established by the LLaVA and Vicuna projects, where each fine-tuned model variant requires its own chat format to function correctly. The pattern is now standard in multi-model frameworks, for example Hugging Face's chat templates and vLLM's conversation handling.