Principle:OpenGVLab InternVL Conversation Template System

Knowledge Sources	OpenGVLab_InternVL
Domains	Prompt Engineering, Conversation Management, Multi-turn Dialogue
Last Updated	2026-02-07 14:00 GMT

Overview

The Conversation Template System is an abstraction that manages prompt formatting across different LLM backends, ensuring that each model receives input in its expected chat format, including system prompts, role markers, separator tokens, and stop criteria.

Description

Different language models are trained with distinct conversation formats (also called chat templates). A Vicuna model expects the "USER: ... ASSISTANT: ..." format, a LLaMA-2 model wraps instructions in "[INST] ... [/INST]", and an MPT model delimits turns with "<|im_start|>" and "<|im_end|>" markers. Generating with the wrong template produces degraded outputs because the model was not trained to parse that format.

The Conversation Template System solves this by abstracting the formatting logic into a separator style enum and a conversation dataclass. Each template defines:

System prompt: The initial instruction context
Role names: How user and assistant turns are labeled
Separator style: The algorithm for joining messages (SINGLE separator, TWO alternating separators, MPT im_start/im_end, PLAIN no-role, LLAMA_2 instruction wrapping, CHATINTERN, INTERNVL_ZH)
Separator tokens: The actual delimiter strings
Stop criteria: Token IDs or strings that signal generation should halt
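The fields listed above can be sketched as a small enum plus a dataclass. This is a simplified illustration, not the repository's actual code: the class and field names (SeparatorStyle, Conversation, sep, sep2), the three styles covered, and the example system prompts are assumptions chosen to show how the separator style selects the joining algorithm.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional, Tuple


class SeparatorStyle(Enum):
    """Hypothetical subset of the separator styles named in the list above."""
    SINGLE = auto()  # the same separator follows every message
    TWO = auto()     # separators alternate between sep and sep2
    MPT = auto()     # role marker is followed directly by the message


@dataclass
class Conversation:
    """Illustrative template: system prompt, roles, separators, messages."""
    system: str
    roles: Tuple[str, str]
    sep_style: SeparatorStyle
    sep: str
    sep2: Optional[str] = None
    messages: List[Tuple[str, Optional[str]]] = field(default_factory=list)

    def append_message(self, role: str, message: Optional[str]) -> None:
        # A message of None marks the open slot the model should complete.
        self.messages.append((role, message))

    def get_prompt(self) -> str:
        if self.sep_style is SeparatorStyle.MPT:
            # Roles already carry their opening marker, so no "role: " prefix.
            out = self.system + self.sep
            for role, msg in self.messages:
                out += role + (msg + self.sep if msg is not None else "")
            return out

        # SINGLE reuses sep for both positions; TWO alternates sep and sep2.
        use_two = self.sep_style is SeparatorStyle.TWO and self.sep2 is not None
        seps = [self.sep, self.sep2 if use_two else self.sep]
        out = self.system + seps[0]
        for i, (role, msg) in enumerate(self.messages):
            out += f"{role}: {msg}{seps[i % 2]}" if msg is not None else f"{role}:"
        return out


# Vicuna-like template (values are illustrative, not copied from the repo).
vicuna = Conversation(
    system="A chat between a user and an assistant.",
    roles=("USER", "ASSISTANT"),
    sep_style=SeparatorStyle.TWO,
    sep=" ",
    sep2="</s>",
)
vicuna.append_message("USER", "Hi")
vicuna.append_message("ASSISTANT", None)
vicuna_prompt = vicuna.get_prompt()

# MPT-like template using im_start/im_end markers.
mpt = Conversation(
    system="<|im_start|>system\nYou are helpful.",
    roles=("<|im_start|>user\n", "<|im_start|>assistant\n"),
    sep_style=SeparatorStyle.MPT,
    sep="<|im_end|>\n",
)
mpt.append_message(mpt.roles[0], "Hi")
mpt.append_message(mpt.roles[1], None)
mpt_prompt = mpt.get_prompt()
```

The same two messages yield different strings depending only on the template, which is the point of keeping the joining logic out of the calling code.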

The template system also handles multimodal inputs by managing image placeholder tokens within the conversation and supporting multiple image processing modes (pad, resize, crop) for UI rendering.

Templates are registered in a global dictionary by name, allowing the rest of the codebase to look up the correct template for any given model configuration.

Usage

Apply this principle when formatting prompts for any supported LLM backend. Always look up the correct template by model name, create a copy (to avoid modifying the shared template), append user and assistant messages, then call the prompt generation method to produce the correctly formatted string for tokenization.
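The lookup, copy, append, and generate steps described above might look like the following minimal sketch. The names here (ChatTemplate, TEMPLATES, build_prompt, the "demo_v1" key, and the stop_str value) are hypothetical stand-ins for illustration and are not taken from the InternVL codebase.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ChatTemplate:
    """Hypothetical, stripped-down template used only for this example."""
    system: str
    roles: Tuple[str, str]
    sep: str
    stop_str: str
    messages: List[Tuple[str, Optional[str]]] = field(default_factory=list)

    def copy(self) -> "ChatTemplate":
        # Deep copy so the message list is not shared with the registry entry.
        return deepcopy(self)

    def append_message(self, role: str, message: Optional[str]) -> None:
        self.messages.append((role, message))

    def get_prompt(self) -> str:
        parts = [self.system]
        for role, msg in self.messages:
            parts.append(f"{role}: {msg}" if msg is not None else f"{role}:")
        return self.sep.join(parts)


# Global registry keyed by template name (assumed layout).
TEMPLATES: Dict[str, ChatTemplate] = {
    "demo_v1": ChatTemplate(
        system="You are a helpful assistant.",
        roles=("USER", "ASSISTANT"),
        sep="\n",
        stop_str="\nUSER:",
    ),
}


def build_prompt(template_name: str, user_text: str) -> str:
    """Look up a template, copy it, add one user turn, and render the prompt."""
    if template_name not in TEMPLATES:
        raise KeyError(f"unknown conversation template: {template_name!r}")
    conv = TEMPLATES[template_name].copy()      # never mutate the shared entry
    conv.append_message(conv.roles[0], user_text)
    conv.append_message(conv.roles[1], None)    # open slot for the model's reply
    return conv.get_prompt()


prompt = build_prompt("demo_v1", "Describe the image.")
```

Copying before appending matters because the registry entry is shared: without the copy, every call would accumulate messages on the same object and later prompts would contain turns from unrelated conversations.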

Theoretical Basis

The conversation template pattern reflects the broader principle that instruction-tuned language models are sensitive to exact prompt formatting. This design follows the template management approach established by the LLaVA and Vicuna projects, where each fine-tuned model variant requires its specific chat format to function correctly. The pattern is now standard in multi-model frameworks, such as Hugging Face's chat templates and vLLM's conversation handling.

Related Pages

Implementation:OpenGVLab_InternVL_LLaVA_Conversation
