Implementation:Mit han lab Llm awq LLaVA Conversation
| Knowledge Sources | |
|---|---|
| Domains | NLP, Prompt_Engineering |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
The Conversation dataclass and associated templates manage prompt construction and formatting for various LLaVA and VILA model architectures, handling system prompts, role separators, image tokens, and multi-turn message history.
Description
This module defines the SeparatorStyle enum with five separator strategies:
- SINGLE: one separator for all turns
- TWO: alternating separators for user and assistant turns
- MPT: MPT-style im_start/im_end delimiters
- PLAIN: no role prefixes
- LLAMA_2: Llama-2 instruction wrapping with <<SYS>> and [INST] tags

The Conversation dataclass holds the complete state of a conversation, including the system prompt, role names, message history, separator style, separator strings, version identifier, and flags for skip_next and image_loaded state. Its main methods are:
- get_prompt() assembles the full prompt string according to the configured separator style, handling multimodal messages stored as tuples of (text, image, image_process_mode).
- append_message() adds new messages.
- get_images() extracts and preprocesses images from the message history, supporting Pad, Resize, Crop, and Default processing modes, and returns either PIL Image objects or base64-encoded strings.
- to_gradio_chatbot() converts the message history into the format expected by Gradio's chatbot component, skipping image-only messages.

The module provides pre-configured conversation templates including conv_vicuna_v0, conv_vicuna_v1, conv_llama_2, conv_llava_llama_2, conv_mpt, conv_llava_plain, conv_llava_v0, conv_llava_v1, and their mmtag variants, all registered in the conv_templates dictionary. The get_conversation(version) function retrieves and copies a template by version key, defaulting to conv_vicuna_v1.
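To make the difference between the SINGLE and TWO strategies concrete, here is a minimal, self-contained sketch of separator-based prompt assembly. It is not the module's code: `SepStyle` and `build_prompt` are hypothetical names, and the `role: message` layout is an assumption modeled on the example output shown later on this page.

```python
from enum import Enum, auto
from typing import List, Optional, Tuple


class SepStyle(Enum):  # hypothetical stand-in for SeparatorStyle
    SINGLE = auto()
    TWO = auto()


def build_prompt(
    system: str,
    messages: List[Tuple[str, Optional[str]]],
    style: SepStyle,
    sep: str,
    sep2: Optional[str] = None,
) -> str:
    """Join the system prompt and the turns using one or two separators."""
    if style is SepStyle.SINGLE:
        seps = [sep, sep]
    else:
        # TWO: even turns end with sep, odd turns end with sep2.
        seps = [sep, sep2 if sep2 is not None else sep]
    prompt = system + seps[0]
    for i, (role, message) in enumerate(messages):
        if message is None:
            # An empty slot leaves the prompt open for the model to complete.
            prompt += role + ":"
        else:
            prompt += role + ": " + message + seps[i % 2]
    return prompt


demo = build_prompt(
    "SYSTEM TEXT",
    [("USER", "Hi"), ("ASSISTANT", "Hello"), ("USER", "Bye"), ("ASSISTANT", None)],
    SepStyle.TWO,
    sep=" ",
    sep2="</s>",
)
print(demo)
# SYSTEM TEXT USER: Hi ASSISTANT: Hello</s>USER: Bye ASSISTANT:
```

With TWO, the second separator typically carries an end-of-sequence marker, so each finished assistant turn is closed while the final, empty assistant slot stays open.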
Usage
Use this module to construct correctly formatted prompts for LLaVA and VILA models. Import the Conversation class to manage multi-turn dialogue state, or use get_conversation() to obtain a pre-configured template by version name.
Code Reference
Source Location
- Repository: Mit_han_lab_Llm_awq
- File: tinychat/serve/llava_conv.py
- Lines: 1-455
Signature
class SeparatorStyle(Enum):
    SINGLE = auto()
    TWO = auto()
    MPT = auto()
    PLAIN = auto()
    LLAMA_2 = auto()

@dataclasses.dataclass
class Conversation:
    system: str
    roles: List[str]
    messages: List[List[str]]
    offset: int
    sep_style: SeparatorStyle = SeparatorStyle.SINGLE
    sep: str = "###"
    sep2: str = None
    version: str = "Unknown"
    skip_next: bool = False
    image_loaded: bool = False

    def get_prompt(self) -> str: ...
    def append_message(self, role, message) -> None: ...
    def get_images(self, return_pil: bool = False) -> List: ...
    def to_gradio_chatbot(self) -> List: ...
    def copy(self) -> "Conversation": ...
    def dict(self) -> dict: ...

def get_conversation(version: str) -> Conversation: ...
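The reason get_conversation() returns a copy rather than the registered template can be illustrated with a trimmed-down sketch. `MiniConversation`, `templates`, and `get_mini_conversation` are hypothetical stand-ins, and the body of `copy()` is an assumption about how such a method might be written, not the module's actual implementation.

```python
import dataclasses
from typing import Dict, List


@dataclasses.dataclass
class MiniConversation:  # hypothetical, trimmed-down stand-in
    system: str
    roles: List[str]
    messages: List[List[str]]

    def append_message(self, role, message) -> None:
        self.messages.append([role, message])

    def copy(self) -> "MiniConversation":
        # Rebuild the outer and inner lists so turns are not shared
        # with the object being copied.
        return MiniConversation(
            system=self.system,
            roles=list(self.roles),
            messages=[[role, msg] for role, msg in self.messages],
        )


templates: Dict[str, MiniConversation] = {
    "default": MiniConversation("SYSTEM TEXT", ["USER", "ASSISTANT"], []),
}


def get_mini_conversation(version: str) -> MiniConversation:
    """Look up a template by key and hand back an independent copy."""
    return templates[version].copy()


conv = get_mini_conversation("default")
conv.append_message(conv.roles[0], "Hello")
print(len(conv.messages), len(templates["default"].messages))
# 1 0
```

Because each caller works on its own copy, appending turns in one session does not leak into the shared template or into other sessions.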
Import
from tinychat.serve.llava_conv import Conversation, SeparatorStyle, get_conversation, conv_templates
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| system | str | Yes | System prompt prepended to the conversation |
| roles | List[str] | Yes | Two-element list of role names, e.g. ["USER", "ASSISTANT"] |
| messages | List[List[str]] | Yes | List of [role, message] pairs; message may be a (text, image, mode) tuple for multimodal inputs |
| offset | int | Yes | Number of initial messages to skip when extracting images or converting to chatbot format |
| sep_style | SeparatorStyle | No | Separator formatting strategy (default: SINGLE) |
| sep | str | No | Primary separator string (default: "###") |
| sep2 | str | No | Secondary separator for TWO/LLAMA_2/PLAIN styles |
| version | str | Yes (for get_conversation) | Template key looked up in conv_templates, e.g. "default", "llama_2", "mpt", "no-sys" |
| return_pil | bool | No | If True, get_images() returns PIL Image objects instead of base64 strings |
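The messages field is annotated as List[List[str]], yet a message may also be a (text, image, mode) tuple. A small sketch of how the text portion might be pulled out of either form follows. `message_text` is a hypothetical helper, and `object()` merely stands in for a PIL image.

```python
from typing import Any, List, Tuple, Union

# A message is plain text, a (text, image, mode) tuple, or None for an open slot.
Message = Union[str, Tuple[str, Any, str], None]


def message_text(message: Message) -> str:
    """Return the text part of a message, whether plain or multimodal."""
    if isinstance(message, tuple):
        text, _image, _mode = message
        return text
    return message or ""


history: List[List[Any]] = [
    ["USER", ("What is this?\n<image>", object(), "Pad")],  # placeholder image
    ["ASSISTANT", "A cat."],
    ["USER", None],
]
texts = [message_text(m) for _, m in history]
print(texts)
# ['What is this?\n<image>', 'A cat.', '']
```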
Outputs
| Name | Type | Description |
|---|---|---|
| prompt | str | Fully formatted prompt string ready for model input |
| images | List[PIL.Image or str] | Extracted and preprocessed images from the conversation |
| chatbot_data | List[List[str]] | Message pairs formatted for Gradio chatbot display |
| conversation_copy | Conversation | Deep copy of the conversation state |
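The chatbot_data shape can be pictured as a list of [user, assistant] pairs. The sketch below shows one way such pairing might work, assuming turns strictly alternate starting with the user and that offset is even. `to_chat_pairs` is a hypothetical name, and image handling is left out entirely.

```python
from typing import List, Optional, Tuple


def to_chat_pairs(
    messages: List[Tuple[str, Optional[str]]],
    offset: int = 0,
) -> List[List[Optional[str]]]:
    """Group alternating turns into [user_text, assistant_text] pairs."""
    pairs: List[List[Optional[str]]] = []
    for i, (_role, text) in enumerate(messages[offset:]):
        if i % 2 == 0:
            # A user turn opens a new pair; the reply is filled in later.
            pairs.append([text, None])
        else:
            # An assistant turn completes the most recent pair.
            pairs[-1][-1] = text
    return pairs


turns = [("USER", "Hi"), ("ASSISTANT", "Hello"), ("USER", "Bye"), ("ASSISTANT", None)]
print(to_chat_pairs(turns))
# [['Hi', 'Hello'], ['Bye', None]]
print(to_chat_pairs(turns, offset=2))
# [['Bye', None]]
```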
Usage Examples
Creating a Conversation from a Template
from tinychat.serve.llava_conv import get_conversation
conv = get_conversation("default") # Returns a copy of conv_vicuna_v1
conv.append_message(conv.roles[0], "What is in this image?\n<image>")
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()
print(prompt)
# "A chat between a curious user and an artificial intelligence assistant. ..."
# "USER: What is in this image?\n<image> ASSISTANT:"
Using Llama-2 Style Prompting
from tinychat.serve.llava_conv import get_conversation
conv = get_conversation("llama_2")
conv.append_message(conv.roles[0], "Describe this scene.")
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()
# Uses <<SYS>> and [INST] wrapping for Llama-2 format
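For readers unfamiliar with the Llama-2 chat layout, the wrapping can be sketched without the module. `llama2_prompt` is a hypothetical helper, and the `<s>` / `</s>` delimiters and exact spacing are assumptions based on the commonly documented Llama-2 chat format, so the template's real output may differ in detail.

```python
from typing import List, Optional


def llama2_prompt(system: str, turns: List[Optional[str]]) -> str:
    """Wrap alternating user/assistant turns in Llama-2 chat markup.

    Even indices are user messages, odd indices are assistant replies;
    None marks a reply that is still to be generated.
    """
    bos, eos = "<s>", "</s>"  # assumed sequence delimiters
    out = ""
    for i, text in enumerate(turns):
        if not text:
            continue
        if i == 0:
            # The system prompt is folded into the first user message.
            text = f"<<SYS>>\n{system}\n<</SYS>>\n\n{text}"
        if i % 2 == 0:
            out += f"{bos}[INST] {text} [/INST]"
        else:
            out += f" {text} {eos}"
    # Drop the leading delimiter; tokenizers usually add it themselves.
    return out[len(bos):] if out.startswith(bos) else out


single = llama2_prompt("Be brief.", ["Describe this scene.", None])
print(repr(single))
# '[INST] <<SYS>>\nBe brief.\n<</SYS>>\n\nDescribe this scene. [/INST]'
```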
Listing All Available Templates
from tinychat.serve.llava_conv import conv_templates
for name in conv_templates:
    print(name)
# no-sys, default, v0, v1, vicuna_v1, llama_2, plain, v0_plain,
# llava_v0, v0_mmtag, llava_v1, v1_mmtag, llava_llama_2, mpt