Implementation:Mit han lab Llm awq LLaVA Conversation

Knowledge Sources	Mit_han_lab_Llm_awq
Domains	NLP, Prompt_Engineering
Last Updated	2026-02-15 00:00 GMT

Overview

The Conversation dataclass and associated templates manage prompt construction and formatting for various LLaVA and VILA model architectures, handling system prompts, role separators, image tokens, and multi-turn message history.

Description

This module defines the SeparatorStyle enum with five separator strategies: SINGLE (one separator for all turns), TWO (alternating separators for user and assistant turns), MPT (MPT-style im_start/im_end delimiters), PLAIN (no role prefixes), and LLAMA_2 (Llama-2 instruction wrapping with <<SYS>> and [INST] tags).

The Conversation dataclass holds the complete state of a conversation: the system prompt, role names, message history, separator style, separator strings, version identifier, and the skip_next and image_loaded flags. Its get_prompt() method assembles the full prompt string according to the configured separator style and handles multimodal messages stored as tuples of (text, image, image_process_mode). The append_message() method adds a new message to the history. The get_images() method extracts and preprocesses images from the message history, supporting the Pad, Resize, Crop, and Default processing modes, and returns either PIL Image objects or base64-encoded strings. The to_gradio_chatbot() method converts the message history into the format expected by Gradio's chatbot component, skipping image-only messages.

The module provides pre-configured conversation templates including conv_vicuna_v0, conv_vicuna_v1, conv_llama_2, conv_llava_llama_2, conv_mpt, conv_llava_plain, conv_llava_v0, conv_llava_v1, and their mmtag variants, all registered in the conv_templates dictionary. The get_conversation(version) function retrieves and copies a template by version key, defaulting to conv_vicuna_v1.
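
To make the difference between the SINGLE and TWO strategies concrete, the following is a minimal, self-contained sketch. It is not the module's code: build_prompt is a hypothetical helper, and the "role: message" layout and the separator values are assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple, Union

# A message is plain text, a (text, image, image_process_mode) tuple, or None
# for a turn that the model is expected to complete.
Message = Union[str, Tuple[str, object, str], None]


def build_prompt(
    system: str,
    turns: List[Tuple[str, Message]],
    sep: str,
    sep2: Optional[str] = None,
) -> str:
    """Hypothetical helper: join turns with one (SINGLE) or two (TWO) separators."""
    # With a single separator both slots hold the same string.
    seps = [sep, sep if sep2 is None else sep2]
    prompt = system + seps[0]
    for i, (role, message) in enumerate(turns):
        if message:
            if isinstance(message, tuple):
                message = message[0]  # keep only the text of a multimodal message
            prompt += f"{role}: {message}{seps[i % 2]}"
        else:
            prompt += f"{role}:"  # open slot for the model's reply
    return prompt


# SINGLE-like: the same separator closes every turn.
single = build_prompt("SYS", [("Human", "hi"), ("Assistant", "hello")], sep="###")

# TWO-like: user turns end with sep, assistant turns end with sep2.
two = build_prompt(
    "SYS",
    [
        ("USER", ("hi <image>", None, "Pad")),
        ("ASSISTANT", "hello"),
        ("USER", "more"),
        ("ASSISTANT", None),
    ],
    sep=" ",
    sep2="</s>",
)
print(single)
print(two)
```

Ending the history with a None message leaves the assistant role open, which is the pattern the usage examples on this page rely on.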

Usage

Use this module to construct correctly-formatted prompts for LLaVA and VILA models. Import the Conversation class to manage multi-turn dialogue state, or use get_conversation() to obtain a pre-configured template by version name.

Code Reference

Source Location

Repository: Mit_han_lab_Llm_awq
File: tinychat/serve/llava_conv.py
Lines: 1-455

Signature

import dataclasses
from enum import Enum, auto
from typing import List

class SeparatorStyle(Enum):
    SINGLE = auto()
    TWO = auto()
    MPT = auto()
    PLAIN = auto()
    LLAMA_2 = auto()

@dataclasses.dataclass
class Conversation:
    system: str
    roles: List[str]
    messages: List[List[str]]
    offset: int
    sep_style: SeparatorStyle = SeparatorStyle.SINGLE
    sep: str = "###"
    sep2: str = None
    version: str = "Unknown"
    skip_next: bool = False
    image_loaded: bool = False

    def get_prompt(self) -> str: ...
    def append_message(self, role, message) -> None: ...
    def get_images(self, return_pil: bool = False) -> List: ...
    def to_gradio_chatbot(self) -> List: ...
    def copy(self) -> "Conversation": ...
    def dict(self) -> dict: ...

def get_conversation(version: str) -> Conversation: ...

Import

from tinychat.serve.llava_conv import Conversation, SeparatorStyle, get_conversation, conv_templates

I/O Contract

Inputs

Name	Type	Required	Description
system	str	Yes	System prompt prepended to the conversation
roles	List[str]	Yes	Two-element list of role names, e.g. ["USER", "ASSISTANT"]
messages	List[List[str]]	Yes	List of [role, message] pairs; message may be a (text, image, mode) tuple for multimodal inputs
offset	int	Yes	Number of initial messages to skip when extracting images or converting to chatbot format
sep_style	SeparatorStyle	No	Separator formatting strategy (default: SINGLE)
sep	str	No	Primary separator string (default: "###")
sep2	str	No	Secondary separator for TWO/LLAMA_2/PLAIN styles (default: None)
version	str	Yes (for get_conversation)	Template key passed to get_conversation(), e.g. "default", "llama_2", "mpt", "no-sys"
return_pil	bool	No	Argument to get_images(); if True, PIL Image objects are returned instead of base64 strings (default: False)
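
The interaction between messages, offset, and the multimodal tuple form can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: collect_images and its return_raw flag are hypothetical names, raw bytes stand in for PIL images, and the Pad/Resize/Crop preprocessing is omitted entirely.

```python
import base64
from typing import List, Sequence, Union


def collect_images(
    messages: Sequence[Sequence],
    offset: int = 0,
    return_raw: bool = False,
) -> List[Union[bytes, str]]:
    """Hypothetical helper: pull image payloads out of (text, image, mode) tuples."""
    images: List[Union[bytes, str]] = []
    # Messages before `offset` (for example few-shot turns) are ignored.
    for _role, message in messages[offset:]:
        if isinstance(message, tuple):
            _text, image_bytes, _mode = message
            if return_raw:
                images.append(image_bytes)
            else:
                # Text-safe encoding, analogous to the base64 return path.
                images.append(base64.b64encode(image_bytes).decode("ascii"))
    return images


history = [
    ["USER", "An earlier text-only example turn."],
    ["ASSISTANT", "An earlier reply."],
    ["USER", ("What is in this image?\n<image>", b"fake-image-bytes", "Pad")],
    ["ASSISTANT", None],
]

encoded = collect_images(history, offset=2)
raw = collect_images(history, offset=2, return_raw=True)
print(len(encoded), raw[0])
```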

Outputs

Name	Type	Description
prompt	str	Fully formatted prompt string ready for model input
images	List[PIL.Image or str]	Extracted and preprocessed images from the conversation
chatbot_data	List[List[str]]	Message pairs formatted for Gradio chatbot display
conversation_copy	Conversation	Deep copy of the conversation state
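
The chatbot_data shape (a list of [user, assistant] pairs) can be illustrated with a short sketch. It assumes strictly alternating user/assistant turns and reduces a multimodal tuple to its text part; to_chat_pairs is a hypothetical name, and the real to_gradio_chatbot() may treat image messages differently.

```python
from typing import List, Optional, Sequence


def to_chat_pairs(messages: Sequence[Sequence], offset: int = 0) -> List[List[Optional[str]]]:
    """Hypothetical helper: fold alternating turns into [user, assistant] pairs."""
    pairs: List[List[Optional[str]]] = []
    for i, (_role, message) in enumerate(messages[offset:]):
        if isinstance(message, tuple):
            message = message[0]  # keep only the text of a multimodal message
        if i % 2 == 0:
            pairs.append([message, None])  # user turn opens a new pair
        else:
            pairs[-1][1] = message  # assistant turn completes the pair
    return pairs


history = [
    ["USER", ("What is this?\n<image>", None, "Pad")],
    ["ASSISTANT", "A cat."],
    ["USER", "What colour is it?"],
    ["ASSISTANT", None],
]
pairs = to_chat_pairs(history)
print(pairs)
```

A pending assistant reply stays None in the last pair, which lets a UI show the user message before the answer has been generated.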

Usage Examples

Creating a Conversation from a Template

from tinychat.serve.llava_conv import get_conversation

conv = get_conversation("default")  # Returns a copy of conv_vicuna_v1
conv.append_message(conv.roles[0], "What is in this image?\n<image>")
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()
print(prompt)
# "A chat between a curious user and an artificial intelligence assistant. ..."
# "USER: What is in this image?\n<image> ASSISTANT:"

Using Llama-2 Style Prompting

from tinychat.serve.llava_conv import get_conversation

conv = get_conversation("llama_2")
conv.append_message(conv.roles[0], "Describe this scene.")
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()
# Uses <<SYS>> and [INST] wrapping for Llama-2 format
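
For orientation, the general Llama-2 chat wrapping convention for a single user turn looks roughly like the sketch below. The helper names are hypothetical, and the exact whitespace and separators that get_prompt() emits for the "llama_2" template are not reproduced here.

```python
def wrap_sys(system: str) -> str:
    """Wrap a system prompt in Llama-2 <<SYS>> tags."""
    return f"<<SYS>>\n{system}\n<</SYS>>\n\n"


def wrap_inst(message: str) -> str:
    """Wrap a user message in Llama-2 [INST] tags."""
    return f"[INST] {message} [/INST]"


def llama2_single_turn(system: str, user_message: str) -> str:
    """Hypothetical helper: the system block is folded into the first user turn."""
    return wrap_inst(wrap_sys(system) + user_message)


prompt = llama2_single_turn("Be brief.", "Describe this scene.")
print(prompt)
```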

Listing All Available Templates

from tinychat.serve.llava_conv import conv_templates

for name in conv_templates:
    print(name)
# no-sys, default, v0, v1, vicuna_v1, llama_2, plain, v0_plain,
# llava_v0, v0_mmtag, llava_v1, v1_mmtag, llava_llama_2, mpt
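
The reason a template is copied on retrieval can be shown with a small stand-alone registry. Everything here is hypothetical (Template, TEMPLATES, get_template), and treating an unknown key as a fallback to the default template is one reading of "defaulting", not confirmed behaviour of get_conversation().

```python
import dataclasses
from typing import Dict, List


@dataclasses.dataclass
class Template:
    """Hypothetical stand-in for a conversation template."""
    system: str
    roles: List[str]
    messages: List[List[str]] = dataclasses.field(default_factory=list)

    def copy(self) -> "Template":
        # Rebuild the lists so the copy never shares mutable state with the original.
        return Template(self.system, list(self.roles), [[r, m] for r, m in self.messages])


TEMPLATES: Dict[str, Template] = {
    "default": Template("You are a helpful assistant.", ["USER", "ASSISTANT"]),
    "plain": Template("", ["", ""]),
}


def get_template(version: str) -> Template:
    """Return a fresh copy; unknown keys fall back to "default" (an assumption)."""
    return TEMPLATES.get(version, TEMPLATES["default"]).copy()


conv = get_template("default")
conv.messages.append(["USER", "hi"])

# The registry entry is untouched, so the next caller starts from a clean history.
print(TEMPLATES["default"].messages)
```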

Related Pages

Principle:Mit_han_lab_Llm_awq_Conversation_Template_Management
